2016-05-17 18 views
1

私はElasticsearchが新しく、TwitterデータをElasticsearchにインポートし、それに木場を実行することで、Twitterデータのデータ分析を試みています。私は、ElasticsearchにTwitterデータをインポートするときに立ち往生しています。どんな助けもありがとう!TwitterをElasticsearchにインポートする際のIllegal_argument_exception

ここに、エラーを生成するサンプル作業プログラムがあります。ここで

import json 
from elasticsearch import Elasticsearch 
es = Elasticsearch() 
data = json.loads(open("data.json").read()) 
es.index(index='tweets5', doc_type='tweets', id=data['id'], body=data) 

は誤りです:

Traceback (most recent call last): 
    File "elasticsearch_import_test.py", line 5, in <module> 
    es.index(index='tweets5', doc_type='tweets', id=data['id'], body=data) 
    File "/usr/local/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 69, in _wrapped 
    return func(*args, params=params, **kwargs) 
    File "/usr/local/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 279, in index 
    _make_path(index, doc_type, id), params=params, body=body) 
    File "/usr/local/lib/python2.7/site-packages/elasticsearch/transport.py", line 329, in perform_request 
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout) 
    File "/usr/local/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 109, in perform_request 
    self._raise_error(response.status, raw_data) 
    File "/usr/local/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 108, in _raise_error 
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info) 
elasticsearch.exceptions.RequestError: TransportError(400, u'illegal_argument_exception', u'[Raza][127.0.0.1:9300][indices:data/write/index[p]]') 

はここで動作しない理由はあなたがインデックスにしようとしているということですたとえばTwitterのJSONファイル(data.json)

{ 
    "_id": { 
     "$oid": "570597358c68d71c16b3b722" 
    }, 
    "contributors": null, 
    "coordinates": null, 
    "created_at": "Wed Apr 06 23:09:41 +0000 2016", 
    "entities": { 
     "hashtags": [ 
      { 
       "indices": [ 
        68, 
        72 
       ], 
       "text": "dnd" 
      }, 
      { 
       "indices": [ 
        73, 
        79 
       ], 
       "text": "Nat20" 
      }, 
      { 
       "indices": [ 
        80, 
        93 
       ], 
       "text": "CriticalRole" 
      }, 
      { 
       "indices": [ 
        94, 
        103 
       ], 
       "text": "d20babes" 
      } 
     ], 
     "media": [ 
      { 
       "display_url": "pic.twitter.com/YQoxEuEAXV", 
       "expanded_url": "http://twitter.com/Zenttsilverwing/status/715953298076012545/photo/1", 
       "id": 715953292849754112, 
       "id_str": "715953292849754112", 
       "indices": [ 
        104, 
        127 
       ], 
       "media_url": "http://pbs.twimg.com/media/Ce-TugAUsAASZht.jpg", 
       "media_url_https": "https://pbs.twimg.com/media/Ce-TugAUsAASZht.jpg", 
       "sizes": { 
        "large": { 
         "h": 768, 
         "resize": "fit", 
         "w": 1024 
        }, 
        "medium": { 
         "h": 450, 
         "resize": "fit", 
         "w": 600 
        }, 
        "small": { 
         "h": 255, 
         "resize": "fit", 
         "w": 340 
        }, 
        "thumb": { 
         "h": 150, 
         "resize": "crop", 
         "w": 150 
        } 
       }, 
       "source_status_id": 715953298076012545, 
       "source_status_id_str": "715953298076012545", 
       "source_user_id": 2375847847, 
       "source_user_id_str": "2375847847", 
       "type": "photo", 
       "url": "https://shortened.url/YQoxEuEAXV" 
      } 
     ], 
     "symbols": [], 
     "urls": [ 
      { 
       "display_url": "darkcastlecollectibles.com", 
       "expanded_url": "http://www.darkcastlecollectibles.com/", 
       "indices": [ 
        44, 
        67 
       ], 
       "url": "https://shortened.url/SJgFTE0o8h" 
      } 
     ], 
     "user_mentions": [ 
      { 
       "id": 2375847847, 
       "id_str": "2375847847", 
       "indices": [ 
        3, 
        19 
       ], 
       "name": "Zack Chini", 
       "screen_name": "Zenttsilverwing" 
      } 
     ] 
    }, 
    "extended_entities": { 
     "media": [ 
      { 
       "display_url": "pic.twitter.com/YQoxEuEAXV", 
       "expanded_url": "http://twitter.com/Zenttsilverwing/status/715953298076012545/photo/1", 
       "id": 715953292849754112, 
       "id_str": "715953292849754112", 
       "indices": [ 
        104, 
        127 
       ], 
       "media_url": "http://pbs.twimg.com/media/Ce-TugAUsAASZht.jpg", 
       "media_url_https": "https://pbs.twimg.com/media/Ce-TugAUsAASZht.jpg", 
       "sizes": { 
        "large": { 
         "h": 768, 
         "resize": "fit", 
         "w": 1024 
        }, 
        "medium": { 
         "h": 450, 
         "resize": "fit", 
         "w": 600 
        }, 
        "small": { 
         "h": 255, 
         "resize": "fit", 
         "w": 340 
        }, 
        "thumb": { 
         "h": 150, 
         "resize": "crop", 
         "w": 150 
        } 
       }, 
       "source_status_id": 715953298076012545, 
       "source_status_id_str": "715953298076012545", 
       "source_user_id": 2375847847, 
       "source_user_id_str": "2375847847", 
       "type": "photo", 
       "url": "https://shortened.url/YQoxEuEAXV" 
      }, 
      { 
       "display_url": "pic.twitter.com/YQoxEuEAXV", 
       "expanded_url": "http://twitter.com/Zenttsilverwing/status/715953298076012545/photo/1", 
       "id": 715953295727009793, 
       "id_str": "715953295727009793", 
       "indices": [ 
        104, 
        127 
       ], 
       "media_url": "http://pbs.twimg.com/media/Ce-TuquUIAEsVn9.jpg", 
       "media_url_https": "https://pbs.twimg.com/media/Ce-TuquUIAEsVn9.jpg", 
       "sizes": { 
        "large": { 
         "h": 768, 
         "resize": "fit", 
         "w": 1024 
        }, 
        "medium": { 
         "h": 450, 
         "resize": "fit", 
         "w": 600 
        }, 
        "small": { 
         "h": 255, 
         "resize": "fit", 
         "w": 340 
        }, 
        "thumb": { 
         "h": 150, 
         "resize": "crop", 
         "w": 150 
        } 
       }, 
       "source_status_id": 715953298076012545, 
       "source_status_id_str": "715953298076012545", 
       "source_user_id": 2375847847, 
       "source_user_id_str": "2375847847", 
       "type": "photo", 
       "url": "https://shortened.url/YQoxEuEAXV" 
      } 
     ] 
    }, 
    "favorite_count": 0, 
    "favorited": false, 
    "filter_level": "low", 
    "geo": null, 
    "id": 717851801417031680, 
    "id_str": "717851801417031680", 
    "in_reply_to_screen_name": null, 
    "in_reply_to_status_id": null, 
    "in_reply_to_status_id_str": null, 
    "in_reply_to_user_id": null, 
    "in_reply_to_user_id_str": null, 
    "is_quote_status": false, 
    "lang": "en", 
    "place": null, 
    "possibly_sensitive": false, 
    "retweet_count": 0, 
    "retweeted": false, 
    "retweeted_status": { 
     "contributors": null, 
     "coordinates": null, 
     "created_at": "Fri Apr 01 17:25:42 +0000 2016", 
     "entities": { 
      "hashtags": [ 
       { 
        "indices": [ 
         47, 
         51 
        ], 
        "text": "dnd" 
       }, 
       { 
        "indices": [ 
         52, 
         58 
        ], 
        "text": "Nat20" 
       }, 
       { 
        "indices": [ 
         59, 
         72 
        ], 
        "text": "CriticalRole" 
       }, 
       { 
        "indices": [ 
         73, 
         82 
        ], 
        "text": "d20babes" 
       } 
      ], 
      "media": [ 
       { 
        "display_url": "pic.twitter.com/YQoxEuEAXV", 
        "expanded_url": "http://twitter.com/Zenttsilverwing/status/715953298076012545/photo/1", 
        "id": 715953292849754112, 
        "id_str": "715953292849754112", 
        "indices": [ 
         83, 
         106 
        ], 
        "media_url": "http://pbs.twimg.com/media/Ce-TugAUsAASZht.jpg", 
        "media_url_https": "https://pbs.twimg.com/media/Ce-TugAUsAASZht.jpg", 
        "sizes": { 
         "large": { 
          "h": 768, 
          "resize": "fit", 
          "w": 1024 
         }, 
         "medium": { 
          "h": 450, 
          "resize": "fit", 
          "w": 600 
         }, 
         "small": { 
          "h": 255, 
          "resize": "fit", 
          "w": 340 
         }, 
         "thumb": { 
          "h": 150, 
          "resize": "crop", 
          "w": 150 
         } 
        }, 
        "type": "photo", 
        "url": "https://shortened.url/YQoxEuEAXV" 
       } 
      ], 
      "symbols": [], 
      "urls": [ 
       { 
        "display_url": "darkcastlecollectibles.com", 
        "expanded_url": "http://www.darkcastlecollectibles.com/", 
        "indices": [ 
         23, 
         46 
        ], 
        "url": "https://shortened.url/SJgFTE0o8h" 
       } 
      ], 
      "user_mentions": [] 
     }, 
     "extended_entities": { 
      "media": [ 
       { 
        "display_url": "pic.twitter.com/YQoxEuEAXV", 
        "expanded_url": "http://twitter.com/Zenttsilverwing/status/715953298076012545/photo/1", 
        "id": 715953292849754112, 
        "id_str": "715953292849754112", 
        "indices": [ 
         83, 
         106 
        ], 
        "media_url": "http://pbs.twimg.com/media/Ce-TugAUsAASZht.jpg", 
        "media_url_https": "https://pbs.twimg.com/media/Ce-TugAUsAASZht.jpg", 
        "sizes": { 
         "large": { 
          "h": 768, 
          "resize": "fit", 
          "w": 1024 
         }, 
         "medium": { 
          "h": 450, 
          "resize": "fit", 
          "w": 600 
         }, 
         "small": { 
          "h": 255, 
          "resize": "fit", 
          "w": 340 
         }, 
         "thumb": { 
          "h": 150, 
          "resize": "crop", 
          "w": 150 
         } 
        }, 
        "type": "photo", 
        "url": "https://shortened.url/YQoxEuEAXV" 
       }, 
       { 
        "display_url": "pic.twitter.com/YQoxEuEAXV", 
        "expanded_url": "http://twitter.com/Zenttsilverwing/status/715953298076012545/photo/1", 
        "id": 715953295727009793, 
        "id_str": "715953295727009793", 
        "indices": [ 
         83, 
         106 
        ], 
        "media_url": "http://pbs.twimg.com/media/Ce-TuquUIAEsVn9.jpg", 
        "media_url_https": "https://pbs.twimg.com/media/Ce-TuquUIAEsVn9.jpg", 
        "sizes": { 
         "large": { 
          "h": 768, 
          "resize": "fit", 
          "w": 1024 
         }, 
         "medium": { 
          "h": 450, 
          "resize": "fit", 
          "w": 600 
         }, 
         "small": { 
          "h": 255, 
          "resize": "fit", 
          "w": 340 
         }, 
         "thumb": { 
          "h": 150, 
          "resize": "crop", 
          "w": 150 
         } 
        }, 
        "type": "photo", 
        "url": "https://shortened.url/YQoxEuEAXV" 
       } 
      ] 
     }, 
     "favorite_count": 5, 
     "favorited": false, 
     "filter_level": "low", 
     "geo": null, 
     "id": 715953298076012545, 
     "id_str": "715953298076012545", 
     "in_reply_to_screen_name": null, 
     "in_reply_to_status_id": null, 
     "in_reply_to_status_id_str": null, 
     "in_reply_to_user_id": null, 
     "in_reply_to_user_id_str": null, 
     "is_quote_status": false, 
     "lang": "en", 
     "place": null, 
     "possibly_sensitive": false, 
     "retweet_count": 1, 
     "retweeted": false, 
     "source": "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>", 
     "text": "coins came in!! Thanks https://shortened.url/SJgFTE0o8h #dnd #Nat20 #CriticalRole #d20babes https://shortened.url/YQoxEuEAXV", 
     "truncated": false, 
     "user": { 
      "contributors_enabled": false, 
      "created_at": "Thu Mar 06 19:59:14 +0000 2014", 
      "default_profile": true, 
      "default_profile_image": false, 
      "description": "DM Geek Critter Con-man. I am here to like your art ^.^", 
      "favourites_count": 4990, 
      "follow_request_sent": null, 
      "followers_count": 57, 
      "following": null, 
      "friends_count": 183, 
      "geo_enabled": false, 
      "id": 2375847847, 
      "id_str": "2375847847", 
      "is_translator": false, 
      "lang": "en", 
      "listed_count": 7, 
      "location": "Flower Mound, TX", 
      "name": "Zack Chini", 
      "notifications": null, 
      "profile_background_color": "C0DEED", 
      "profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png", 
      "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png", 
      "profile_background_tile": false, 
      "profile_banner_url": "https://pbs.twimg.com/profile_banners/2375847847/1430928759", 
      "profile_image_url": "http://pbs.twimg.com/profile_images/708816622358663168/mNF4Ysr5_normal.jpg", 
      "profile_image_url_https": "https://pbs.twimg.com/profile_images/708816622358663168/mNF4Ysr5_normal.jpg", 
      "profile_link_color": "0084B4", 
      "profile_sidebar_border_color": "C0DEED", 
      "profile_sidebar_fill_color": "DDEEF6", 
      "profile_text_color": "333333", 
      "profile_use_background_image": true, 
      "protected": false, 
      "screen_name": "Zenttsilverwing", 
      "statuses_count": 551, 
      "time_zone": null, 
      "url": null, 
      "utc_offset": null, 
      "verified": false 
     } 
    }, 
    "source": "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>", 
    "text": "RT @Zenttsilverwing: coins came in!! Thanks https://shortened.url/SJgFTE0o8h #dnd #Nat20 #CriticalRole #d20babes https://shortened.url/YQoxEuEAXV", 
    "timestamp_ms": "1459984181156", 
    "truncated": false, 
    "user": { 
     "contributors_enabled": false, 
     "created_at": "Tue Feb 10 04:31:18 +0000 2009", 
     "default_profile": false, 
     "default_profile_image": false, 
     "description": "I use Twitter to primarily retweet Critter artwork of Critical Role and their own creations. I maintain a list of all the Critter artists I've come across.", 
     "favourites_count": 17586, 
     "follow_request_sent": null, 
     "followers_count": 318, 
     "following": null, 
     "friends_count": 651, 
     "geo_enabled": true, 
     "id": 20491914, 
     "id_str": "20491914", 
     "is_translator": false, 
     "lang": "en", 
     "listed_count": 33, 
     "location": "SanDiego, CA", 
     "name": "UnknownOutrider", 
     "notifications": null, 
     "profile_background_color": "EDECE9", 
     "profile_background_image_url": "http://abs.twimg.com/images/themes/theme3/bg.gif", 
     "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme3/bg.gif", 
     "profile_background_tile": false, 
     "profile_image_url": "http://pbs.twimg.com/profile_images/224346493/cartoon_dragon_tattoo_designs_normal.jpg", 
     "profile_image_url_https": "https://pbs.twimg.com/profile_images/224346493/cartoon_dragon_tattoo_designs_normal.jpg", 
     "profile_link_color": "088253", 
     "profile_sidebar_border_color": "D3D2CF", 
     "profile_sidebar_fill_color": "E3E2DE", 
     "profile_text_color": "634047", 
     "profile_use_background_image": true, 
     "protected": false, 
     "screen_name": "UnknownOutrider", 
     "statuses_count": 12760, 
     "time_zone": "Pacific Time (US & Canada)", 
     "url": null, 
     "utc_offset": -25200, 
     "verified": false 
    } 
} 

答えて

1

です既定のフィールドとしてすでに存在する_idという名前のフィールドを持つ文書。そのフィールドを削除するか、フィールド名を変更してください:

import json 
from elasticsearch import Elasticsearch 
es = Elasticsearch() 
data = json.loads(open("data.json").read()) 
# data['id_'] = data['_id'] <= You can change _id as id_ 
del data['_id'] 
es.index(index='tweets5', doc_type='tweets', id=data['id'], body=data) 
+0

それでした。ありがとうございました! –

関連する問題