2017-08-08 13 views

Converting JSON to CSV using Pandas

I am trying to convert a JSON string to CSV using Pandas. Here is my example string (it could also be read from a file):

{ 
    "count": 8, 
    "facets": [], 
    "results": [ 
     { 
     "protocol": "DWC_ARCHIVE", 
     "taxonKey": 4332928, 
     "family": "Diaptomidae", 
     "institutionCode": "MNHN", 
     "lastInterpreted": "2017-05-17T13:20:23.744+0000", 
     "speciesKey": 4332928, 
     "gbifID": "694182141", 
     "identifiedBy": "Dussart B.", 
     "lastParsed": "2017-05-17T13:19:47.003+0000", 
     "phylum": "Arthropoda", 
     "orderKey": 679, 
     "facts": [], 
     "species": "Diaptomus kenitraensis", 
     "issues": [], 
     "occurrenceID": "http://coldb.mnhn.fr/catalognumber/mnhn/iu/2010-6707", 
     "countryCode": null, 
     "basisOfRecord": "PRESERVED_SPECIMEN", 
     "relations": [], 
     "classKey": 203, 
     "catalogNumber": "2010-6707", 
     "scientificName": "Diaptomus kenitraensis Kiefer, 1926", 
     "taxonRank": "SPECIES", 
     "familyKey": 9038, 
     "kingdom": "Animalia", 
     "publishingOrgKey": "2cd829bb-b713-433d-99cf-64bef11e5b3e", 
     "collectionCode": "IU", 
     "kingdomKey": 1, 
     "genusKey": 2114554, 
     "key": 694182141, 
     "phylumKey": 54, 
     "genericName": "Diaptomus", 
     "class": "Maxillopoda", 
     "crawlId": 116, 
     "individualCount": 1, 
     "publishingCountry": "FR", 
     "identifier": "http://coldb.mnhn.fr/catalognumber/mnhn/iu/2010-6707", 
     "lastCrawled": "2017-08-03T14:05:37.635+0000", 
     "license": "http://creativecommons.org/licenses/by/4.0/legalcode", 
     "datasetKey": "da6a07ed-9eee-460d-9448-910f542c1a7b", 
     "specificEpithet": "kenitraensis", 
     "identifiers": [], 
     "modified": "2015-06-19T19:23:01.000+0000", 
     "extensions": {}, 
     "genus": "Diaptomus", 
     "order": "Calanoida" 
     }, 
     { 
     "protocol": "DWC_ARCHIVE", 
     "taxonKey": 4332928, 
     "family": "Diaptomidae", 
     "institutionCode": "MNHN", 
     "lastInterpreted": "2017-05-17T13:19:51.210+0000", 
     "speciesKey": 4332928, 
     "gbifID": "440012453", 
     "identifiedBy": "Dussart B.", 
     "lastParsed": "2017-05-17T13:19:31.422+0000", 
     "phylum": "Arthropoda", 
     "orderKey": 679, 
     "facts": [], 
     "species": "Diaptomus kenitraensis", 
     "issues": [], 
     "occurrenceID": "http://coldb.mnhn.fr/catalognumber/mnhn/iu/2007-1537", 
     "countryCode": null, 
     "basisOfRecord": "PRESERVED_SPECIMEN", 
     "relations": [], 
     "classKey": 203, 
     "catalogNumber": "2007-1537", 
     "scientificName": "Diaptomus kenitraensis Kiefer, 1926", 
     "taxonRank": "SPECIES", 
     "familyKey": 9038, 
     "kingdom": "Animalia", 
     "publishingOrgKey": "2cd829bb-b713-433d-99cf-64bef11e5b3e", 
     "collectionCode": "IU", 
     "kingdomKey": 1, 
     "genusKey": 2114554, 
     "key": 440012453, 
     "phylumKey": 54, 
     "genericName": "Diaptomus", 
     "class": "Maxillopoda", 
     "crawlId": 116, 
     "individualCount": 8, 
     "publishingCountry": "FR", 
     "identifier": "http://coldb.mnhn.fr/catalognumber/mnhn/iu/2007-1537", 
     "lastCrawled": "2017-08-03T14:05:30.146+0000", 
     "license": "http://creativecommons.org/licenses/by/4.0/legalcode", 
     "datasetKey": "da6a07ed-9eee-460d-9448-910f542c1a7b", 
     "specificEpithet": "kenitraensis", 
     "identifiers": [], 
     "modified": "2015-06-19T19:23:00.000+0000", 
     "extensions": {}, 
     "genus": "Diaptomus", 
     "order": "Calanoida" 
     } 
    ], 
    "endOfRecords": false, 
    "limit": 2, 
    "offset": 0 
} 

What interests me is the "results" part.

Using Pandas, I tried this:

df = pd.read_json(json_string) 
df.to_csv("output.csv", index=False, sep='\t', encoding="utf-8") 

But I got the error below:

  File "C:\Python27\lib\site-packages\pandas\io\json.py", line 281, in read_json
    date_unit).parse()
  File "C:\Python27\lib\site-packages\pandas\io\json.py", line 349, in parse
    self._parse_no_numpy()
  File "C:\Python27\lib\site-packages\pandas\io\json.py", line 566, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None)
TypeError: Expected String or Unicode

I also tried most of the more detailed suggestions from "How can I convert JSON to CSV?", attempting to convert the JSON above directly to CSV (bypassing Pandas), but without success.

Can anyone give me a hint? Thanks for any assistance you can provide.

Best regards,

Maybe the lists and dictionaries in the JSON are causing the error? :) If they are always empty, you could consider simply removing them. – Roelant
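
The suggestion in the comment above could be sketched roughly as follows. This is only an illustration: the records are a hypothetical, abbreviated stand-in for the entries under "results", the helper name `drop_empty_containers` is made up for this example, and nothing in the thread confirms that these fields are what actually triggers the error.

```python
import pandas as pd

# Hypothetical, abbreviated records shaped like the entries under "results".
records = [
    {"gbifID": "694182141", "catalogNumber": "2010-6707",
     "facts": [], "issues": [], "extensions": {}},
    {"gbifID": "440012453", "catalogNumber": "2007-1537",
     "facts": [], "issues": [], "extensions": {}},
]


def drop_empty_containers(record):
    """Return a copy of record without keys whose value is an empty list or dict."""
    return {
        key: value
        for key, value in record.items()
        if not (isinstance(value, (list, dict)) and len(value) == 0)
    }


# Clean every record, then build the DataFrame from what is left.
cleaned = [drop_empty_containers(r) for r in records]
df = pd.DataFrame(cleaned)
print(df.columns.tolist())
```

Note that a key is dropped per record, so a field that is empty in one record but populated in another would still become a column, with a missing value in the row where it was removed.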

Answers

You can use json_normalize:

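import json
# On pandas 1.0 and later this function is also available as pandas.json_normalize.
from pandas.io.json import json_normalize

# Parse the JSON file into a Python dict.
with open('file.json') as data_file:
    data = json.load(data_file)

# Build one row per element of the "results" list.
df = json_normalize(data, 'results')
print(df)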
     basisOfRecord catalogNumber  class classKey collectionCode \ 
0 PRESERVED_SPECIMEN  2010-6707 Maxillopoda  203    IU 
1 PRESERVED_SPECIMEN  2007-1537 Maxillopoda  203    IU 

    countryCode crawlId       datasetKey extensions facts \ 
0  None  116 da6a07ed-9eee-460d-9448-910f542c1a7b   {} [] 
1  None  116 da6a07ed-9eee-460d-9448-910f542c1a7b   {} [] 

    ...   protocol publishingCountry \ 
0 ...  DWC_ARCHIVE     FR 
1 ...  DWC_ARCHIVE     FR 

         publishingOrgKey relations \ 
0 2cd829bb-b713-433d-99cf-64bef11e5b3e  [] 
1 2cd829bb-b713-433d-99cf-64bef11e5b3e  [] 

         scientificName     species speciesKey \ 
0 Diaptomus kenitraensis Kiefer, 1926 Diaptomus kenitraensis 4332928 
1 Diaptomus kenitraensis Kiefer, 1926 Diaptomus kenitraensis 4332928 

    specificEpithet taxonKey taxonRank 
0 kenitraensis 4332928 SPECIES 
1 kenitraensis 4332928 SPECIES 

[2 rows x 45 columns] 
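
To get from this DataFrame to the tab-separated file the question asks for, a minimal sketch could look like the one below. It assumes pandas 1.0 or later (where the function is exposed as `pd.json_normalize`), uses an abbreviated stand-in for the question's JSON with only a few fields, and writes to an in-memory buffer rather than "output.csv" so that it runs without touching the disk.

```python
import io
import json

import pandas as pd

# Abbreviated stand-in for the JSON in the question (only a few fields kept).
json_string = """
{
  "count": 8,
  "results": [
    {"gbifID": "694182141", "species": "Diaptomus kenitraensis",
     "catalogNumber": "2010-6707", "countryCode": null, "issues": []},
    {"gbifID": "440012453", "species": "Diaptomus kenitraensis",
     "catalogNumber": "2007-1537", "countryCode": null, "issues": []}
  ],
  "endOfRecords": false
}
"""

# Parse the string first; json_normalize works on dicts and lists, not raw text.
data = json.loads(json_string)

# One row per element of the "results" list.
df = pd.json_normalize(data, record_path="results")

# Tab-separated output without the index column, as in the question.
buffer = io.StringIO()
df.to_csv(buffer, index=False, sep="\t")
csv_text = buffer.getvalue()
print(csv_text)
```

Passing a file name such as "output.csv" to `to_csv` instead of the buffer writes the same content to disk.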

Yes, thank you. Actually, I had been considering trying json_normalize, but I had the syntax wrong. – maurobio
