2017-01-20

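Hi, I have an Avro schema that contains an array of structs, and I am able to store the data as Avro. When reading the data back, however, I cannot get the Avro data for the array of struct<string, string> as separate rows: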

array<struct<string, string>> 

I cannot get the elements as rows; all of the data comes back in a single row.

Here is the table definition:

CREATE EXTERNAL TABLE meterevents
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/......'
TBLPROPERTIES ('avro.schema.url'='/..../schema.avsc');

Hive table structure:

nametype    struct<nametypedescription:string,nametypename:string,nametypeauthority:struct<nametypeauthorityname:string,nametypeauthoritydescription:string>>  from deserializer 
 
names     struct<name:string,nametype:struct<nametypedescription:string,nametypename:string,nametypeauthority:struct<nametypeauthorityname:string,nametypeauthoritydescription:string>>> from deserializer 
 
enddeviceeventdetails struct<enddeviceeventdetailsname:string,enddeviceeventdetailsvalue:string>  from deserializer 
 
enddeviceevent   struct<mrid:string,createddatetime:string,issuerid:string,issuertrackingid:string,reason:string,severity:string,userid:string,asset:struct<assetmrid:string,assetnames:array<struct<name:string,nametype:struct<nametypedescription:string,nametypename:string,nametypeauthority:struct<nametypeauthorityname:string,nametypeauthoritydescription:string>>>>>,enddeviceeventdetails:array<struct<enddeviceeventdetailsname:string,enddeviceeventdetailsvalue:string>>,enddeviceeventtype:string,enddeviceeventnames:array<struct<name:string,nametype:struct<nametypedescription:string,nametypename:string,nametypeauthority:struct<nametypeauthorityname:string,nametypeauthoritydescription:string>>>>,status:struct<statusdatetime:string,statusreason:string,statusremark:string,statusvalue:string>,usagepoint:struct<usagepointmrid:string,usagepointnames:array<struct<name:string,nametype:struct<nametypedescription:string,nametypename:string,nametypeauthority:struct<nametypeauthorityname:string,nametypeauthoritydescription:string>>>>>>  from deserializer 
 
enddeviceeventtype  struct<enddeviceeventtypemrid:string,enddeviceeventtypedomain:string,enddeviceeventtypeeventoraction:string,enddeviceeventtypesubdomain:string,type:string,enddeviceeventtypenames:array<struct<name:string,nametype:struct<nametypedescription:string,nametypename:string,nametypeauthority:struct<nametypeauthorityname:string,nametypeauthoritydescription:string>>>>>  from deserializer 
 
header     struct<noun:string,context:string,verb:string,value:string,source:string,timestamp:string,correlationid:string,name:string,messageid:string,property:struct<propertyname:array<string>,propertyvalue:array<string>>> from deserializer 
 
payload     struct<enddeviceevents:array<struct<mrid:string,createddatetime:string,issuerid:string,issuertrackingid:string,reason:string,severity:string,userid:string,asset:struct<assetmrid:string,assetnames:array<struct<name:string,nametype:struct<nametypedescription:string,nametypename:string,nametypeauthority:struct<nametypeauthorityname:string,nametypeauthoritydescription:string>>>>>,enddeviceeventdetails:array<struct<enddeviceeventdetailsname:string,enddeviceeventdetailsvalue:string>>,enddeviceeventtype:string,enddeviceeventnames:array<struct<name:string,nametype:struct<nametypedescription:string,nametypename:string,nametypeauthority:struct<nametypeauthorityname:string,nametypeauthoritydescription:string>>>>,status:struct<statusdatetime:string,statusreason:string,statusremark:string,statusvalue:string>,usagepoint:struct<usagepointmrid:string,usagepointnames:array<struct<name:string,nametype:struct<nametypedescription:string,nametypename:string,nametypeauthority:struct<nametypeauthorityname:string,nametypeauthoritydescription:string>>>>>>>,enddeviceeventtype:array<struct<enddeviceeventtypemrid:string,enddeviceeventtypedomain:string,enddeviceeventtypeeventoraction:string,enddeviceeventtypesubdomain:string,type:string,enddeviceeventtypenames:array<struct<name:string,nametype:struct<nametypedescription:string,nametypename:string,nametypeauthority:struct<nametypeauthorityname:string,nametypeauthoritydescription:string>>>>>>>

I am using the LATERAL VIEW explode option in my query:

SELECT eddetails.enddeviceeventdetailsname, eddetails.enddeviceeventdetailsvalue
FROM meterevents_tmp
LATERAL VIEW explode(payload.enddeviceevents.enddeviceeventdetails) ed AS eddetails
LIMIT 1;

but I still get the data in a single row:

enddeviceeventdetailsname  enddeviceeventdetailsvalue 
 
["EventSequenceNumber","EventSequenceNumber","EventSequenceNumber","EventSequenceNumber"]  ["683","684","685","686"]

I would like to get the data like this:

enddeviceeventdetailsname  enddeviceeventdetailsvalue
EventSequenceNumber    683
EventSequenceNumber    684
EventSequenceNumber    685
EventSequenceNumber    686

I have read other questions on Stack Overflow, such as "Exploding Array of Struct using HiveQL", but I still cannot get the expected output. Could this be because the table is a Hive external table that uses a SerDe, where I cannot specify "MAP KEYS TERMINATED BY" and "COLLECTION ITEMS TERMINATED BY"?

Any help would be appreciated.

Thanks, I was able to solve this problem.

Answer

---

I could not get the output as rows because

array<struct<string,string>>

was part of a parent array:

array<struct<array<struct<string, string>>>>

I updated my query to use a nested explode:

SELECT eddetails.enddeviceeventdetailsname, eddetails.enddeviceeventdetailsvalue
FROM (
  SELECT ede.enddeviceeventdetails
  FROM meterevents_tmp
  LATERAL VIEW explode(payload.enddeviceevents) e AS ede
) t
LATERAL VIEW explode(t.enddeviceeventdetails) ed AS eddetails
LIMIT 10;

and got the desired output:

enddeviceeventdetailsname  enddeviceeventdetailsvalue 
 
EventSequenceNumber  683 
 
EventSequenceNumber  684 
 
EventSequenceNumber  685 
 
EventSequenceNumber  686
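
For anyone wondering why one explode was not enough, here is a rough model of the two queries in plain Python (this is not Hive code). The helpers `explode` and `field` and the sample `record` are hypothetical names made up for illustration; the values are the ones from the output above. It assumes, as the single-row output suggests, that selecting a field on an array of structs yields an array of that field's values.

```python
# Plain-Python model of the two queries; NOT Hive code.
# Hypothetical sample: one record, one event, four detail structs.
record = {
    "payload": {
        "enddeviceevents": [
            {
                "enddeviceeventdetails": [
                    {
                        "enddeviceeventdetailsname": "EventSequenceNumber",
                        "enddeviceeventdetailsvalue": value,
                    }
                    for value in ("683", "684", "685", "686")
                ]
            }
        ]
    }
}


def explode(rows, get_array, alias):
    """Model of LATERAL VIEW explode: one output row per array element."""
    return [{**row, alias: item} for row in rows for item in get_array(row)]


def field(value, name):
    """Model of field access: on an array of structs, return an array of the field."""
    if isinstance(value, list):
        return [item[name] for item in value]
    return value[name]


# First query: explode(payload.enddeviceevents.enddeviceeventdetails).
# The exploded argument is an array of arrays, so each output row still
# holds a whole array of structs, and field access returns arrays.
single = explode(
    [record],
    lambda r: field(r["payload"]["enddeviceevents"], "enddeviceeventdetails"),
    "eddetails",
)
single_out = [
    (
        field(r["eddetails"], "enddeviceeventdetailsname"),
        field(r["eddetails"], "enddeviceeventdetailsvalue"),
    )
    for r in single
]

# Second query: explode the parent array first, then the inner array.
events = explode([record], lambda r: r["payload"]["enddeviceevents"], "ede")
details = explode(events, lambda r: r["ede"]["enddeviceeventdetails"], "eddetails")
nested_out = [
    (
        field(r["eddetails"], "enddeviceeventdetailsname"),
        field(r["eddetails"], "enddeviceeventdetailsvalue"),
    )
    for r in details
]

print(single_out)  # one row holding two arrays
print(nested_out)  # one row per detail struct
```

In this model the first query produces a single row with two four-element arrays, while the nested version produces four rows of scalar values, which matches what I saw in Hive.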
