2017-03-03

I am trying to import JSON data from S3, run a query on it, and then export the output back to S3 in JSON format. However, the Hive step on my EMR cluster fails with "org.apache.hadoop.hive.serde2.SerDeException: java.io.IOException: Start token not found where expected". To isolate the problem I simplified both the Hive script and the JSON data, but the same error still occurs. How can I fix this "start token not found" error when using JsonSerDe?

Cluster configuration:

Release: emr-5.3.1

Hive version: 2.1.1

Hadoop distribution: Amazon 2.7.3

Service Role: EMR_DefaultRole

MasterInstanceType: m4.large

Simplified JSON data:

[{"MyID":"FOO123","MyField":"FOO"},{"MyID":"BAR123","MyField":"BAR"}] 

Hive script:

DROP TABLE IF EXISTS SOURCE; 
DROP TABLE IF EXISTS DESTINATION; 

CREATE EXTERNAL TABLE SOURCE(MyID STRING, MyField STRING) 
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' 
LOCATION 's3://myPath/subPath/'; 

CREATE EXTERNAL TABLE DESTINATION(MyID STRING, MyField STRING)          
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' 
LOCATION 's3://anotherPath/subPath/'; 

INSERT OVERWRITE TABLE DESTINATION SELECT MyID, MyField FROM SOURCE; 

And here is the stack trace:

Vertex failed, vertexName=Map 4, vertexId=vertex_1278452616863_0001_1_00, diagnostics=[Task failed, taskId=task_1278452616863, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task (failure) : attempt_1278452616863:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable [{"MyID":"FOO123","MyField":"FOO"},{"MyID":"BAR123","MyField":"BAR"}]
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable [{"MyID":"FOO123","MyField":"FOO"},{"MyID":"BAR123","MyField":"BAR"}]
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:95)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:70)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:383)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185)
    ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable [{"MyID":"FOO123","MyField":"FOO"},{"MyID":"BAR123","MyField":"BAR"}]
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:86)
    ... 17 more
Caused by: org.apache.hadoop.hive.serde2.SerDeException: java.io.IOException: Start token not found where expected
    at org.apache.hive.hcatalog.data.JsonSerDe.deserialize(JsonSerDe.java:183)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:128)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:92)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:488)
    ... 18 more
Caused by: java.io.IOException: Start token not found where expected
    at org.apache.hive.hcatalog.data.JsonSerDe.deserialize(JsonSerDe.java:169)
    ... 21 more

Thank you.

Answer

The JSON must not be an array (starting with [); each record should start with {.

You saved my day, thank you. – zretscen
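The stack trace shows JsonSerDe receiving the whole array as a single record, and as far as I know it expects each line of the input to be one JSON object. A minimal sketch of a preprocessing step follows, assuming the file is small enough to load into memory and that you upload the converted output to the SOURCE location yourself. The function name array_to_json_lines is hypothetical and not part of Hive or EMR.

```python
import json


def array_to_json_lines(text):
    """Convert a JSON array of objects into one JSON object per line."""
    records = json.loads(text)
    if isinstance(records, dict):
        # Input is already a single object: treat it as one record.
        records = [records]
    lines = []
    for record in records:
        if not isinstance(record, dict):
            raise ValueError("every element must be a JSON object")
        # json.dumps without indent emits no newlines, so each record
        # stays on a single line that starts with "{".
        lines.append(json.dumps(record, separators=(",", ":")))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = '[{"MyID":"FOO123","MyField":"FOO"},{"MyID":"BAR123","MyField":"BAR"}]'
    print(array_to_json_lines(sample), end="")
```

Run on the simplified data from the question, this prints two lines, one object each, which is the shape the SOURCE table should be able to read.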
