
Sqoop import with a free-form query fails with a duplicate name error

I have three tables in MySQL: movie, moviegenre and genre. When I try to import them into HDFS using a Sqoop free-form query:

sqoop import --connect jdbc:mysql://localhost/movielens --username user --password *** 
--query 'select m.id as id,m.name as movie_name,m.year as year, g.name as genre_name 
from movie m join moviegenre mg on m.id = mg.movieid join genre g on g.id = mg.genreid 
WHERE $CONDITIONS' --split-by m.id --target-dir /user/sqoop/moviegenre 

it throws the following error:

Imported Failed: Duplicate Column identifier specified: 'name'.
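For context, Sqoop raises this error during code generation whenever two columns of the result set come back with the same name; with this join that happens as soon as the aliases are missing, because both m.name and g.name are returned as name. A minimal sketch of the failing variant (not the command above) would be:

select m.id, m.name, m.year, g.name
from movie m join moviegenre mg on m.id = mg.movieid
join genre g on g.id = mg.genreid
WHERE $CONDITIONS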

When I run the same query directly in MySQL, it gives me exactly the output I want:

id  movie_name  year  genre_name
1   Toy Story   1995  Animation
2   Jumanji     1995  Adventure
..  .........   ....  .........

I followed this link and did exactly what its answer suggests: Imported Failed: Duplicate Column identifier specified (sqoop), but that did not seem to help either.

The fields in the tables are as follows:

movie = id, name, year

genre = id, name

moviegenre = movieid, genreid

Please point out the mistake in my query.

Answers


There is no error in your Sqoop command. I created the tables on the Cloudera QuickStart VM and ran the Sqoop import; it completed fine and produced results. You may have run the command before adding the aliases. The only other difference is that I formatted the command.

In case you need them, below are the commands I ran.

MySQL commands

mysql> use retail_db; 
mysql> create table movie (id integer, name varchar(100), year integer); 
mysql> insert into movie values (1, 'TEST', 2016); 
mysql> create table genre (id integer, name varchar(100)); 
mysql> insert into genre values (1, 'TEST'); 
mysql> create table moviegenre (movieid integer, genreid integer); 
mysql> insert into moviegenre values (1, 1); 
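
As a sanity check, running the same join directly in mysql against this test data should return a single row, roughly:

mysql> select m.id as id, m.name as movie_name, m.year as year, g.name as genre_name
    -> from movie m join moviegenre mg on m.id = mg.movieid
    -> join genre g on g.id = mg.genreid;
+------+------------+------+------------+
| id   | movie_name | year | genre_name |
+------+------------+------+------------+
|    1 | TEST       | 2016 | TEST       |
+------+------------+------+------------+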

Sqoop command

sqoop import \ 
--connect jdbc:mysql://localhost/retail_db \ 
--username root \ 
--password cloudera \ 
--query 'select m.id as id,m.name as movie_name,m.year as year, g.name as genre_name from movie m join moviegenre mg on m.id = mg.movieid join genre g on g.id = mg.genreid WHERE $CONDITIONS' --split-by m.id --target-dir /user/cloudera/moviegenre 

Sqoop stdout

[cloudera@quickstart ~]$ sqoop import \ 
> --connect jdbc:mysql://localhost/retail_db \ 
> --username root \ 
> --password cloudera \ 
> --query 'select m.id as id,m.name as movie_name,m.year as year, g.name as genre_name from movie m join moviegenre mg on m.id = mg.movieid join genre g on g.id = mg.genreid WHERE $CONDITIONS' --split-by m.id --target-dir /user/cloudera/moviegenre 
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail. 
Please set $ACCUMULO_HOME to the root of your Accumulo installation. 
17/11/19 13:08:01 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.10.0 
17/11/19 13:08:01 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 
17/11/19 13:08:01 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 
17/11/19 13:08:01 INFO tool.CodeGenTool: Beginning code generation 
17/11/19 13:08:02 INFO manager.SqlManager: Executing SQL statement: select m.id as id,m.name as movie_name,m.year as year, g.name as genre_name from movie m join moviegenre mg on m.id = mg.movieid join genre g on g.id = mg.genreid WHERE (1 = 0) 
17/11/19 13:08:02 INFO manager.SqlManager: Executing SQL statement: select m.id as id,m.name as movie_name,m.year as year, g.name as genre_name from movie m join moviegenre mg on m.id = mg.movieid join genre g on g.id = mg.genreid WHERE (1 = 0) 
17/11/19 13:08:02 INFO manager.SqlManager: Executing SQL statement: select m.id as id,m.name as movie_name,m.year as year, g.name as genre_name from movie m join moviegenre mg on m.id = mg.movieid join genre g on g.id = mg.genreid WHERE (1 = 0) 
17/11/19 13:08:02 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce 
Note: /tmp/sqoop-cloudera/compile/3b35f51458e53da94c6852dcfc0b904a/QueryResult.java uses or overrides a deprecated API. 
Note: Recompile with -Xlint:deprecation for details. 
17/11/19 13:08:04 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/3b35f51458e53da94c6852dcfc0b904a/QueryResult.jar 
17/11/19 13:08:04 INFO mapreduce.ImportJobBase: Beginning query import. 
17/11/19 13:08:04 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address 
17/11/19 13:08:04 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar 
17/11/19 13:08:05 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps 
17/11/19 13:08:05 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032 
17/11/19 13:08:08 WARN hdfs.DFSClient: Caught exception 
java.lang.InterruptedException 
    at java.lang.Object.wait(Native Method) 
    at java.lang.Thread.join(Thread.java:1281) 
    at java.lang.Thread.join(Thread.java:1355) 
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:951) 
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:689) 
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:878) 
17/11/19 13:08:08 INFO db.DBInputFormat: Using read commited transaction isolation 
17/11/19 13:08:08 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(t1.id), MAX(t1.id) FROM (select m.id as id,m.name as movie_name,m.year as year, g.name as genre_name from movie m join moviegenre mg on m.id = mg.movieid join genre g on g.id = mg.genreid WHERE (1 = 1)) AS t1 
17/11/19 13:08:08 INFO db.IntegerSplitter: Split size: 0; Num splits: 4 from: 1 to: 1 
17/11/19 13:08:08 INFO mapreduce.JobSubmitter: number of splits:1 
17/11/19 13:08:08 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1510865312807_0011 
17/11/19 13:08:09 INFO impl.YarnClientImpl: Submitted application application_1510865312807_0011 
17/11/19 13:08:09 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1510865312807_0011/ 
17/11/19 13:08:09 INFO mapreduce.Job: Running job: job_1510865312807_0011 
17/11/19 13:08:33 INFO mapreduce.Job: Job job_1510865312807_0011 running in uber mode : false 
17/11/19 13:08:33 INFO mapreduce.Job: map 0% reduce 0% 
17/11/19 13:08:52 INFO mapreduce.Job: map 100% reduce 0% 
17/11/19 13:08:52 INFO mapreduce.Job: Job job_1510865312807_0011 completed successfully 
17/11/19 13:08:52 INFO mapreduce.Job: Counters: 30 
    File System Counters 
     FILE: Number of bytes read=0 
     FILE: Number of bytes written=148008 
     FILE: Number of read operations=0 
     FILE: Number of large read operations=0 
     FILE: Number of write operations=0 
     HDFS: Number of bytes read=99 
     HDFS: Number of bytes written=17 
     HDFS: Number of read operations=4 
     HDFS: Number of large read operations=0 
     HDFS: Number of write operations=2 
    Job Counters 
     Launched map tasks=1 
     Other local map tasks=1 
     Total time spent by all maps in occupied slots (ms)=16886 
     Total time spent by all reduces in occupied slots (ms)=0 
     Total time spent by all map tasks (ms)=16886 
     Total vcore-seconds taken by all map tasks=16886 
     Total megabyte-seconds taken by all map tasks=17291264 
    Map-Reduce Framework 
     Map input records=1 
     Map output records=1 
     Input split bytes=99 
     Spilled Records=0 
     Failed Shuffles=0 
     Merged Map outputs=0 
     GC time elapsed (ms)=702 
     CPU time spent (ms)=2330 
     Physical memory (bytes) snapshot=231743488 
     Virtual memory (bytes) snapshot=1567064064 
     Total committed heap usage (bytes)=221249536 
    File Input Format Counters 
     Bytes Read=0 
    File Output Format Counters 
     Bytes Written=17 
17/11/19 13:08:52 INFO mapreduce.ImportJobBase: Transferred 17 bytes in 46.7631 seconds (0.3635 bytes/sec) 
17/11/19 13:08:52 INFO mapreduce.ImportJobBase: Retrieved 1 records. 
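
To double-check the result in HDFS, you could cat the target directory used above; with Sqoop's default comma-delimited text output the single row should look roughly like this, which also matches the 17 bytes reported in the counters:

[cloudera@quickstart ~]$ hadoop fs -cat /user/cloudera/moviegenre/part-m-*
1,TEST,2016,TEST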

After seeing your answer, I tried it on a different VM and it worked perfectly. It looks like Sqoop was just not working correctly on my VM.
