2016-06-21 8 views
0

Converting text file to SequenceFileOutput format

I am trying to convert text files to SequenceFileOutputFormat, but I am getting this error:

java.io.IOException: wrong key class: /home/mmrao/test.txt is not class org.apache.hadoop.io.LongWritable

Mapper:

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class SequenceFileMapper extends Mapper<NullWritable, BytesWritable, Text, BytesWritable> { 
    private Text filenameKey; 

    @Override 
    protected void setup(Context context) throws IOException, InterruptedException { 
     InputSplit split = context.getInputSplit(); 

     Path path = ((FileSplit) split).getPath(); 
     // filenameKey = new LongWritable(); 
     filenameKey = new Text(path.toString()); 
    } 

    @Override 
    protected void map(NullWritable key, BytesWritable value, Context context) throws IOException, InterruptedException { 
     context.write(filenameKey, value); 
    } 
} 


WholeFileInputFormat: 

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class WholeFileInputFormat extends FileInputFormat<NullWritable, BytesWritable> { 
    @Override 
    protected boolean isSplitable(JobContext context, Path file) { 
     return false; 
    } 

    @Override 
    public RecordReader<NullWritable, BytesWritable> createRecordReader(InputSplit split, TaskAttemptContext context) 
      throws IOException, InterruptedException { 
     WholeFileRecordReader reader = new WholeFileRecordReader(); 
     reader.initialize(split, context); 
     return reader; 
    } 
} 

WholeFileRecordReader:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class WholeFileRecordReader extends RecordReader<NullWritable, BytesWritable> { 
    private FileSplit fileSplit; 
    private Configuration conf; 
    private BytesWritable value = new BytesWritable(); 
    private boolean processed = false; 

    @Override 
    public void initialize(InputSplit split, TaskAttemptContext context) throws IOException, InterruptedException { 
     this.fileSplit = (FileSplit) split; 
     this.conf = context.getConfiguration(); 
    } 

    @Override 
    public boolean nextKeyValue() throws IOException, InterruptedException { 
     if (!processed) { 
      byte[] contents = new byte[(int) fileSplit.getLength()]; 
      Path file = fileSplit.getPath(); 
      FileSystem fs = file.getFileSystem(conf); 
      FSDataInputStream in = null; 
      try { 
       in = fs.open(file); 
       IOUtils.readFully(in, contents, 0, contents.length); 
       value.set(contents, 0, contents.length); 
      } finally { 
       IOUtils.closeStream(in); 
      } 
      processed = true; 
      return true; 
     } 
     return false; 
    } 

    @Override 
    public NullWritable getCurrentKey() throws IOException, InterruptedException { 
     return NullWritable.get(); 
    } 

    @Override 
    public BytesWritable getCurrentValue() throws IOException, InterruptedException { 
     return value; 
    } 

    @Override 
    public float getProgress() throws IOException { 
     return processed ? 1.0f : 0.0f; 
    } 

    @Override 
    public void close() throws IOException { 
     // do nothing 
    } 
} 

DriverClass: 

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class SmallFilesToSequenceFileConverter extends Configured implements Tool { 

    /** 
    * @param args 
    * @throws Exception 
    */ 
    public static void main(String[] args) throws Exception { 
     System.exit(ToolRunner.run(new Configuration(), new SmallFilesToSequenceFileConverter(), args)); 
    } 

    public int run(String[] args) throws Exception { 
     // TODO Auto-generated method stub 
     Configuration conf = getConf(); 
     @SuppressWarnings("deprecation") 
     Job job = new Job(conf); 
     job.setJobName("SequenceFile "); 
     job.setJarByClass(SmallFilesToSequenceFileConverter.class); 
     FileInputFormat.setInputDirRecursive(job, true); 
     FileInputFormat.addInputPath(job, new Path(args[0])); 
     job.setInputFormatClass(WholeFileInputFormat.class); 
     // job.setOutputFormatClass(SequenceFileOutputFormat.class); 
     job.setMapperClass(SequenceFileMapper.class); 
     job.setMapOutputKeyClass(Text.class); 
     job.setMapOutputValueClass(BytesWritable.class); 
     // job.setReducerClass(IntSumReducer.class); 
     // job.setNumReduceTasks(0); 
     FileOutputFormat.setOutputPath(job, new Path(args[1])); 
     job.submit(); 
     job.waitForCompletion(true); 

     return 0; 
    } 
} 
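
Note that setMapOutputKeyClass/setMapOutputValueClass above only declare the intermediate map output types. The job's final output key and value classes are never set, so they default to LongWritable/Text, which is what the "wrong key class" error in the log below complains about once the default identity reducer hands Text keys to the SequenceFile writer. A minimal sketch of the extra configuration inside run() that would make the declared output types match what the mapper emits, assuming a map-only conversion is the intent:

    // requires: import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
    job.setOutputFormatClass(SequenceFileOutputFormat.class);
    // Declare the final output types (they default to LongWritable/Text when unset):
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(BytesWritable.class);
    // Make the job map-only, so the identity reducer (where the append fails) never runs:
    job.setNumReduceTasks(0);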

Note: The input files are at an HDFS location, which is given on the command line along with the output path.

Query:

hadoop jar seq.jar package.driverclass ip op 

Error log:

mmrao@master:~$ yarn jar /home/mmrao/Downloads/seq.jar seq.SmallFilesToSequenceFileConverter /seq/files /seqout 

    16/06/25 10:08:43 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032 
    16/06/25 10:08:45 INFO input.FileInputFormat: Total input paths to process : 2 
    16/06/25 10:08:45 INFO mapreduce.JobSubmitter: number of splits:2 
    16/06/25 10:08:46 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1466829146657_0001 
    16/06/25 10:08:47 INFO impl.YarnClientImpl: Submitted application application_1466829146657_0001 
    16/06/25 10:08:47 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1466829146657_0001/ 
    16/06/25 10:08:47 INFO mapreduce.Job: Running job: job_1466829146657_0001 
    16/06/25 10:08:57 INFO mapreduce.Job: Job job_1466829146657_0001 running in uber mode : false 
    16/06/25 10:08:57 INFO mapreduce.Job: map 0% reduce 0% 
    16/06/25 10:09:09 INFO mapreduce.Job: map 50% reduce 0% 
    16/06/25 10:09:10 INFO mapreduce.Job: map 100% reduce 0% 
    16/06/25 10:09:17 INFO mapreduce.Job: Task Id : attempt_1466829146657_0001_r_000000_0, Status : FAILED 
    Error: java.io.IOException: wrong key class: org.apache.hadoop.io.Text is not class org.apache.hadoop.io.LongWritable 
     at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1308) 
     at org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat$1.write(SequenceFileOutputFormat.java:83) 
     at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:558) 
     at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89) 
     at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.write(WrappedReducer.java:105) 
     at org.apache.hadoop.mapreduce.Reducer.reduce(Reducer.java:150) 
     at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171) 
     at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627) 
     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389) 
     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) 
     at java.security.AccessController.doPrivileged(Native Method) 
     at javax.security.auth.Subject.doAs(Subject.java:415) 
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) 
     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 
+0

What exactly is happening? You may need to cut the code down a bit, you have put a lot in there. – Draken

+0

You need to provide the detailed error message. –

Answers

0

I think you are making a mistake in writing the query. If you can provide the exact query, I can give you a more specific answer.

But going by your error, it should be caused by an incorrect query.

Your query should look like this example:

hadoop jar /home/training/Desktop/file.jar DriverFile /user/file/abc.txt /user/file/output

Here,

/home/training/Desktop/file.jar -> full path of the jar file
DriverFile -> my Java class containing the main method
/user/file/abc.txt -> full path of the file to process (this file should be on your HDFS)
/user/file/output -> output directory (it must be unique)

If you still get the error after applying this, take a screenshot of your log and post it here.
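
Once the job completes, the converted output can also be sanity-checked by reading the SequenceFile back with the standard SequenceFile.Reader API. A minimal sketch (the class name SeqFilePeek is just for illustration, and the part-file name you pass in, e.g. /seqout/part-m-00000, depends on whether the job ran map-only):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class SeqFilePeek {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Path path = new Path(args[0]); // one output part file
            try (SequenceFile.Reader reader =
                    new SequenceFile.Reader(conf, SequenceFile.Reader.file(path))) {
                Text key = new Text();                     // filename key written by the mapper
                BytesWritable value = new BytesWritable(); // whole-file contents
                while (reader.next(key, value)) {
                    System.out.println(key + " -> " + value.getLength() + " bytes");
                }
            }
        }
    }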

+0

Yes Siddhartha Jain, I have given the absolute path of the jar and the input and output paths.. please look into this issue – MMRaonaidu
