2017-11-28 8 views
-3

I am trying out a MapReduce program in Python and need some help

I wanted to try a word count project to get more hands-on experience. Here is the sample data I have:

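The United Nations (UN) is an intergovernmental organization established on 24 October 1945 to promote international cooperation. A replacement for the ineffective League of Nations, the organization was created following the Second World War to prevent another such conflict.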

[...]

With Python 2 I get the following error:

[[email protected] Desktop]# python RatingsBreakdown.py UN.txt 
Traceback (most recent call last): 
    File "RatingsBreakdown.py", line 1, in <module> 
    from mrjob.job import MRJob 
    File "/usr/lib/python2.6/site-packages/mrjob/job.py", line 1106 
    for k, v in unfiltered_jobconf.items() if v is not None 
    ^
SyntaxError: invalid syntax 

I am using the following Python code to get my results:

from mrjob.job import MRJob
from mrjob.step import MRStep

class MovieRatings(MRJob):

    def steps(self):
        return [
            MRStep(mapper=self.mapper_get_ratings,
                   reducer=self.reducer_count_ratings),
    ]

    def mapper_get_ratings(self, _, line):
        (word) = line.split(' ')
        yield word, 1

    def reducer_count_ratings(self, key, values):
        yield Key, sum(values)

if __name__ == '__main__':
    MovieRatings.run()

With Python 3 I get the following error:

[[email protected] Desktop]# python3 RatingsBreakdown.py UN.txt 
No configs found; falling back on auto-configuration 
No configs specified for inline runner 
Running step 1 of 2... 
Creating temp directory /tmp/RatingsBreakdown.training.20171128.083536.602598 
Error while reading from /tmp/RatingsBreakdown.training.20171128.083536.602598/step/000/mapper/00000/input: 
Traceback (most recent call last): 
    File "RatingsBreakdown.py", line 25, in <module> 
    RatingsBreakdown.run() 
    File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 424, in run 
    mr_job.execute() 
    File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 445, in execute 
    super(MRJob, self).execute() 
    File "/usr/lib/python3.4/site-packages/mrjob/launch.py", line 185, in execute 
    self.run_job() 
    File "/usr/lib/python3.4/site-packages/mrjob/launch.py", line 233, in run_job 
    runner.run() 
    File "/usr/lib/python3.4/site-packages/mrjob/runner.py", line 511, in run 
    self._run() 
    File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 144, in _run 
    self._run_mappers_and_combiners(step_num, map_splits) 
    File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 185, in _run_mappers_and_combiners 
    for task_num, map_split in enumerate(map_splits) 
    File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 120, in _run_multiple 
    func() 
    File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 662, in _run_mapper_and_combiner 
    run_mapper() 
    File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 685, in _run_task 
    stdin, stdout, stderr, wd, env) 
    File "/usr/lib/python3.4/site-packages/mrjob/inline.py", line 92, in invoke_task 
    task.execute() 
    File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 433, in execute 
    self.run_mapper(self.options.step_num) 
    File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 517, in run_mapper 
    for out_key, out_value in mapper(key, value) or(): 
    File "RatingsBreakdown.py", line 13, in mapper_get_ratings 
    (userID, movieID, rating, timestamp) = line.split('\t') 
ValueError: need more than 1 value to unpack 
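
Running MovieRatings.py instead gives the error below. I would like to resolve these errors and understand whether there is any mistake in my MovieRatings code.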

[[email protected] Desktop]# python3 MovieRatings.py UN.txt 
No configs found; falling back on auto-configuration 
No configs specified for inline runner 
Running step 1 of 1... 
Creating temp directory /tmp/MovieRatings.training.20171128.083635.368889 
Error while reading from /tmp/MovieRatings.training.20171128.083635.368889/step/000/reducer/00000/input: 
Traceback (most recent call last): 
    File "MovieRatings.py", line 20, in <module> 
    MovieRatings.run() 
    File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 424, in run 
    mr_job.execute() 
    File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 445, in execute 
    super(MRJob, self).execute() 
    File "/usr/lib/python3.4/site-packages/mrjob/launch.py", line 185, in execute 
    self.run_job() 
    File "/usr/lib/python3.4/site-packages/mrjob/launch.py", line 233, in run_job 
    runner.run() 
    File "/usr/lib/python3.4/site-packages/mrjob/runner.py", line 511, in run 
    self._run() 
    File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 150, in _run 
    self._run_reducers(step_num, num_reducer_tasks) 
    File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 246, in _run_reducers 
    for task_num in range(num_reducer_tasks) 
    File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 120, in _run_multiple 
    func() 
    File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 685, in _run_task 
    stdin, stdout, stderr, wd, env) 
    File "/usr/lib/python3.4/site-packages/mrjob/inline.py", line 92, in invoke_task 
    task.execute() 
    File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 439, in execute 
    self.run_reducer(self.options.step_num) 
    File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 560, in run_reducer 
    for out_key, out_value in reducer(key, values) or(): 
    File "MovieRatings.py", line 17, in reducer_count_ratings 
    yield Key, sum(values) 
NameError: name 'Key' is not defined 


+1

'key' instead of 'Key'? – Mel

+0

You have an indentation problem with the ']' at the end of 'steps()', but there seem to be several other issues as well. – cdarke

Answers

0

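This library seems to work only with Python 3 here: the Python 2 run fails with a SyntaxError inside mrjob's own job.py, under /usr/lib/python2.6, before any of your code runs. Also, in the Python 3 run you first executed RatingsBreakdown.py, not the MovieRatings code you posted: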

File "RatingsBreakdown.py", line 13, in mapper_get_ratings 
    (userID, movieID, rating, timestamp) = line.split('\t') 
ValueError: need more than 1 value to unpack 

The input you show contains no tabs, yet this mapper tries to extract four tab-separated columns from each line. It is not really clear what you expected here.
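To make the unpacking failure concrete, here is a minimal sketch outside mrjob. The sample line is taken from the question's input; the variable names are only illustrative. Splitting on a tab that is not present returns the whole line as a single field, so unpacking into four names fails, while a plain split() gives the individual words.

```python
line = "The United Nations (UN) is an intergovernmental organization"

# No tab in the line, so split('\t') returns a one-element list.
fields = line.split('\t')
print(len(fields))

try:
    # Same shape as the unpacking in RatingsBreakdown.py.
    (user_id, movie_id, rating, timestamp) = fields
    unpacked = True
except ValueError as exc:
    unpacked = False
    print("unpack failed:", exc)

# For a word count, splitting on whitespace is what is wanted.
words = line.split()
print(words[:3])
```
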

File "MovieRatings.py", line 17, in reducer_count_ratings 
    yield Key, sum(values) 
NameError: name 'Key' is not defined 

Self-explanatory: your variable is named key, not Key. Python names are case-sensitive.
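A small sketch of why this only shows up once the reducer step runs, using plain generator functions rather than mrjob (the function names are made up for illustration). Calling a generator function does not execute its body, so the misspelled name is only looked up when the first value is requested.

```python
def reducer_count(key, values):
    # Bug kept on purpose: 'Key' (capital K) is not the parameter 'key'.
    yield Key, sum(values)


def reducer_count_fixed(key, values):
    yield key, sum(values)


# Creating the generator raises nothing; the body has not run yet.
gen = reducer_count("un", [1, 1, 1])

try:
    next(gen)
    error_name = None
except NameError as exc:
    error_name = type(exc).__name__
    print("failed:", exc)

fixed_result = list(reducer_count_fixed("un", [1, 1, 1]))
print(fixed_result)
```
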

+0

I don't think I explained my question fully. I will try to explain it better next time. Thank you for your help. –

0

I did the same thing, but without using the steps function, and it worked.

from mrjob.job import MRJob

class wordcount(MRJob):

    def mapper(self, _, line):
        (word) = line.split(' ')
        yield word, 1

    def reducer(self, x, count):
        yield x, sum(count)

if __name__ == '__main__':
    wordcount.run()
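
One thing to watch in the job above: the parentheses in (word) = line.split(' ') do nothing, so word is the whole list of tokens and the mapper yields one key per line rather than one per word. Below is a minimal sketch of a per-word count in plain Python, without mrjob, so it can be run anywhere. The names map_words, reduce_counts and run_locally are hypothetical, and run_locally is only a stand-in for the grouping that the framework does between the two phases. Lowercasing the words is my own choice, not something the question asks for.

```python
from collections import defaultdict


def map_words(line):
    """Emit (word, 1) for every whitespace-separated token in the line."""
    for word in line.split():
        yield word.lower(), 1


def reduce_counts(word, counts):
    """Sum the 1s emitted for a single word."""
    yield word, sum(counts)


def run_locally(lines):
    """Stand-in for the shuffle step: group mapper output by key, then reduce."""
    grouped = defaultdict(list)
    for line in lines:
        for word, one in map_words(line):
            grouped[word].append(one)
    result = {}
    for word in sorted(grouped):
        for key, total in reduce_counts(word, grouped[word]):
            result[key] = total
    return result


sample = ["the UN is the United Nations", "the UN was created in 1945"]
counts = run_locally(sample)
print(counts["the"], counts["un"])
```

In an MRJob subclass, the loop from map_words would go into mapper(self, _, line) and the body of reduce_counts into reducer(self, word, counts).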