
I hope I do not get downvoted this time. I have been struggling with parallel processing in Python for a while now (two days, to be exact). To master it, I have been checking these resources for better examples of parallel processing in Python (a partial list is shown below):

(a) http://eli.thegreenplace.net/2013/01/16/python-paralellizing-cpu-bound-tasks-with-concurrent-futures

(b) https://pythonadventures.wordpress.com/tag/processpoolexecutor/

What I want to do is this (a rough MPI-style sketch of the same flow follows the two lists):

Master:

Break up the file into chunks (strings or numbers)
Broadcast a pattern to be searched to all the workers 
Receive the offsets in the file where the pattern was found 

Workers:

Receive pattern and chunk of text from the master 
Compute() 
Send back the offsets to the master. 
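
In MPI terms, the data flow I am after would look roughly like the sketch below. This is only an outline, assuming mpi4py is installed and the script is launched under mpiexec (e.g. mpiexec -n 5 python mpi_sketch.py); the find_offsets helper, the file name and the pattern are placeholders rather than my real code.

from mpi4py import MPI

def find_offsets(pat, txt):
    # placeholder for the naive search: every index where pat occurs in txt
    return [i for i in range(len(txt) - len(pat) + 1)
            if txt[i:i + len(pat)] == pat]

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
nprocs = comm.Get_size()

if rank == 0:
    # master: read the file and cut the lines into one chunk per rank
    with open("file1.txt") as f:
        lines = [line.rstrip("\n") for line in f]
    chunks = [lines[i::nprocs] for i in range(nprocs)]
    pat = "afow"
else:
    chunks = None
    pat = None

pat = comm.bcast(pat, root=0)         # broadcast the pattern to all workers
chunk = comm.scatter(chunks, root=0)  # hand each rank its chunk of lines

# every rank (the master included) searches its own chunk
local = [find_offsets(pat, line) for line in chunk]

# gather the per-line offset lists back on the master
results = comm.gather(local, root=0)
if rank == 0:
    print(results)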

I have tried to implement this using MPI/concurrent.futures/multiprocessing and have come unstuck.

Below is my naive implementation with the multiprocessing module; I would be grateful for any guidance, particularly with concurrent.futures.

import multiprocessing 

filename = "file1.txt" 
pat = "afow" 
N = 1000 

""" This is the naive string search algorithm""" 

def search(pat, txt):
    patLen = len(pat)
    txtLen = len(txt)
    offsets = []

    # A loop to slide pattern[] one by one
    # Range generates numbers up to but not including that number
    for i in range((txtLen - patLen) + 1):
        # Can not use a for loop here
        # For loops in C with && statements must be
        # converted to while statements in python
        counter = 0
        while (counter < patLen) and pat[counter] == txt[counter + i]:
            counter += 1
            if counter >= patLen:
                offsets.append(i)
    return str(offsets).strip('[]')

     """" 
     This is what I want 
if __name__ == "__main__": 
    tasks = [] 
    pool_outputs = [] 
    pool = multiprocessing.Pool(processes=5) 
    with open(filename, 'r') as infile: 
      lines = [] 
      for line in infile: 
       lines.append(line.rstrip()) 
       if len(lines) > N: 
        pool_output = pool.map(search, tasks) 
        pool_outputs.append(pool_output) 
        lines = [] 
       if len(lines) > 0: 
        pool_output = pool.map(search, tasks) 
        pool_outputs.append(pool_output) 
    pool.close() 
    pool.join() 
    print('Pool:', pool_outputs) 
     """"" 

For now I use it like this:

with open(filename, 'r') as infile:
    for line in infile:
        print(search(pat, line))

Thank you for your time. Valeriy helped me with his addition, and I thank him.

But if anyone could indulge me for a moment, this is the code I had working for concurrent.futures (working off an example I saw somewhere):

from concurrent.futures import ProcessPoolExecutor, as_completed 
import math 

def search(pat, txt):
    patLen = len(pat)
    txtLen = len(txt)
    offsets = []

    # A loop to slide pattern[] one by one
    # Range generates numbers up to but not including that number
    for i in range((txtLen - patLen) + 1):
        # Can not use a for loop here
        # For loops in C with && statements must be
        # converted to while statements in python
        counter = 0
        while (counter < patLen) and pat[counter] == txt[counter + i]:
            counter += 1
            if counter >= patLen:
                offsets.append(i)
    return str(offsets).strip('[]')

# Check a list of strings
def chunked_worker(lines):
    return {0: search("fmo", line) for line in lines}


def pool_bruteforce(filename, nprocs):
    lines = []
    with open(filename) as f:
        lines = [line.rstrip('\n') for line in f]
    chunksize = int(math.ceil(len(lines) / float(nprocs)))
    futures = []

    with ProcessPoolExecutor() as executor:
        for i in range(nprocs):
            chunk = lines[(chunksize * i): (chunksize * (i + 1))]
            futures.append(executor.submit(chunked_worker, chunk))

    resultdict = {}
    for f in as_completed(futures):
        resultdict.update(f.result())
    return resultdict


filename = "file1.txt" 
pool_bruteforce(filename, 5) 

Thanks again to Valeriy and everyone who is trying to help me solve my mystery.

Answer

You are using several arguments, so use:

import multiprocessing 
from functools import partial 
filename = "file1.txt" 
pat = "afow" 
N = 1000 

""" This is the naive string search algorithm""" 

def search(pat, txt):
    patLen = len(pat)
    txtLen = len(txt)
    offsets = []

    # A loop to slide pattern[] one by one
    # Range generates numbers up to but not including that number
    for i in range((txtLen - patLen) + 1):
        # Can not use a for loop here
        # For loops in C with && statements must be
        # converted to while statements in python
        counter = 0
        while (counter < patLen) and pat[counter] == txt[counter + i]:
            counter += 1
            if counter >= patLen:
                offsets.append(i)
    return str(offsets).strip('[]')


if __name__ == "__main__":
    tasks = []
    pool_outputs = []
    pool = multiprocessing.Pool(processes=5)
    lines = []
    with open(filename, 'r') as infile:
        for line in infile:
            lines.append(line.rstrip())
    tasks = lines
    func = partial(search, pat)
    if len(lines) > N:
        pool_output = pool.map(func, lines)
        pool_outputs.append(pool_output)
    elif len(lines) > 0:
        pool_output = pool.map(func, lines)
        pool_outputs.append(pool_output)
    pool.close()
    pool.join()
    print('Pool:', pool_outputs)

Valeriy: Thanks. What does partial do, anyway? Do you know of a resource that covers parallel processing in Python thoroughly? Thanks again. – corax


https://docs.python.org/2/library/functools.html#functools.partial –


Valeriy: I read it but could not really understand it. Sorry, I meant a proper example of the function. Thanks. – corax
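
Following up on that last comment: functools.partial takes a function plus some arguments and returns a new callable with those arguments already filled in. That is all partial(search, pat) does in the answer above: it turns the two-argument search into a one-argument function that pool.map can call with each line. A minimal, self-contained illustration (the toy search below is a simplified stand-in, not the one from the post):

from functools import partial

def search(pat, txt):
    # simplified stand-in: return every index where pat occurs in txt
    return [i for i in range(len(txt) - len(pat) + 1)
            if txt[i:i + len(pat)] == pat]

find_afow = partial(search, "afow")    # first argument of search is now fixed

print(find_afow("xxafowyyafow"))       # [2, 8]
print(search("afow", "xxafowyyafow"))  # same call written the long way

With that, pool.map(partial(search, pat), lines) can hand each worker just a line, exactly as in the answer's func = partial(search, pat) followed by pool.map(func, lines).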