How to reduce the processing time of reading a file using numpy

I read a file, compare some values, find the indexes of the repeated ones, and delete those rows. I do this in a while loop, and it takes about 76 seconds. Either way, the while loop is what takes most of the time.

import numpy as np

Source = np.empty(shape=[0, 7])
Source = CalData  # CalData is the log file data
CalTab = np.empty(shape=[0, 7])
Source = Source[Source[:, 4].argsort()]  # Sort by Azimuth
while Source.size >= 1:
    # rows with the same AZ (column 4) and EL (column 5) as the first row
    temp = np.logical_and(Source[:, 4] == Source[0, 4], Source[:, 5] == Source[0, 5])
    selarrayindex = np.argwhere(temp)  # find indexes
    selarray = Source[temp]
    # keep the row with the largest column 6 value from this group
    CalTab = np.append(CalTab, [selarray[selarray[:, 6].argsort()][-1]], axis=0)
    Source = np.delete(Source, selarrayindex, axis=0)  # delete other rows with similar AZ, EL

That is my code. Please help me do this more efficiently, either with numpy or with another method (plain Python).

Source

2017-12-01 Tirumala

Have you tried looking at pandas or a similar library? – kshikama

@kshikama No. I want to use only numpy or plain Python (for example, using file operations to find the columns). – Tirumala

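I think you need to turn your question into a [mcve]. What goes into your algorithm (`CalData`) and what do you want out of it (`CalTab`)? What format, shape, size, and so on? All I can see from your code is an empty array of shape `(0,7)`, which does not tell us much. The array's `dtype` is especially important, because it drives how you do the operations in `numpy`. That said, I think this should improve your timing: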

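import numpy as np

def lex_pick(Source):
    # indices to sort by columns 4, then 5, then 6 (lexsort uses the last key as primary)
    idx = np.lexsort((Source[:, 6], Source[:, 5], Source[:, 4]))
    prev, nxt = Source[idx[:-1]], Source[idx[1:]]
    # if dtype = float
    same = np.isclose(prev[:, 4], nxt[:, 4]) & np.isclose(prev[:, 5], nxt[:, 5])
    # if dtype = int or string, use exact comparison instead:
    # same = (prev[:, 4] == nxt[:, 4]) & (prev[:, 5] == nxt[:, 5])
    # `mask` is `True` in rows before where column 4 or column 5 changes,
    # i.e. on the row with the largest column 6 of each (column 4, column 5) group
    mask = np.r_[np.logical_not(same), True]
    return Source[idx[mask], 6]  # use Source[idx[mask]] to get the full rows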
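
To check that the sorting idea gives the same result as the loop in the question, here is a minimal sketch that compares the two on made-up data. The names `pick_loop` and `pick_lexsort` and the random test array are assumptions for illustration, not part of the original post. The sketch uses exact equality, as the question's loop does; for noisy float values, `np.isclose` could be swapped in.

```python
import numpy as np

def pick_loop(source):
    """Reference: same logic as the while loop in the question."""
    source = source[source[:, 4].argsort()]
    out = np.empty(shape=[0, source.shape[1]])
    while source.size >= 1:
        temp = np.logical_and(source[:, 4] == source[0, 4],
                              source[:, 5] == source[0, 5])
        sel = source[temp]
        # keep the row with the largest column 6 value in this group
        out = np.append(out, [sel[sel[:, 6].argsort()][-1]], axis=0)
        source = source[~temp]
    return out

def pick_lexsort(source):
    """Vectorized sketch: one lexsort, then keep the last row of each group."""
    if source.shape[0] == 0:
        return source
    # sort by column 4, then 5, then 6 (last key given is the primary key)
    idx = np.lexsort((source[:, 6], source[:, 5], source[:, 4]))
    s = source[idx]
    # True where the next row starts a different (column 4, column 5) group
    last = np.r_[(s[:-1, 4] != s[1:, 4]) | (s[:-1, 5] != s[1:, 5]), True]
    return s[last]

# Hypothetical data: 200 rows, 7 columns; columns 4 and 5 take few distinct
# values so groups repeat, and column 6 is unique so each group has one maximum.
rng = np.random.default_rng(0)
data = np.column_stack([
    rng.random((200, 4)),
    rng.integers(0, 5, 200),
    rng.integers(0, 5, 200),
    rng.permutation(200),
]).astype(float)

loop_rows = pick_loop(data)
fast_rows = pick_lexsort(data)
```

The two results can come out in a different row order, because the loop's `argsort` on column 4 alone does not fix the order of column 5 within a group of equal column 4 values. Sort both by columns 4 and 5 before comparing them. The vectorized version avoids calling `np.append` and deleting rows on every iteration, each of which copies the whole array.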

Source

2017-12-01 07:26:06
