KeyError when applying a lambda function to a pandas DataFrame

I am applying K-means clustering to a pandas DataFrame. When I apply a lambda function to the DataFrame, I get the following KeyError:

   1945     return self._engine.get_loc(key)
   1946 except KeyError:
-> 1947     return self._engine.get_loc(self._maybe_cast_indexer(key))
   1948
   1949 indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4154)() 

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4018)() 

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item  (pandas\hashtable.c:12368)() 

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)() 

KeyError: (0, 'occurred at index 0')

Can someone please explain the reason for this error? It occurs while using the lambda function. The cluster assignment function is as follows:

def assign_to_cluster(row):
    lowest_distance = -1
    closest_cluster = -1

    for cluster_id, centroid in centroids_dict.items():
        df_row = [row['PPG'], row['ATR']]
        euclidean_distance = calculate_distance(centroids, df_row)

        if lowest_distance == -1:
            lowest_distance = euclidean_distance
            closest_cluster = cluster_id
        elif euclidean_distance < lowest_distance:
            lowest_distance = euclidean_distance
            closest_cluster = cluster_id
    return closest_cluster

point_guards['CLUSTER'] = point_guards.apply(lambda row: assign_to_cluster(row), axis=1)

How can I resolve this? If any additional information is needed, please reply to this post. Also, I apologize for the formatting; this is my first time asking a question on StackOverflow.

Source

2017-02-21 Aditya Gogoi

What does point_guards.head() show? – putonspectacles

See: http://stackoverflow.com/questions/16353729/pandas-how-to-use-apply-function-to-multiple-columns – putonspectacles

@putonspectacles: point_guards is the name of the pandas DataFrame I am working on. The head() function prints the first 10 rows of the DataFrame. At least, that is what I think it does.
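A side note on the head() remark in the comment above: as far as I know, head() returns the first 5 rows unless you pass a count. Here is a minimal sketch to check that; the DataFrame is a made-up stand-in, since the real point_guards data is not shown in this thread.

```python
import pandas as pd

# Hypothetical stand-in for point_guards; the real data is not shown in the thread.
point_guards = pd.DataFrame({"PPG": range(12), "ATR": range(12)})

first_default = point_guards.head()   # n defaults to 5
first_ten = point_guards.head(10)     # ask for 10 rows explicitly

print(len(first_default), len(first_ten))
```

Posting the output of point_guards.head() in the question would let others see the column names and dtypes.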

It turned out that I had made a simple mistake. While looping over centroids_dict.items(), I passed centroids to calculate_distance instead of the loop variable centroid. That is, instead of:

for cluster_id, centroid in centroids_dict.items():
    df_row = [row['PPG'], row['ATR']]
    euclidean_distance = calculate_distance(centroids, df_row)

I should have written:

for cluster_id, centroid in centroids_dict.items():
    df_row = [row['PPG'], row['ATR']]
    euclidean_distance = calculate_distance(centroid, df_row)
....

With centroid in place of centroids, the problem is resolved.
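For anyone hitting the same KeyError, here is a minimal sketch of why the wrong variable can fail this way. It rests on several assumptions, because the post does not show the data or the helper: I assume centroids is a DataFrame, centroids_dict maps each cluster id to a [PPG, ATR] list, and calculate_distance indexes its first argument by position. Under those assumptions, centroids[0] looks for a column labelled 0, which does not exist, so pandas raises KeyError: 0. The 'occurred at index 0' part appears to be added by apply in older pandas versions to name the row being processed. The data below is made up, and the loop is condensed with min(), which should pick the same cluster as the original loop.

```python
import math
import pandas as pd

# Hypothetical data: the real point_guards DataFrame is not shown in the post.
point_guards = pd.DataFrame({
    "PPG": [10.0, 20.0, 11.0, 19.0],
    "ATR": [2.0, 3.0, 2.1, 2.9],
})

# Assumed shapes: `centroids` is a DataFrame of starting points, and
# `centroids_dict` maps a cluster id to that centroid's [PPG, ATR] values.
centroids = point_guards.iloc[[0, 1]]
centroids_dict = {
    cluster_id: [row["PPG"], row["ATR"]]
    for cluster_id, (_, row) in enumerate(centroids.iterrows())
}

def calculate_distance(centroid, values):
    """Euclidean distance between two coordinate lists (assumed helper)."""
    return math.sqrt(
        sum((centroid[i] - values[i]) ** 2 for i in range(len(values)))
    )

def assign_to_cluster(row):
    df_row = [row["PPG"], row["ATR"]]
    # Pass the loop variable `centroid`, not the `centroids` DataFrame.
    distances = {
        cluster_id: calculate_distance(centroid, df_row)
        for cluster_id, centroid in centroids_dict.items()
    }
    # Return the id of the cluster with the smallest distance.
    return min(distances, key=distances.get)

point_guards["CLUSTER"] = point_guards.apply(assign_to_cluster, axis=1)

# The original bug: integer-indexing a DataFrame looks up a column label,
# and there is no column named 0, so a KeyError is raised.
try:
    calculate_distance(centroids, [10.0, 2.0])
    bug_raises_key_error = False
except KeyError:
    bug_raises_key_error = True
```

Passing the function directly to apply, as in point_guards.apply(assign_to_cluster, axis=1), behaves the same as wrapping it in a lambda.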

Source

2017-02-21 15:35:33
