Python dataframe: create a new column based on the values of a string column and a float column

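I have a Python dataframe. The "Flag" field is the desired column that I want to create with code.

I want to do the following: if "Allocation Type" is Predicted and "Activities_Counter" is 10 or greater, I want to create a new column called "Flag" and label the row with 'Flag'. Otherwise, leave the Flag row blank.

I use the following code to flag rows where "Activities_Counter" is 10 or greater, but I don't know how to incorporate the "Allocation Type" criterion into my code.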

Flag = []

for row in df_HA_noHA_act['Activities_Counter']:
    if row >= 10:
        Flag.append('Flag')
    else:
        Flag.append('')

df_HA_noHA_act['Flag'] = Flag

Thank you for your help.

Source

2017-05-22 PineNuts0

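You need to add the new condition with &. It is also faster to use numpy.where:

import numpy as np

# The outer parentheses let the expression continue onto the next line.
mask = ((df_HA_noHA_act["Allocation Type"] == 'Predicted') &
        (df_HA_noHA_act['Activities_Counter'] >= 10))
df_HA_noHA_act['Flag'] = np.where(mask, 'Flag', '')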
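As a side note on why `&` is used rather than `and`, here is a minimal sketch. The two-row frame `df` and the names `is_predicted` and `is_active` are invented for this illustration and do not come from the question. Each comparison is wrapped in parentheses because `&` binds more tightly than `==` and `>=`.

```python
import pandas as pd

# Hypothetical two-row frame, for illustration only.
df = pd.DataFrame({
    "Allocation Type": ["Predicted", "Historical"],
    "Activities_Counter": [12, 12],
})

is_predicted = (df["Allocation Type"] == "Predicted")
is_active = (df["Activities_Counter"] >= 10)

# `&` combines the two boolean Series element by element.
mask = is_predicted & is_active
print(mask.tolist())

# `and` needs one truth value for a whole Series, which pandas refuses to give.
and_failed = False
try:
    is_predicted and is_active
except ValueError as exc:
    and_failed = True
    print("ValueError:", exc)
```

The row-wise loop can use `and` because there each operand is a single value, not a whole column.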

Sample:

import numpy as np
import pandas as pd

df_HA_noHA_act = pd.DataFrame({'Activities_Counter': [10, 2, 6, 15, 11, 18],
                               'Allocation Type': ['Historical', 'Historical', 'Predicted',
                                                   'Predicted', 'Predicted', 'Historical']})
print(df_HA_noHA_act)
   Activities_Counter Allocation Type
0                  10      Historical
1                   2      Historical
2                   6       Predicted
3                  15       Predicted
4                  11       Predicted
5                  18      Historical

mask = ((df_HA_noHA_act["Allocation Type"] == 'Predicted') &
        (df_HA_noHA_act['Activities_Counter'] >= 10))
df_HA_noHA_act['Flag'] = np.where(mask, 'Flag', '')
print(df_HA_noHA_act)
   Activities_Counter Allocation Type  Flag
0                  10      Historical
1                   2      Historical
2                   6       Predicted
3                  15       Predicted  Flag
4                  11       Predicted  Flag
5                  18      Historical
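The same boolean mask can also drive an assignment through `DataFrame.loc`. This is an alternative sketch, not part of the original answer, and the shorter name `df` is used here only for readability. It may be convenient when the default value is already in place and only the matching rows need to change.

```python
import pandas as pd

# Same sample data as above, under a shorter (illustrative) name.
df = pd.DataFrame({
    "Activities_Counter": [10, 2, 6, 15, 11, 18],
    "Allocation Type": ["Historical", "Historical", "Predicted",
                        "Predicted", "Predicted", "Historical"],
})

mask = (df["Allocation Type"] == "Predicted") & (df["Activities_Counter"] >= 10)

df["Flag"] = ""                 # default: blank for every row
df.loc[mask, "Flag"] = "Flag"   # overwrite only rows where both conditions hold

print(df)
```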

Slow solution with a loop:

Flag = []
for i, row in df_HA_noHA_act.iterrows():
    if (row['Activities_Counter'] >= 10) and (row["Allocation Type"] == 'Predicted'):
        Flag.append('Flag')
    else:
        Flag.append('')
df_HA_noHA_act['Flag'] = Flag
print(df_HA_noHA_act)
   Activities_Counter Allocation Type  Flag
0                  10      Historical
1                   2      Historical
2                   6       Predicted
3                  15       Predicted  Flag
4                  11       Predicted  Flag
5                  18      Historical

Timings:

df_HA_noHA_act = pd.DataFrame({'Activities_Counter': [10, 2, 6, 15, 11, 18],
                               'Allocation Type': ['Historical', 'Historical', 'Predicted',
                                                   'Predicted', 'Predicted', 'Historical']})
# Repeat the 6 sample rows 1000 times to get a larger frame for timing.
df_HA_noHA_act = pd.concat([df_HA_noHA_act] * 1000).reset_index(drop=True)
print(df_HA_noHA_act)
# [6000 rows x 2 columns]

In [187]: %%timeit
     ...: df_HA_noHA_act['Flag1'] = np.where((df_HA_noHA_act["Allocation Type"] == 'Predicted') & (df_HA_noHA_act['Activities_Counter'] >= 10), 'Flag', '')
     ...:
100 loops, best of 3: 1.89 ms per loop

In [188]: %%timeit
     ...: Flag = []
     ...: for i, row in df_HA_noHA_act.iterrows():
     ...:     if (row['Activities_Counter'] >= 10) and (row["Allocation Type"] == 'Predicted'):
     ...:         Flag.append('Flag')
     ...:     else:
     ...:         Flag.append('')
     ...: df_HA_noHA_act['Flag'] = Flag
     ...:
1 loop, best of 3: 381 ms per loop
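`%%timeit` is only available inside IPython or Jupyter. A rough equivalent with the standard-library `timeit` module might look like the sketch below. The function names `vectorized` and `looped` and the repeat count of 3 are choices made for this illustration, and the measured times will differ from machine to machine.

```python
import timeit

import numpy as np
import pandas as pd

base = pd.DataFrame({
    "Activities_Counter": [10, 2, 6, 15, 11, 18],
    "Allocation Type": ["Historical", "Historical", "Predicted",
                        "Predicted", "Predicted", "Historical"],
})
df = pd.concat([base] * 1000).reset_index(drop=True)  # 6000 rows


def vectorized(frame):
    """Build the flags for all rows at once with numpy.where."""
    mask = (frame["Allocation Type"] == "Predicted") & (frame["Activities_Counter"] >= 10)
    return np.where(mask, "Flag", "")


def looped(frame):
    """Build the flags one row at a time with iterrows."""
    flags = []
    for _, row in frame.iterrows():
        if row["Activities_Counter"] >= 10 and row["Allocation Type"] == "Predicted":
            flags.append("Flag")
        else:
            flags.append("")
    return flags


runs = 3  # illustrative repeat count
t_vec = timeit.timeit(lambda: vectorized(df), number=runs) / runs
t_loop = timeit.timeit(lambda: looped(df), number=runs) / runs

print(f"numpy.where: {t_vec * 1e3:.2f} ms per call")
print(f"iterrows:    {t_loop * 1e3:.2f} ms per call")
```

Both functions return the same flags, so the comparison measures only how the work is done, not what is computed.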

Source

2017-05-22 10:13:30 jezrael

Worked perfectly! Thank you :) – PineNuts0

Is timing a component of computer science that lets you make your code run faster? – PineNuts0

I think it is the fastest solution; I tested it on my PC. – jezrael
