は、私はあなたがmap
が必要だと思う:
A = pd.DataFrame({'Name':['John','Max','Joe']})
print (A)
Name
0 John
1 Max
2 Joe
d = {'John':20,'Max':25,'Jack':30}
A1 = A.copy(deep=True)
A1['Age'] = A.Name.map(d)
print (A1)
Name Age
0 John 20.0
1 Max 25.0
2 Joe NaN
もし必要機能:
d = {'John':20,'Max':25,'Jack':30}
def age(df):
new_df = df.copy(deep=True)
new_df['Age'] = new_df.Name.map(d)
return new_df
new_A = age(A)
print (new_A)
Name Age
0 John 20.0
1 Max 25.0
2 Joe NaN
タイミング:
In [191]: %timeit (age(A))
10 loops, best of 3: 21.8 ms per loop
In [192]: %timeit (jul(A))
10 loops, best of 3: 47.6 ms per loop
コードのタイミングについて:
A = pd.DataFrame({'Name':['John','Max','Joe']})
#[300000 rows x 2 columns]
A = pd.concat([A]*100000).reset_index(drop=True)
print (A)
d = {'John':20,'Max':25,'Jack':30}
def age(df):
new_df = df.copy(deep=True)
new_df['Age'] = new_df.Name.map(d)
return new_df
def jul(A):
df = pd.DataFrame({'Name': list(d.keys()), 'Age': list(d.values())})
A1 = pd.merge(A, df, how='left')
return A1
A = pd.DataFrame({'Name':['John','Max','Joe']})
#[300 rows x 2 columns]
A = pd.concat([A]*100).reset_index(drop=True)
In [194]: %timeit (age(A))
The slowest run took 5.22 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 742 µs per loop
In [195]: %timeit (jul(A))
The slowest run took 4.51 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 1.87 ms per loop
驚くほど宜しくお願い致します。そのコードは即座に動作します – Asim