Set up a data scenario that matches the description.
>>> import pandas as pd
>>> df = pd.DataFrame({'customer': ['John', 'John', 'Paula', 'John', 'Justin', 'Paula', 'Paula'],
...                    'date': ['1-1-17', '1-2-17', '1-3-17', '1-4-17', '1-5-17', '1-6-17', '1-7-17'],
...                    'restaurant': ['Freddys', 'Freddys', 'Jumpin Java', 'Freddys', 'Jumpin Java', 'Caffe Low', 'Kitchen 2']})
>>> df
   customer    date   restaurant
0      John  1-1-17      Freddys
1      John  1-2-17      Freddys
2     Paula  1-3-17  Jumpin Java
3      John  1-4-17      Freddys
4    Justin  1-5-17  Jumpin Java
5     Paula  1-6-17    Caffe Low
6     Paula  1-7-17    Kitchen 2
Create a function that reports the specified conditions.
def get_eating_pattern(df):
    # Check each customer's visits in turn.
    for name in df['customer'].unique():
        visits = df.loc[df['customer'] == name, 'restaurant']
        counts = visits.value_counts()
        # value_counts() sorts in descending order, so the first entry
        # is the count for the most-visited restaurant.
        if counts.iloc[0] == 3:
            print(name, 'went to the same restaurant 3 times.')
        total_visits = counts.sum()
        unique_rests = visits.nunique()
        # Use `and` here: `&` binds tighter than `==`, so
        # `total_visits == 3 & unique_rests == 3` does not mean what it looks like.
        if total_visits == 3 and unique_rests == 3:
            print(name, 'went to 3 different restaurants.')
Test the function and confirm that the output matches what we expect based on the contents of df.
>>> get_eating_pattern(df=df)
John went to the same restaurant 3 times.
Paula went to 3 different restaurants.
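If you would rather get the results back as data instead of printed lines, the same checks can be expressed with groupby. The following is only a sketch under a few assumptions: the helper name summarize_eating_patterns and the output column names are made up for illustration, and "3 times" is taken to mean exactly 3 visits.

```python
import pandas as pd


def summarize_eating_patterns(df):
    """Return one row per customer with visit statistics (hypothetical helper)."""
    # Number of visits for each (customer, restaurant) pair.
    per_restaurant = df.groupby(['customer', 'restaurant']).size()
    summary = pd.DataFrame({
        # Total number of visits per customer.
        'total_visits': df.groupby('customer').size(),
        # Number of distinct restaurants per customer.
        'unique_rests': df.groupby('customer')['restaurant'].nunique(),
        # Largest number of visits to any single restaurant.
        'max_same_rest': per_restaurant.groupby(level='customer').max(),
    })
    # Assumption: "3 times" means exactly 3 visits to one restaurant.
    summary['same_restaurant_3x'] = summary['max_same_rest'] == 3
    # Parentheses are required because `&` binds tighter than `==`.
    summary['three_different'] = (
        (summary['total_visits'] == 3) & (summary['unique_rests'] == 3)
    )
    return summary


df = pd.DataFrame({
    'customer': ['John', 'John', 'Paula', 'John', 'Justin', 'Paula', 'Paula'],
    'date': ['1-1-17', '1-2-17', '1-3-17', '1-4-17', '1-5-17', '1-6-17', '1-7-17'],
    'restaurant': ['Freddys', 'Freddys', 'Jumpin Java', 'Freddys',
                   'Jumpin Java', 'Caffe Low', 'Kitchen 2'],
})
summary = summarize_eating_patterns(df)
print(summary[['same_restaurant_3x', 'three_different']])
```

Returning a DataFrame makes the flags easy to filter or join back onto other data, at the cost of being a little less direct than the loop.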
This got me on the right track! df_out = df.groupby('customer').agg({'customer': 'size', 'date': 'nunique'}).rename(columns={'customer': 'Num_Visits', 'date': 'Num_Dates'}) – warvolin