Error when creating a pivot_table with pandas

I have a data frame whose head looks like this, and I want to build a pivot_table from it.

   user_id   item_id  cate_id  action_type  action_date
0  11482147  492681   1_11     view         15
1  12070750  457406   1_14     deep_view    15
2  12431632  527476   1_1      view         15
3  13397746  531771   1_6      deep_view    15
4  13794253  510089   1_27     deep_view    15

There are 20000+ user_id values, 37 cate_id values, and 5 action_type values. I want a pivot_table like the one I made in Excel. The values in the table should be the value counts for every cate_id of every user_id. I tried the following code.

user_cate_table = pd.pivot_table(user_cate_table2,index = ['user_id','cate_id'],columns=np.unique(train['action_type']),values='action_type',aggfunc=np.count_nonzero,fill_value=0)

I got this message:

ValueError: Grouper and axis must be same length

The head of the data frame user_cate_table2:

   user_id   item_id  cate_id  action_type
0  11482147  492681   1_11     1.0
1  12070750  457406   1_14     2.0
2  12431632  527476   1_1      1.0
3  13397746  531771   1_6      2.0
4  13794253  510089   1_27     2.0
5  14378544  535335   1_6      2.0
6  1705634   535202   1_10     1.0
7  6943823   478183   1_3      2.0
8  5902475   524378   1_6      1.0

Source

2017-05-31 Chunk_Ning

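Have I misunderstood? Do the unique column values come from a different data frame? – jezrael

Yes, it is the original data frame, named train. It is the one shown above. –

So you have two data frames and need a pivot table? Can you add a sample of the second data frame and the desired output? Thank you. – jezrael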

I think you need groupby + size + unstack:

df1 = df.groupby(['user_id','cate_id', 'action_type']).size().unstack(fill_value=0) 
print (df1) 
action_type        deep_view  view
user_id  cate_id
11482147 1_11              0     1
12070750 1_14              1     0
12431632 1_1               0     1
13397746 1_6               1     0
13794253 1_27              1     0
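
For anyone who wants to run this, here is a minimal self-contained sketch of the same groupby + size + unstack idea. It assumes the five sample rows from the question; the variable name counts is only illustrative.

```python
import pandas as pd

# Sample rows copied from the head of the data frame in the question.
df = pd.DataFrame({
    "user_id": [11482147, 12070750, 12431632, 13397746, 13794253],
    "item_id": [492681, 457406, 527476, 531771, 510089],
    "cate_id": ["1_11", "1_14", "1_1", "1_6", "1_27"],
    "action_type": ["view", "deep_view", "view", "deep_view", "deep_view"],
    "action_date": [15, 15, 15, 15, 15],
})

# Count rows per (user_id, cate_id, action_type), then move action_type
# from the row index into the columns; missing combinations become 0.
counts = (
    df.groupby(["user_id", "cate_id", "action_type"])
      .size()
      .unstack(fill_value=0)
)
print(counts)
```

size() counts rows in each group regardless of their contents, so no values column has to be chosen.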

Another solution, with pivot_table:

df1 = df.pivot_table(index=['user_id','cate_id'], 
        columns='action_type', 
        values='item_id', 
        aggfunc=len, 
        fill_value=0) 
print (df1) 
action_type        deep_view  view
user_id  cate_id
11482147 1_11              0     1
12070750 1_14              1     0
12431632 1_1               0     1
13397746 1_6               1     0
13794253 1_27              1     0
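
As a quick check that pivot_table with aggfunc=len and groupby + size count the same thing, here is a small sketch. It assumes the sample rows from the question plus one hypothetical extra row (item_id 999999 is made up) so that at least one count is greater than 1.

```python
import pandas as pd

# Sample rows from the question; the last row is a hypothetical repeat of
# the first user/category/action, added so one cell has a count of 2.
df = pd.DataFrame({
    "user_id": [11482147, 12070750, 12431632, 13397746, 13794253, 11482147],
    "item_id": [492681, 457406, 527476, 531771, 510089, 999999],
    "cate_id": ["1_11", "1_14", "1_1", "1_6", "1_27", "1_11"],
    "action_type": ["view", "deep_view", "view", "deep_view", "deep_view", "view"],
})

# pivot_table: columns= takes the column label; len counts rows per cell.
pivoted = df.pivot_table(
    index=["user_id", "cate_id"],
    columns="action_type",
    values="item_id",
    aggfunc=len,
    fill_value=0,
)

# Equivalent groupby formulation.
grouped = (
    df.groupby(["user_id", "cate_id", "action_type"])
      .size()
      .unstack(fill_value=0)
)

same = bool((pivoted.to_numpy() == grouped.to_numpy()).all())
print(pivoted)
print("same counts:", same)
```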

Source

2017-05-31 13:11:55 jezrael

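I ended up with the same answer as @jezrael. It was posted while I was editing mine. –

@MaartenFabré - Thank you. I checked your answer and it is slightly different, but I am not sure whether the OP wants my solution or something like yours. – jezrael

I think the only differences are that you use '.size' where I use 'np.count_nonzero', which gives the same values here, and that I did not use 'fill_value'. The rest is equivalent. –

You don't need pivot_table. You can use groupby and unstack: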

df.groupby(['user_id', 'cate_id', 'action_type'])['action_date'].agg(np.count_nonzero).unstack('action_type')

pivot_table works too, but the columns= parameter should be the column name rather than np.unique(train['action_type']):

pd.pivot_table(df,index = ['user_id','cate_id'],columns=['action_type'],aggfunc=np.count_nonzero,fill_value=0)
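
The pandas documentation says that an array passed to columns= must have the same length as the data, which is probably why the array of unique values fails. If the goal of np.unique(...) was to get one column per action type even when a type never occurs, one possible approach is to pivot on the column name and then reindex the columns. This is only a sketch: the all_types list and the "purchase" label are hypothetical, since the question does not list all five action types.

```python
import pandas as pd

# Sample rows from the question, standing in for the full train data frame.
train = pd.DataFrame({
    "user_id": [11482147, 12070750, 12431632, 13397746, 13794253],
    "item_id": [492681, 457406, 527476, 531771, 510089],
    "cate_id": ["1_11", "1_14", "1_1", "1_6", "1_27"],
    "action_type": ["view", "deep_view", "view", "deep_view", "deep_view"],
})

# Pass the column label to columns=, not the array of its unique values.
table = pd.pivot_table(
    train,
    index=["user_id", "cate_id"],
    columns="action_type",
    values="item_id",
    aggfunc="count",
    fill_value=0,
)

# Hypothetical full list of action types; "purchase" does not occur in the
# sample, so reindex adds it as a column filled with 0.
all_types = ["deep_view", "view", "purchase"]
table = table.reindex(columns=all_types, fill_value=0)
print(table)
```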

Source

2017-05-31 13:18:03
