How to assign the result of a python-pandas groupby to an array?

Consider a dataframe df like this:

a  b  
2  nan 
3  nan 
3  nan 
4  nan 
4  nan 
4  nan 
5  nan 
5  nan 
5  nan 
5  nan 
...

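There is one important rule: each number n in column a is repeated on n-1 rows. My expected output is that, for the rows where a equals n, column b holds the numbers from 1 to n-1 in order.

I tried this: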

df.groupby('a').apply(lambda x: np.asarray(range(x['a'].unique()[0])))

However, each row of the result holds a whole list, which is not what I want.

Could you tell me how to implement this? Thanks in advance!
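To illustrate the behaviour described in the question, here is a minimal sketch of why a per-group apply yields one list per group, and how such lists could be flattened back into a column. The sample data is an assumption built from the stated rule (values 2 to 5, each n repeated n-1 times), the name per_group is hypothetical, and the flattening step only lines up with the rows because df is already sorted by a.

```python
import numpy as np
import pandas as pd

# Assumed sample data: each value n in column a appears on n-1 rows.
values = np.arange(2, 6)
df = pd.DataFrame({"a": np.repeat(values, values - 1)})

# apply returns one object per group, so the result has one row
# per distinct value of a, each holding a whole list.
per_group = df.groupby("a")["a"].apply(lambda s: list(range(1, len(s) + 1)))
print(per_group)

# Concatenating the lists gives one value per original row.
# This relies on df being sorted by a, as it is here.
df["b"] = np.concatenate(per_group.tolist())
print(df)
```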

Source

2016-09-28 ZICHAO LI

You need cumcount:

df['b'] = df.groupby('a').cumcount() + 1
print(df)
   a  b
0  2  1
1  3  1
2  3  2
3  4  1
4  4  2
5  4  3
6  5  1
7  5  2
8  5  3
9  5  4
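For anyone who wants to run this end to end, the following is a self-contained sketch. The construction of df with np.repeat is an assumption based on the rule in the question (values 2 to 5, each n repeated n-1 times); only the cumcount line comes from the answer itself.

```python
import numpy as np
import pandas as pd

# Assumed sample data: each value n in column a appears on n-1 rows.
values = np.arange(2, 6)
df = pd.DataFrame({"a": np.repeat(values, values - 1)})

# cumcount numbers the rows within each group 0, 1, 2, ...;
# adding 1 makes the numbering start at 1.
df["b"] = df.groupby("a").cumcount() + 1
print(df)
```

Because cumcount returns a Series aligned to the original index, the assignment does not depend on the rows being sorted by a.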

Source

2016-09-28 14:06:34 jezrael

# make a column that is 0 on the first occurrence of a number in a and 1 after
df['is_duplicated'] = df.duplicated(['a']).astype(int)

# group by values of a and get the cumulative sum of the duplicate flags
# add one since the first occurrence in each group has a cumulative sum of 0
df['b'] = df.groupby('a')['is_duplicated'].cumsum() + 1
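As a quick check, the sketch below compares this duplicated/cumsum approach with groupby().cumcount() on the same data. The sample dataframe is an assumption built from the rule in the question, and the names is_dup, b_cumsum and b_cumcount are hypothetical.

```python
import numpy as np
import pandas as pd

# Assumed sample data: each value n in column a appears on n-1 rows.
values = np.arange(2, 6)
df = pd.DataFrame({"a": np.repeat(values, values - 1)})

# 0 on the first occurrence of each value of a, 1 on every later one.
is_dup = df.duplicated("a").astype(int)

# Running total of the flags within each group, shifted to start at 1.
b_cumsum = is_dup.groupby(df["a"]).cumsum() + 1

# The same numbering via cumcount, for comparison.
b_cumcount = df.groupby("a").cumcount() + 1

print((b_cumsum == b_cumcount).all())  # True
```

Both routes give the same numbering here; cumcount simply avoids the helper column.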

Source

2016-09-28 14:04:13 scomes

Thanks! Great!
