2016-11-18

python3 pandas: extract a number of rows from each group of a dataframe grouped by a column (e.g. name)

I have a dataframe called df, as follows:

name id age    text 
a  1  1 very good, and I like him 
b  2  2 I play basketball with his brother 
c  3  3 I hope to get a offer 
d  4  4 everything goes well, I think 
a  1  1 I will visit china 
b  2  2 no one can understand me, I will solve it 
c  3  3 I like followers 
d  4  4 maybe I will be good 
a  1  1 I should work hard to finish my research 
b  2  2 water is the source of earth, I agree it 
c  3  3 I hope you can keep in touch with me 
d  4  4 My baby is very cute, I like him 

I want to group the dataframe by name and extract a few rows from each group (for example, the first 2) into a new dataframe (df_new):

name id age    text 
a  1  1 very good, and I like him 
a  1  1 I will visit china 
b  2  2 I play basketball with his brother 
b  2  2 no one can understand me, I will solve it 
c  3  3 I hope to get a offer 
c  3  3 I like followers 
d  4  4 everything goes well, I think 
d  4  4 maybe I will be good 



I tried:

    df_new = (df.groupby('name'))[0:2]

But I get an error:

hash(key) 
    TypeError: unhashable type: 'slice' 

Answers


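Try head() instead of the [0:2] slice. Indexing a GroupBy object with [] selects columns, so the slice is treated as a column key, which is what raises the TypeError. head(2) returns the first two rows of each group: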

import pandas as pd 
from io import StringIO 

buff = StringIO(''' 
name,id,age,text 
a,1,1,"very good, and I like him" 
b,2,2,I play basketball with his brother 
c,3,3,I hope to get a offer 
d,4,4,"everything goes well, I think" 
a,1,1,I will visit china 
b,2,2,"no one can understand me, I will solve it" 
c,3,3,I like followers 
d,4,4,maybe I will be good 
a,1,1,I should work hard to finish my research 
b,2,2,"water is the source of earth, I agree it" 
c,3,3,I hope you can keep in touch with me 
d,4,4,"My baby is very cute, I like him" 
''') 
df = pd.read_csv(buff) 

Then sort by name:

df_new = df.groupby('name').head(2).sort_values('name') 
print(df_new) 
  name  id  age                                       text
0    a   1    1                  very good, and I like him
4    a   1    1                         I will visit china
1    b   2    2         I play basketball with his brother
5    b   2    2  no one can understand me, I will solve it
2    c   3    3                      I hope to get a offer
6    c   3    3                           I like followers
3    d   4    4              everything goes well, I think
7    d   4    4                       maybe I will be good
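
To make the difference between [] and head() concrete, here is a minimal sketch on a made-up miniature of the data (the two-column frame and the variable names are my own, not from the question). It assumes that a stable sort is wanted so that rows keep their original order inside each group; kind="mergesort" is one way to ask for that.

```python
import pandas as pd

# Hypothetical miniature of the question's data: three rows per name.
df = pd.DataFrame({
    "name": ["a", "b", "a", "b", "a", "b"],
    "text": ["a1", "b1", "a2", "b2", "a3", "b3"],
})
grouped = df.groupby("name")

# Indexing a GroupBy with [] picks columns, so a column label works here.
first_texts = grouped["text"].head(2)

# head(2) is the row-wise "first n rows per group" operation.
# It keeps the original row order and the original index labels.
first_rows = grouped.head(2)

# A stable sort brings the rows of each name together without
# reordering the rows inside a group.
df_new = first_rows.sort_values("name", kind="mergesort")
print(df_new)
```

If the original index labels are not needed, chaining reset_index(drop=True) at the end renumbers the rows from 0.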

Another solution, using iloc:

df_new = df.groupby('name').apply(lambda x: x.iloc[:2]).reset_index(drop=True) 
print(df_new) 
  name  id  age                                       text
0    a   1    1                  very good, and I like him
1    a   1    1                         I will visit china
2    b   2    2         I play basketball with his brother
3    b   2    2  no one can understand me, I will solve it
4    c   3    3                      I hope to get a offer
5    c   3    3                           I like followers
6    d   4    4              everything goes well, I think
7    d   4    4                       maybe I will be good
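
A possible variant that avoids apply altogether is to number the rows inside each group with cumcount() and keep those below a threshold. This is only a sketch on a made-up miniature of the data; the frame, the variable n and the name position are my own assumptions, not part of either answer.

```python
import pandas as pd

# Hypothetical miniature of the question's data: three rows per name.
df = pd.DataFrame({
    "name": ["a", "b", "a", "b", "a", "b"],
    "text": ["a1", "b1", "a2", "b2", "a3", "b3"],
})

n = 2  # number of rows to keep per group (assumed value)

# cumcount() numbers the rows of each group 0, 1, 2, ... in original order.
position = df.groupby("name").cumcount()

# Keep the rows whose position inside their group is below n, then
# bring each group together with a stable sort and renumber the index.
df_new = (
    df[position < n]
    .sort_values("name", kind="mergesort")
    .reset_index(drop=True)
)
print(df_new)
```

The boolean mask keeps every column of the original frame, including the grouping column, and tends to be cheaper than calling a Python lambda once per group.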