How do I calculate the mean of a DataFrame stored inside a Series object (pandas)?

I have a DataFrame with the following structure:

df.columns 
Index(['first_post_date', 'followers_count', 'friends_count', 
     'last_post_date','min_retweet', 'retweet_count', 'screen_name', 
     'tweet_count', 'tweet_with_max_retweet', 'tweets', 'uid'], 
     dtype='object')

Inside the tweets Series, each cell is another DataFrame containing all of that user's tweets.

df.tweets[0].columns 
Index(['created_at', 'id', 'retweet_count', 'text'], dtype='object')

I want to run calculations on each user's tweets.

For example, how can I find each user's average retweet count and the tweet with the maximum retweet count?

Source

2017-05-26 Rakib

Use [DataFrame.groupby](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.groupby.html)
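The comment above does not say how groupby would apply to nested DataFrames. One possible reading, sketched here under assumptions, is to stack the per-user frames into a single long frame with pd.concat and then group by user. The sample data and the names flat and stats are hypothetical and not from the original post.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's layout:
# one row per user, with a DataFrame of tweets in the 'tweets' column.
df = pd.DataFrame({
    'screen_name': ['alice', 'bob'],
    'tweets': [
        pd.DataFrame({'id': [0, 1], 'retweet_count': [5, 10]}),
        pd.DataFrame({'id': [2, 3], 'retweet_count': [55, 100]}),
    ],
})

# Stack the nested frames into one long frame. The dict keys become the
# outer index level, so every tweet row stays tied to its user.
flat = pd.concat(dict(zip(df['screen_name'], df['tweets'])),
                 names=['screen_name', 'row'])

# Group by the user level and aggregate the retweet counts.
stats = flat.groupby(level='screen_name')['retweet_count'].agg(['mean', 'max'])
print(stats)
```

This assumes screen_name is unique per row; duplicate keys would merge two users' tweets into one group.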

Maybe something like this will help:

import pandas as pd

# Sample data: one row per user; each 'tweets' cell holds that user's tweets
df = pd.DataFrame({'id': [0, 1, 2],
                   'tweets': [pd.DataFrame({'id': [0, 1], 'retweet_count': [5, 10]}),
                              pd.DataFrame({'id': [2, 3], 'retweet_count': [55, 100]}),
                              pd.DataFrame({'id': [4, 5], 'retweet_count': [5555, 1000]})]})

# Compute the max and mean retweet count for each user's tweets
stats = df['tweets'].apply(lambda x: pd.Series([x.retweet_count.max(),
                                                x.retweet_count.mean()],
                                               index=['max', 'mean']))

The result is a DataFrame with one row per user, whose columns hold that user's statistics.

      max    mean
0    10.0     7.5
1   100.0    77.5
2  5555.0  3277.5
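The statistics above give the maximum retweet count but not the tweet that achieved it, which the question also asks about. A minimal sketch of one way to retrieve that row uses idxmax on each nested frame. The sample data, the text values, and the helper name top_tweet are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the 'text' values are placeholders.
df = pd.DataFrame({
    'uid': [1, 2],
    'tweets': [
        pd.DataFrame({'id': [0, 1], 'retweet_count': [5, 10], 'text': ['a', 'b']}),
        pd.DataFrame({'id': [2, 3], 'retweet_count': [100, 55], 'text': ['c', 'd']}),
    ],
})

def top_tweet(tweets):
    # idxmax returns the index label of the first row with the largest
    # retweet_count; loc then pulls that whole row out as a Series.
    return tweets.loc[tweets['retweet_count'].idxmax()]

# Because top_tweet returns a Series, apply assembles the results into
# a DataFrame with one row per user.
top = df['tweets'].apply(top_tweet)
print(top)
```

This sketch assumes every user has at least one tweet; idxmax raises an error on an empty frame, so empty cells would need to be filtered out first.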

Source

2017-05-26 18:29:35
