Error when calculating the mode of a DataFrameGroupBy object

I have a DataFrame with a Date column. I can group the data by year and compute the mean and median, but how do I compute the mode? This is the error I get:

>>> np.random.seed(0) 
>>> rng = pd.date_range('2010-01-01', periods=10, freq='2M') 
>>> df = pd.DataFrame({ 'Date': rng, 'Val': np.random.random_integers(0,100,size=10) }) 
>>> df 
     Date Val 
0 2010-01-31 44 
1 2010-03-31 47 
2 2010-05-31 64 
3 2010-07-31 67 
4 2010-09-30 67 
5 2010-11-30 9 
6 2011-01-31 83 
7 2011-03-31 21 
8 2011-05-31 36 
9 2011-07-31 87 
>>> df.groupby(pd.Grouper(key='Date',freq='A')).mean() 
        Val 
Date     
2010-12-31 49.666667 
2011-12-31 56.750000 
>>> df.groupby(pd.Grouper(key='Date',freq='A')).median() 
      Val 
Date    
2010-12-31 55.5 
2011-12-31 59.5 
>>> df.groupby(pd.Grouper(key='Date',freq='A')).mode() 

Traceback (most recent call last): 
    File "<pyshell#109>", line 1, in <module> 
    df.groupby(pd.Grouper(key='Date',freq='A')).mode() 
    File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 554, in __getattr__ 
    return self._make_wrapper(attr) 
    File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 571, in _make_wrapper 
    raise AttributeError(msg) 
AttributeError: Cannot access callable attribute 'mode' of 'DataFrameGroupBy' objects, try using the 'apply' method

Source

2017-01-02 user2314737

mode is not implemented for DataFrameGroupBy objects; use a custom function, for example with np.apply_along_axis: https://github.com/pandas-dev/pandas/issues/6978. – TobiasWeis

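Use np.unique with the return_counts parameter.
Use argmax on the counts array to pick the corresponding value from the unique array. Putting that together, mode can be defined as follows: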

def mode(a): 
    u, c = np.unique(a, return_counts=True) 
    return u[c.argmax()] 

df.groupby(pd.Grouper(key='Date',freq='A')).Val.apply(mode) 

Date 
2010-12-31 67 
2011-12-31 21 
Freq: A-DEC, Name: Val, dtype: int64
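
The 2011 result may look surprising, since no value repeats in that year. A minimal sketch of why, using the Val values from the question as plain lists (the names vals_2010 and vals_2011 are introduced here for illustration only): np.unique returns the distinct values in sorted order and argmax returns the first maximum, so a tie goes to the smallest value.

```python
import numpy as np

def mode(a):
    # Distinct values (sorted ascending) and how often each occurs
    u, c = np.unique(a, return_counts=True)
    # argmax returns the first maximum, so ties resolve to the smallest value
    return u[c.argmax()]

# Val values per year, copied from the question's DataFrame
vals_2010 = [44, 47, 64, 67, 67, 9]
vals_2011 = [83, 21, 36, 87]

print(mode(vals_2010))  # 67 occurs twice
print(mode(vals_2011))  # every value occurs once, so the smallest (21) is returned
```

If a tie should be reported differently, the function would need to return every value whose count equals the maximum count instead of a single element.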

Source

2017-01-02 17:35:08 piRSquared

mode is not one of the built-in functions that work automatically with pandas groupby objects. You can use the scipy.stats module, although that feels a bit clunky.

from scipy import stats 

df.groupby(pd.Grouper(key='Date',freq='A')).apply(stats.mode)

Alternatively, you can use value_counts() within each group and take the first index value it returns. This is the route I would take.

df.groupby(pd.Grouper(key='Date', freq='A')).Val.apply(lambda s: s.value_counts().index[0])
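
As a further sketch that is not taken from any of the answers here: a pandas Series has its own mode() method, which returns all tied values in sorted order, so it can be called per group through agg or apply. The data below is the question's DataFrame rebuilt by hand. Grouping by df['Date'].dt.year instead of pd.Grouper is an assumption made for this example, so the index holds plain years rather than year-end dates. The names by_year, first_mode and all_modes are illustrative.

```python
import pandas as pd

# The question's data, written out explicitly
df = pd.DataFrame({
    'Date': pd.to_datetime([
        '2010-01-31', '2010-03-31', '2010-05-31', '2010-07-31', '2010-09-30',
        '2010-11-30', '2011-01-31', '2011-03-31', '2011-05-31', '2011-07-31',
    ]),
    'Val': [44, 47, 64, 67, 67, 9, 83, 21, 36, 87],
})

# Group the Val column by calendar year
by_year = df.groupby(df['Date'].dt.year)['Val']

# One value per year: the smallest of the tied modes
first_mode = by_year.agg(lambda s: s.mode().iloc[0])

# Every tied mode per year, kept as a list
all_modes = by_year.apply(lambda s: s.mode().tolist())

print(first_mode)
print(all_modes)
```

Keeping the full list makes ties visible: for 2011 every value appears once, so all four values are modes.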

Source

2017-01-02 17:33:07 3novak
