pandas DataFrame: find the longest run of consecutive rows that meet a condition

I have a pandas DataFrame called 'df' that looks like this:

   A 
2015-05-01 True 
2015-05-02 True 
2015-05-03 False 
2015-05-04 False 
2015-05-05 False 
2015-05-06 False 
2015-05-07 True 
2015-05-08 False 
2015-05-09 False

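I would like to return the slice covering the longest run of consecutive rows in which column 'A' is False. Can this be done?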

Source

2016-10-16 Runner Bean

You can use cumsum on column A to detect where the value changes, because booleans can be summed in Python (True counts as 1, False as 0).

import pandas as pd

# Test data
df = pd.DataFrame([True, True, False, False, False, False, True, False, False],
                  index=pd.to_datetime(['2015-05-01', '2015-05-02', '2015-05-03',
                                        '2015-05-04', '2015-05-05', '2015-05-06',
                                        '2015-05-07', '2015-05-08', '2015-05-09']),
                  columns=['A'])

# We have to ensure that the index is sorted 
df.sort_index(inplace=True) 
# Resetting the index to create a column 
df.reset_index(inplace=True) 

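# Grouping by the cumsum and counting the number of dates and getting their min and max
# (each True row increments the cumsum, so it starts a new group that also
# contains the False rows following it)
df = df.groupby(df['A'].cumsum()).agg(
    {'index': ['count', 'min', 'max']})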

# Removing useless column level 
df.columns = df.columns.droplevel() 

print(df) 
#    count        min        max
# A
# 1      1 2015-05-01 2015-05-01
# 2      5 2015-05-02 2015-05-06
# 3      3 2015-05-07 2015-05-09

# Getting the max 
df[df['count']==df['count'].max()] 

#    count        min        max
# A
# 2      5 2015-05-02 2015-05-06
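
Note that each group above begins with the True row that opens it, so the count of 5 includes 2015-05-02. If only the False rows are wanted, one possible variation is to label every run of identical values and then keep the longest False run. This is a minimal sketch, assuming the data contains at least one False row; the names run_id, false_runs and longest_false are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [True, True, False, False, False, False, True, False, False]},
    index=pd.date_range("2015-05-01", periods=9, freq="D"),
)

# A new run starts wherever the value differs from the previous row,
# so the cumulative sum of those change points gives each run its own id.
run_id = df["A"].ne(df["A"].shift()).cumsum()

# Keep the run ids of the False rows only and count the rows in each run.
false_runs = run_id[~df["A"]]
run_lengths = false_runs.groupby(false_runs).size()

# idxmax returns the first run id with the largest count,
# i.e. the earliest run if several runs share the maximum length.
longest_run = run_lengths.idxmax()

# Slice of the original frame covering the longest run of False values.
longest_false = df[run_id == longest_run]
print(longest_false)
```

Because the slice is taken from the original frame, the date index is preserved and no reset_index call is needed in this variation.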

Source

2016-10-16 08:44:06 Romain

Great, but I can't use 'date' as the key, and when I try df.index instead I get TypeError: unhashable type: 'DatetimeIndex'

My df is actually a member of an object called sData, which holds a DataFrame as its member 'df'. So if I try 'df.index' or 'sData.df.index' I get an error, and if I try 'index' I also get an error. I don't know what I should write in place of 'index' in the agg function.

@RunnerBean As shown in my example, you need to reset the index first with 'df.reset_index(inplace=True)'. – Romain
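
The TypeError mentioned in the comments is consistent with passing the index object itself as a key of the agg dictionary: dictionary keys must be hashable, and agg expects a column name. After reset_index, the former index becomes an ordinary column named after the index, or 'index' if the index has no name. The sketch below assumes an index named 'date' (a hypothetical name, as are index_col, flat and summary) and looks the column name up instead of hard-coding it.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [True, True, False, False, False, False, True, False, False]},
    # "date" is an assumed index name, used here only for illustration.
    index=pd.date_range("2015-05-01", periods=9, freq="D", name="date"),
)

# reset_index() names the new column after the index,
# falling back to "index" when the index is unnamed.
index_col = df.index.name or "index"

# Work on a sorted copy so the original frame is left untouched.
flat = df.sort_index().reset_index()

# Select the former index column by name, then aggregate per cumsum group.
summary = flat.groupby(flat["A"].cumsum())[index_col].agg(["count", "min", "max"])
print(summary)
```

Selecting the column before calling agg also yields flat column labels, so the droplevel step is not required in this form.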
