2016-07-25 13 views
2

Pandas filtering by date and condition

I am using pandas to try to get a count of members who purchased specific contract types between two dates. I am working with a DataFrame similar to this:

Member Nbr  Contract-Type Date-Joined 
20   1 Year Membership  2011-08-01 
3128  3 Month Membership  2011-07-22 
3535  4 Month Membership  2015-02-18 
3760  4 Month Membership  2010-02-28 
3762  3 Month Membership  2010-01-31 
3882  1 Month Membership  2010-04-24  
3892  3 Month Membership  2010-03-24  
4116  3 Month Membership  2014-12-02 
4700  1 Month Membership  2014-11-11 
4802  4 Month Membership  2014-07-26 
5004   1 Year Membership  2012-03-12 
5020   1 Year Membership  2010-07-28  
5022  3 Month Membership  2010-06-25  
5130   1 Year Membership  2011-01-04 
         ... 

When there is only one contract type I am interested in, I can get the count with the following code:

print(len(df[(df['Date-Joined'] > '2010-01-01') 
      & (df['Date-Joined'] < '2012-02-01') 
      & (df['Member Type'] == '1 Year Membership')])) 

When I try something similar that specifies either 1 Year Membership or 4 Month Membership:

print(len(df[(df['Date-Joined'] > '2013-01-01') 
     & (df['Date-Joined'] < '2013-02-01') 
     & (df['Member Type'] == '1 Year Membership') 
     or (df['Member Type'] == '4 Month Membership')])) 

Specifying 1 Year Membership or 4 Month Membership this way raises the following error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). 

Replacing the or condition with an & condition returns 0.

+5

Use '|' instead of 'or'.

+1

Also, '&' has higher precedence than '|', so your logic needs one more pair of parentheses.

Answer

4

Use | instead of or. Also, & has higher precedence than |, so your logic needs one more pair of parentheses around the two contract-type comparisons.
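To make both points concrete, here is a minimal sketch on three toy boolean Series. The names in_range, is_year and is_four_month are hypothetical stand-ins for the question's three conditions, not the real data.

```python
import pandas as pd

# Hypothetical toy masks standing in for the question's conditions
in_range = pd.Series([True, False, False])
is_year = pd.Series([True, False, False])
is_four_month = pd.Series([False, True, False])

# `or` asks Python for bool(Series), which pandas refuses to answer
try:
    in_range or is_four_month
    raised = False
except ValueError:
    raised = True

# & binds tighter than |, so this is (in_range & is_year) | is_four_month
ungrouped = in_range & is_year | is_four_month

# Explicit parentheses apply the range check to both contract types
grouped = in_range & (is_year | is_four_month)

print(raised)              # True
print(ungrouped.tolist())  # [True, True, False]
print(grouped.tolist())    # [True, False, False]
```

The second row is out of range but still passes the ungrouped mask, which is the same effect as the BEWARE rows in the output for the real data.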

import io
import pandas as pd

# Rebuild the question's sample data as an in-memory CSV
data = io.StringIO('''\
Member Nbr,Contract-Type,Date-Joined
20,1 Year Membership,2011-08-01
3128,3 Month Membership,2011-07-22
3535,4 Month Membership,2015-02-18
3760,4 Month Membership,2010-02-28
3762,3 Month Membership,2010-01-31
3882,1 Month Membership,2010-04-24
3892,3 Month Membership,2010-03-24
4116,3 Month Membership,2014-12-02
4700,1 Month Membership,2014-11-11
4802,4 Month Membership,2014-07-26
5004,1 Year Membership,2012-03-12
5020,1 Year Membership,2010-07-28
5022,3 Month Membership,2010-06-25
5130,1 Year Membership,2011-01-04
''')

# Date-Joined is read as text; ISO-formatted (YYYY-MM-DD) strings still compare in date order
df = pd.read_csv(data)

# One contract type: chaining the three conditions with & works as is
print(df[
    (df['Date-Joined'] > '2010-01-01') &
    (df['Date-Joined'] < '2012-02-01') &
    (df['Contract-Type'] == '1 Year Membership')
    ])

#     Member Nbr      Contract-Type Date-Joined
# 0           20  1 Year Membership  2011-08-01
# 11        5020  1 Year Membership  2010-07-28
# 13        5130  1 Year Membership  2011-01-04

# Two contract types without extra parentheses: & is evaluated before |,
# so every 4 Month Membership row is kept regardless of the date range
print(df[
    (df['Date-Joined'] > '2010-01-01') &
    (df['Date-Joined'] < '2012-02-01') &
    (df['Contract-Type'] == '1 Year Membership') |
    (df['Contract-Type'] == '4 Month Membership')
    ])

#     Member Nbr       Contract-Type Date-Joined
# 0           20   1 Year Membership  2011-08-01
# 2         3535  4 Month Membership  2015-02-18 <====== BEWARE!
# 3         3760  4 Month Membership  2010-02-28
# 9         4802  4 Month Membership  2014-07-26 <====== BEWARE!
# 11        5020   1 Year Membership  2010-07-28
# 13        5130   1 Year Membership  2011-01-04

# Wrap the | comparison in its own parentheses so the date range
# applies to both contract types
print(df[
    (df['Date-Joined'] > '2010-01-01') &
    (df['Date-Joined'] < '2012-02-01') &
    ((df['Contract-Type'] == '1 Year Membership') |
     (df['Contract-Type'] == '4 Month Membership'))
    ])

#     Member Nbr       Contract-Type Date-Joined
# 0           20   1 Year Membership  2011-08-01
# 3         3760  4 Month Membership  2010-02-28
# 11        5020   1 Year Membership  2010-07-28
# 13        5130   1 Year Membership  2011-01-04
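
Since the question asks for a count rather than the rows themselves, one possible variation (a sketch, not part of the original answer) is to parse the dates on load, test membership with Series.isin, and sum the boolean mask. The names wanted, mask and count are only illustrative.

```python
import io
import pandas as pd

data = io.StringIO('''\
Member Nbr,Contract-Type,Date-Joined
20,1 Year Membership,2011-08-01
3128,3 Month Membership,2011-07-22
3535,4 Month Membership,2015-02-18
3760,4 Month Membership,2010-02-28
3762,3 Month Membership,2010-01-31
3882,1 Month Membership,2010-04-24
3892,3 Month Membership,2010-03-24
4116,3 Month Membership,2014-12-02
4700,1 Month Membership,2014-11-11
4802,4 Month Membership,2014-07-26
5004,1 Year Membership,2012-03-12
5020,1 Year Membership,2010-07-28
5022,3 Month Membership,2010-06-25
5130,1 Year Membership,2011-01-04
''')

# Parse Date-Joined into real datetimes instead of comparing text
df = pd.read_csv(data, parse_dates=['Date-Joined'])

# isin replaces the chain of == comparisons joined by |,
# so no extra parentheses are needed around the contract types
wanted = ['1 Year Membership', '4 Month Membership']
mask = (
    (df['Date-Joined'] > pd.Timestamp('2010-01-01'))
    & (df['Date-Joined'] < pd.Timestamp('2012-02-01'))
    & df['Contract-Type'].isin(wanted)
)

# True counts as 1, so summing the mask gives the number of matching rows
count = int(mask.sum())
print(count)  # 4
```

Adding another contract type then only means extending the wanted list.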