2017-03-17 30 views

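I am trying to extract the month from a date field so that I can use it with pandas groupby in later operations. On line 40 I apply dateutil to parse the year, month, and day, and it fails with AttributeError: 'Timestamp' object has no attribute 'read'.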

My code:

df = pandas.DataFrame.from_records(defects, columns=headers) 
df['date'] = pandas.to_datetime(df['date'], format="%Y-%m-%d") 
df['date'] = df['date'].apply(dateutil.parser.parse, yearfirst=True) 
.... 
print df.groupby(['month']).groups.keys() 

And this is what I get:

Traceback (most recent call last): 
File "jira-sandbox.py", line 40, in <module> 
defects_df['created'] = defects_df['created'].apply(dateutil.parser.parse, yearfirst=True) 
    File "/Library/Python/2.7/site-packages/pandas/core/series.py", line 2294, in apply 
    mapped = lib.map_infer(values, f, convert=convert_dtype) 
    File "pandas/src/inference.pyx", line 1207, in pandas.lib.map_infer (pandas/lib.c:66124) 
    File "/Library/Python/2.7/site-packages/pandas/core/series.py", line 2282, in <lambda> 
    f = lambda x: func(x, *args, **kwds) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/dateutil/parser.py", line 697, in parse 
    return DEFAULTPARSER.parse(timestr, **kwargs) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/dateutil/parser.py", line 301, in parse 
    res = self._parse(timestr, **kwargs) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/dateutil/parser.py", line 349, in _parse 
    l = _timelex.split(timestr) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/dateutil/parser.py", line 143, in split 
    return list(cls(s)) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/dateutil/parser.py", line 137, in next 
    token = self.get_token() 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/dateutil/parser.py", line 68, in get_token 
    nextchar = self.instream.read(1) 
AttributeError: 'Timestamp' object has no attribute 'read' 

Answer


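I don't think you need the dateutil step at all. After the pandas.to_datetime() call, the column already holds datetime values. dateutil.parser.parse expects a string or a file-like object, so when it is handed a Timestamp it tries to call .read() on it, which is where the error comes from. Here is one way to create a column that you can use with groupby().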

Code:

# build a test dataframe 
import datetime as dt 
import pandas as pd 
df = pd.DataFrame([dt.datetime.now() + dt.timedelta(days=x*15) 
        for x in range(10)], 
        columns=['date']) 
print(df) 

# add a year/month column (e.g. 201703) to allow grouping 
df['month'] = df.date.apply(lambda x: x.year * 100 + x.month) 

# show a groupby 
print(df.groupby(['month']).groups.keys()) 

Result:

     date 
0 2017-03-17 14:30:24.344 
1 2017-04-01 14:30:24.344 
2 2017-04-16 14:30:24.344 
3 2017-05-01 14:30:24.344 
4 2017-05-16 14:30:24.344 
5 2017-05-31 14:30:24.344 
6 2017-06-15 14:30:24.344 
7 2017-06-30 14:30:24.344 
8 2017-07-15 14:30:24.344 
9 2017-07-30 14:30:24.344 

[201704, 201705, 201706, 201707, 201703] 
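As a variation on the approach above, the same year/month key can probably be built without apply() by using the vectorized .dt accessor that pandas provides on datetime columns. This is only a minimal sketch: the sample dates are made up for illustration and stand in for the real `defects` records, which are not shown in the question.

```python
import pandas as pd

# Hypothetical sample data; the asker's real records are not shown.
df = pd.DataFrame(
    {"date": ["2017-03-17", "2017-04-01", "2017-04-16", "2017-05-01"]}
)

# After to_datetime the column is datetime64, so no dateutil parsing is needed.
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")

# Same year * 100 + month key as in the answer, computed column-wise via .dt.
df["month"] = df["date"].dt.year * 100 + df["date"].dt.month

# Count rows per month; groupby sorts the keys by default.
counts = df.groupby("month").size()
month_keys = counts.index.tolist()
print(month_keys)  # [201703, 201704, 201705]
```

The integer key keeps months from different years apart, which a bare df["date"].dt.month would not.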