str to int type error while converting datetime to Unix time epoch

I am trying to convert a datetime to Unix epoch time, but I get the following error.

Input:

userid,datetime,latitude,longitude 
156,2014-02-01 00:00:00.739166+01,41.8836718276551,12.4877775603346 
187,2014-02-01 00:00:01.148457+01,41.9285433333333,12.4690366666667 
297,2014-02-01 00:00:01.220066+01,41.8910686119733,12.4927045625339 
89,2014-02-01 00:00:01.470854+01,41.7931766914244,12.4321219603157 
79,2014-02-01 00:00:01.631136+01,41.90027472,12.46274618 
191,2014-02-01 00:00:02.048546+01,41.8523047579646,12.5774065771898 
343,2014-02-01 00:00:02.647839+01,41.8921718255185,12.4696996165151 
341,2014-02-01 00:00:02.709888+01,41.9102125627332,12.4770004336041 
260,2014-02-01 00:00:03.458195+01,41.8658208551143,12.4655221109313

Program (this worked well under Python 2.7, but since I upgraded to Python 3.x with Anaconda I can no longer get a result):

import pandas as pd 
import numpy as np 
import io 

df = pd.read_csv('input.csv', 
       #header=None, #no header in csv 
       header=['userid','datetime','latitude','longitude'], #set custom column names 
       parse_dates=['datetime']) #parse columns d, e to datetime 

df['datetime'] = df['datetime'].astype(np.int64) // 10**9 
#df['e'] = df['e'].astype(np.int64) // 10**9 

df.to_csv('output.csv', header=True, index=False)

Error:

File "pandas\parser.pyx", line 519, in pandas.parser.TextReader.__cinit__ (pandas\parser.c:5907) 

TypeError: Can't convert 'int' object to str implicitly

Edit: input file here


2017-05-15 Sitz Blogz

In pd.read_csv, the header argument expects an int or a list of ints, not a list of strings.

import numpy as np
import pandas as pd
from io import StringIO
file=""" 
userid,datetime,latitude,longitude 
156,2014-02-01 00:00:00.739166+01,41.8836718276551,12.4877775603346 
187,2014-02-01 00:00:01.148457+01,41.9285433333333,12.4690366666667 
297,2014-02-01 00:00:01.220066+01,41.8910686119733,12.4927045625339 
89,2014-02-01 00:00:01.470854+01,41.7931766914244,12.4321219603157 
79,2014-02-01 00:00:01.631136+01,41.90027472,12.46274618 
191,2014-02-01 00:00:02.048546+01,41.8523047579646,12.5774065771898 
343,2014-02-01 00:00:02.647839+01,41.8921718255185,12.4696996165151 
341,2014-02-01 00:00:02.709888+01,41.9102125627332,12.4770004336041 
260,2014-02-01 00:00:03.458195+01,41.8658208551143,12.4655221109313"""

Let's try this read_csv statement:

df = pd.read_csv(StringIO(file),parse_dates=['datetime']) 
df['datetime'] = df['datetime'].astype(np.int64) // 10**9 

print(df.head())

Output:

userid datetime latitude longitude 
0  156 1391209200 41.883672 12.487778 
1  187 1391209201 41.928543 12.469037 
2  297 1391209201 41.891069 12.492705 
3  89 1391209201 41.793177 12.432122 
4  79 1391209201 41.900275 12.462746
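
The conversion above relies on `astype(np.int64) // 10**9`, which assumes the column is stored as nanoseconds since the epoch. As a hedged alternative (not part of the original answer), the sketch below subtracts the epoch and floor-divides by a one-second `Timedelta`, which does not depend on the storage resolution. The variable names `csv_text`, `stamps` and `epoch` are illustrative only, and the two rows are copied from the question's input.

```python
import io

import pandas as pd

# Two rows taken from the question's input, including the header line.
csv_text = """userid,datetime,latitude,longitude
156,2014-02-01 00:00:00.739166+01,41.8836718276551,12.4877775603346
187,2014-02-01 00:00:01.148457+01,41.9285433333333,12.4690366666667
"""

df = pd.read_csv(io.StringIO(csv_text))

# Parse the strings and normalise the +01 offset to UTC.
stamps = pd.to_datetime(df["datetime"], utc=True)

# Seconds since the epoch: subtract the epoch, then floor-divide by one second.
epoch = pd.Timestamp("1970-01-01", tz="UTC")
df["datetime"] = (stamps - epoch) // pd.Timedelta("1s")

print(df)
```

The first two values should match the `datetime` column in the output shown above.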


2017-05-15 05:03:58

Thank you. I get the following error with the same input from the CSV file: ValueError: 'datetime' is not in list.

Can you paste the first 3 lines of the CSV file here?

I have given the input file as a link from which it can be downloaded.

If the CSV file has no header, you need the names parameter, plus parse_dates=[1] to parse the second column as datetime:

import pandas as pd 
import numpy as np 
from io import StringIO  # pandas.compat.StringIO is not available in current pandas

temp=u"""156,2014-02-01 00:00:00.739166+01,41.8836718276551,12.4877775603346 
187,2014-02-01 00:00:01.148457+01,41.9285433333333,12.4690366666667 
297,2014-02-01 00:00:01.220066+01,41.8910686119733,12.4927045625339 
89,2014-02-01 00:00:01.470854+01,41.7931766914244,12.4321219603157 
79,2014-02-01 00:00:01.631136+01,41.90027472,12.46274618 
191,2014-02-01 00:00:02.048546+01,41.8523047579646,12.5774065771898 
343,2014-02-01 00:00:02.647839+01,41.8921718255185,12.4696996165151 
341,2014-02-01 00:00:02.709888+01,41.9102125627332,12.4770004336041 
260,2014-02-01 00:00:03.458195+01,41.8658208551143,12.4655221109313""" 
#after testing, replace 'StringIO(temp)' with 'filename.csv' 
df = pd.read_csv(StringIO(temp), 
       parse_dates=[1], 
       names=['userid','datetime','latitude','longitude']) 
#print (df) 

#check dtypes if datetime it is OK 
print (df['datetime'].dtypes) 
datetime64[ns]

df['datetime'] = df['datetime'].astype(np.int64) // 10**9 
print (df) 
    userid datetime latitude longitude 
0  156 1391209200 41.883672 12.487778 
1  187 1391209201 41.928543 12.469037 
2  297 1391209201 41.891069 12.492705 
3  89 1391209201 41.793177 12.432122 
4  79 1391209201 41.900275 12.462746 
5  191 1391209202 41.852305 12.577407 
6  343 1391209202 41.892172 12.469700 
7  341 1391209202 41.910213 12.477000 
8  260 1391209203 41.865821 12.465522
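
As a sanity check on the numbers above, the first value can be reproduced with the standard library alone. This is only an illustrative sketch (the names `local`, `epoch_seconds` and `back` are not from the original answer): it builds the first timestamp with its +01:00 offset by hand and converts it to epoch seconds and back.

```python
from datetime import datetime, timedelta, timezone

# First row of the sample: 2014-02-01 00:00:00.739166 at UTC+01:00.
local = datetime(2014, 2, 1, 0, 0, 0, 739166,
                 tzinfo=timezone(timedelta(hours=1)))

# timestamp() returns seconds since the epoch as a float; int() drops the fraction.
epoch_seconds = int(local.timestamp())
print(epoch_seconds)

# Converting back shows the same instant expressed in UTC.
back = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
print(back.isoformat())
```

The result is one hour earlier than the wall-clock time in the file because the +01 offset is removed when converting to UTC.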

Another possible problem is bad data, as in the second row of my sample:

import pandas as pd 
from io import StringIO  # pandas.compat.StringIO is not available in current pandas

temp=u"""156,2014-02-01 00:00:00.739166+01,41.8836718276551,12.4877775603346 
187,1014-02-01 00:00:01.148457+01,41.9285433333333,12.4690366666667 
297,2014-02-01 00:00:01.220066+01,41.8910686119733,12.4927045625339 
89,2014-02-01 00:00:01.470854+01,41.7931766914244,12.4321219603157 
79,2014-02-01 00:00:01.631136+01,41.90027472,12.46274618 
191,2014-02-01 00:00:02.048546+01,41.8523047579646,12.5774065771898 
343,2014-02-01 00:00:02.647839+01,41.8921718255185,12.4696996165151 
341,2014-02-01 00:00:02.709888+01,41.9102125627332,12.4770004336041 
260,2014-02-01 00:00:03.458195+01,41.8658208551143,12.4655221109313""" 
#after testing, replace 'StringIO(temp)' with 'filename.csv' 
df = pd.read_csv(StringIO(temp), 
       parse_dates=[1], 
       names=['userid','datetime','latitude','longitude']) 

#print (df) 

#check dtypes - parse failed, get object dtype 
print (df['datetime'].dtypes) 
object

Parse with to_datetime and the parameter errors='coerce' to get a datetime dtype. It replaces bad data with NaT, and fillna then replaces those NaT values with 0 (1970-01-01 00:00:00.000000):

df['datetime'] = pd.to_datetime(df['datetime'], errors='coerce').fillna(0) 
print (df) 
    userid     datetime latitude longitude 
0  156 2014-01-31 23:00:00.739166 41.883672 12.487778 
1  187 1970-01-01 00:00:00.000000 41.928543 12.469037 
2  297 2014-01-31 23:00:01.220066 41.891069 12.492705 
3  89 2014-01-31 23:00:01.470854 41.793177 12.432122 
4  79 2014-01-31 23:00:01.631136 41.900275 12.462746 
5  191 2014-01-31 23:00:02.048546 41.852305 12.577407 
6  343 2014-01-31 23:00:02.647839 41.892172 12.469700 
7  341 2014-01-31 23:00:02.709888 41.910213 12.477000 
8  260 2014-01-31 23:00:03.458195 41.865821 12.465522 


df['datetime'] = df['datetime'].astype(np.int64) // 10**9 
print (df) 
    userid datetime latitude longitude 
0  156 1391209200 41.883672 12.487778 
1  187   0 41.928543 12.469037 
2  297 1391209201 41.891069 12.492705 
3  89 1391209201 41.793177 12.432122 
4  79 1391209201 41.900275 12.462746 
5  191 1391209202 41.852305 12.577407 
6  343 1391209202 41.892172 12.469700 
7  341 1391209202 41.910213 12.477000 
8  260 1391209203 41.865821 12.465522
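
The `fillna(0)` step above reflects pandas behaviour at the time of the answer; newer releases may not treat an integer fill value on a datetime column the same way. A minimal sketch of the same idea, assuming it is acceptable to fill with an explicit epoch `Timestamp` instead, follows. The second input value is a hypothetical malformed string chosen for illustration, not data from the question.

```python
import pandas as pd

raw = pd.Series([
    "2014-02-01 00:00:00.739166+01",
    "not-a-date",  # hypothetical malformed value, stands in for bad data
])

# errors="coerce" turns unparseable entries into NaT; utc=True normalises offsets.
stamps = pd.to_datetime(raw, errors="coerce", utc=True)

# Fill NaT with the epoch itself so the bad rows end up as 0 seconds.
epoch = pd.Timestamp("1970-01-01", tz="UTC")
filled = stamps.fillna(epoch)

# Seconds since the epoch, independent of the column's storage resolution.
seconds = (filled - epoch) // pd.Timedelta("1s")
print(seconds.tolist())
```

Filling with 0 makes bad rows indistinguishable from a genuine 1970-01-01 timestamp, so checking `stamps.isna()` before filling may be worthwhile.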

EDIT:

If the file also has a header and you need to replace the column names, add header=0 to read_csv.
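
A minimal sketch of that case, assuming a file whose first line is a header that should be discarded in favour of custom names. The replacement names `user`, `ts`, `lat` and `lon` are made up for illustration, and the single data row is shortened from the question's input.

```python
import io

import pandas as pd

# The first line is an existing header that we want to replace.
csv_text = (
    "userid,datetime,latitude,longitude\n"
    "156,2014-02-01 00:00:00.739166+01,41.88,12.48\n"
)

# header=0 marks row 0 as the header to drop; names supplies the new labels.
df = pd.read_csv(
    io.StringIO(csv_text),
    header=0,
    names=["user", "ts", "lat", "lon"],
)

print(df.columns.tolist())
print(len(df))
```

Without `header=0`, passing `names` alone would make pandas treat the original header line as a data row.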


2017-05-15 05:35:32 jezrael

Thank you, this is great! But I can only accept one answer. Do you think you could help me with this problem? I feel that moving from 2.x to 3.x brings a lot of changes. http://stackoverflow.com/questions/43970972/typeerror-unsupported-operand-types-for-str-and-str-in-python-3-x-anac/43971336#43971336

Yes, which answer to accept is up to you. – jezrael

About the second problem: which line of code returns the error? – jezrael
