2017-11-08

Python 3 does not recognize the vertical bar character

I have the code below, but Python 3 does not seem to recognize the vertical pipe as a Unicode character.

    m_cols = ['movie_id', 'title', 'release_date',
              'video_release_date', 'imdb_url']

    movies = pd.read_csv(
        'http://files.grouplens.org/datasets/movielens/ml-100k/u.item',
        sep='|', names=m_cols, usecols=range(5))

    movies.head()

What could be the reason behind this, and how can I solve the problem? The traceback is:

    UnicodeDecodeError                        Traceback (most recent call last)
    pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._convert_tokens (pandas\_libs\parsers.c:14858)()

    pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._convert_with_dtype (pandas\_libs\parsers.c:17119)()

    pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._string_convert (pandas\_libs\parsers.c:17347)()

    pandas\_libs\parsers.pyx in pandas._libs.parsers._string_box_utf8 (pandas\_libs\parsers.c:23041)()

    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 3: invalid continuation byte

    During handling of the above exception, another exception occurred:

    UnicodeDecodeError                        Traceback (most recent call last)
    <ipython-input-15-72a8222212c1> in <module>()
          4 movies = pd.read_csv(
          5     'http://files.grouplens.org/datasets/movielens/ml-100k/u.item',
    ----> 6     sep='|', names=m_cols, usecols=range(5))
          7
          8 movies.head()

I get the error above on Python 3.


Possibly related: https://stackoverflow.com/questions/28947607/ascii-codec-cant-decode-byte-0xe9 – svgrafov
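The error message itself can be reproduced without pandas, which may help show that the pipe separator is not the culprit. The snippet below is only a sketch: the sample title is made up for illustration and is not claimed to be the row that triggered the traceback. It encodes a string containing "é" as Latin-1, where that character is the single byte 0xe9, and then tries to decode the bytes as UTF-8.

```python
# Made-up sample title (an assumption, not taken from the traceback).
# In Latin-1, "é" is stored as the single byte 0xE9.
raw = "Les Misérables (1995)".encode("latin-1")

utf8_error = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # In UTF-8, 0xE9 announces a three-byte sequence, but the next byte
    # is a plain ASCII letter, so decoding fails.
    utf8_error = exc
    print(exc)

# Decoding with the encoding the bytes were written in works.
decoded = raw.decode("latin-1")
print(decoded)
```

The same thing presumably happens inside `read_csv` when it decodes the title column as UTF-8.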

Answers


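Use encoding="latin-1". The file is not UTF-8 encoded, so pandas' default UTF-8 decoding fails on byte 0xe9, which is "é" in Latin-1: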

    In [9]: movies = pd.read_csv(
                'http://files.grouplens.org/datasets/movielens/ml-100k/u.item',
                sep='|', names=m_cols, usecols=range(5), header=None, encoding="latin-1")

    In [10]: movies.head()
    Out[10]:
       movie_id              title  release_date  video_release_date  \
    0         1   Toy Story (1995)   01-Jan-1995                 NaN
    1         2   GoldenEye (1995)   01-Jan-1995                 NaN
    2         3  Four Rooms (1995)   01-Jan-1995                 NaN
    3         4  Get Shorty (1995)   01-Jan-1995                 NaN
    4         5     Copycat (1995)   01-Jan-1995                 NaN

                                                imdb_url
    0  http://us.imdb.com/M/title-exact?Toy%20Story%2...
    1  http://us.imdb.com/M/title-exact?GoldenEye%20(...
    2  http://us.imdb.com/M/title-exact?Four%20Rooms%...
    3  http://us.imdb.com/M/title-exact?Get%20Shorty%...
    4  http://us.imdb.com/M/title-exact?Copycat%20(1995)
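
If the encoding of a file is not known in advance, one possible approach is to try a short list of candidates on the raw bytes before handing the file to pandas. The helper below is a hypothetical sketch, not part of pandas: the name `first_working_encoding` and the sample rows are made up for illustration. Note that Latin-1 can decode any byte sequence, so it only makes sense as the last candidate, and a successful decode does not prove the text is correct.

```python
def first_working_encoding(raw, candidates=("utf-8", "latin-1")):
    """Return the first name in candidates that decodes raw without error.

    Hypothetical helper for illustration. Keep "latin-1" last: it never
    raises, so any candidate listed after it would never be tried.
    """
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    raise ValueError("none of the candidate encodings could decode the data")


# Made-up pipe-separated rows in the same layout as u.item (an assumption).
sample = "1|Toy Story (1995)|01-Jan-1995\n2|Café Society (1995)|01-Jan-1995\n"

print(first_working_encoding(sample.encode("latin-1")))  # latin-1
print(first_working_encoding(sample.encode("utf-8")))    # utf-8
```

The returned name could then be passed as the `encoding=` argument of `pd.read_csv`, in the same way as the literal "latin-1" above.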