Python: How to read lines from a file that has two line breaks between lines

I am trying to read a file in the format shown below. There are two '\n' line breaks between each line:

Great tool for healing your life--if you are ready to change your beliefs!<br /><a href="http


Bought this book for a friend. I read it years ago and it is one of those books you keep forever. Love it!


I read this book many years ago and have heard Louise Hay speak a couple of times. It is a valuable read...

I am using the Python code below to read the lines and convert them into a DataFrame:

import pandas as pd

open_reviews = open("C:\\Downloads\\review_short.txt", "r", encoding="Latin-1").read()
documents = []
for r in open_reviews.split('\n\n'):
    documents.append(r)

df = pd.DataFrame(documents)
print(df.head())

The output I am getting is as follows:

0 I was very inspired by Louise's Hay approach t...
1 \n You Can Heal Your Life by
2 \n I had an older version
3 \n I love Louise Hay and
4 \n I thought the book was exellent

Because I split on two (\n), a \n ends up at the beginning of each row. Is there another way to handle this so that I get output like the following?

0 I was very inspired by Louise's Hay approach t...
1 You Can Heal Your Life by
2 I had an older version
3 I love Louise Hay and
4 I thought the book was exellent

Source

2016-04-16 Sriram Chandramouli

Try using the .strip() method. It removes unwanted whitespace characters, including newlines, from the beginning and end of a string.

You can use it like this:

for r in open_reviews.split('\n\n'):
    documents.append(r.strip())
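A self-contained sketch of the same idea, assuming a small in-memory sample in place of the real file. The name `raw_text` and its contents are hypothetical. It also drops chunks that are empty after stripping, which the loop above does not do.

```python
# Hypothetical sample standing in for the contents of review_short.txt.
raw_text = (
    "Great tool for healing your life\n\n"
    "\n Bought this book for a friend. \n\n"
    "I read this book many years ago\n\n"
)

# Split on blank lines, strip each chunk, and drop chunks that end up empty.
documents = [chunk.strip() for chunk in raw_text.split("\n\n") if chunk.strip()]

for index, text in enumerate(documents):
    print(index, text)
```

The resulting `documents` list can be passed to `pd.DataFrame` exactly as in the question.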

Source

2016-04-16 19:29:37 colelemonz

Thank you. That worked.

Use readlines() and strip() to clean up the lines.

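filename = "C:\\Downloads\\review_short.txt"
documents = []
# The with block closes the file even if an error occurs.
with open(filename, "r", encoding="Latin-1") as open_reviews:
    for r in open_reviews.readlines():
        r = r.strip()  # remove surrounding spaces and the trailing \n
        if r:  # skip lines that are empty after stripping
            documents.append(r)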
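The readlines() approach can also be wrapped in a small helper and tried against a throwaway file. This is only a sketch: `read_nonblank_lines` and the sample text are hypothetical names and data, and the temporary file stands in for review_short.txt.

```python
import os
import tempfile


def read_nonblank_lines(path, encoding="latin-1"):
    """Return the stripped, non-empty lines of the file at `path`."""
    with open(path, "r", encoding=encoding) as handle:
        return [line.strip() for line in handle.readlines() if line.strip()]


# Write a small hypothetical sample file so the sketch can run anywhere.
sample = "First review\n\nSecond review\n\n  Third review  \n"
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="latin-1"
) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

documents = read_nonblank_lines(sample_path)
os.remove(sample_path)  # clean up the temporary file
print(documents)
```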

Source

2016-04-16 19:32:04 iurisilvio

This appends every line that is not blank.

filename = "..."
lines = []
with open(filename) as f:
    for line in f:
        line = line.strip()
        if line:
            lines.append(line)

>>> lines 
['Great tool for healing your life--if you are ready to change your beliefs!<br /><a href="http', 
'Bought this book for a friend. I read it years ago and it is one of those books you keep forever. Love it!', 
'I read this book many years ago and have heard Louise Hay speak a couple of times. It is a valuable read...'] 

import pandas as pd
lines = pd.DataFrame(lines, columns=['my_text'])
>>> lines 
              my_text 
0 Great tool for healing your life--if you are r... 
1 Bought this book for a friend. I read it years... 
2 I read this book many years ago and have heard...

Source

2016-04-16 19:46:43 Alexander

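This answer is longer, but it is preferable because the for loop over the file does not read the whole file at once. Calling 'readlines()' (or 'read()', as in the question) loads the entire file into memory. – nighthawk454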

Still, this is the better approach when you do not know the size of the file. – WoodChopper

Thanks, that worked for me.
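The point in the comments about memory can be illustrated with a generator that yields one cleaned line at a time. This is a sketch under assumptions: `iter_nonblank_lines` is a hypothetical helper, and `io.StringIO` with made-up text stands in for an open file.

```python
import io
from itertools import islice


def iter_nonblank_lines(stream):
    """Yield stripped, non-empty lines from `stream` one at a time."""
    for line in stream:
        line = line.strip()
        if line:
            yield line


# io.StringIO stands in for an open file; the text is a hypothetical sample.
stream = io.StringIO("one\n\ntwo\n\nthree\n\nfour\n")

# Take only the first two non-blank lines; the rest has not been consumed.
first_two = list(islice(iter_nonblank_lines(stream), 2))
remaining = stream.read()
print(first_two)
print(repr(remaining))
```

Because the generator pulls lines only on demand, the unread part of the stream is still available afterwards. With readlines(), the full list of lines is built before the loop starts.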
