Writing a string's contents on one line with literal \n in it

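Below is my code for parsing some HTML. I need to save the output (the resulting HTML) as a single line that still contains the escaped character sequences (such as \n). Instead, I either get a representation from repr() that I can't use because of the surrounding single quotes, or the output is written across multiple lines like this, with the escape sequences interpreted: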

<section class="prog__container"> 
<span class="prog__sub">Title</span> 
<p>PEP 336 - Make None Callable</p> 
<span class="prog__sub">Description</span> 
<p> 
<p> 
<code> 
     None 
    </code> 
    should be a callable object that when called with any 
arguments has no side effect and returns 
    <code> 
     None 
    </code> 
    . 
    </p> 
</p> 
</section>

What I need (including the escape sequences):

<section class="prog__container">\n <span class="prog__sub">Title</span>\n <p>PEP 336 - Make None Callable</p>\n <span class="prog__sub">Description</span>\n <p>\n <p>\n <code>\n  None\n  </code>\n  should be a callable object that when called with any\n arguments has no side effect and returns\n  <code>\n  None\n  </code>\n  .\n </p>\n </p>\n </section>

My code:

import re
from bs4 import BeautifulSoup

# html holds the raw HTML string
soup = BeautifulSoup(html, "html.parser")

# drop the <div> and <a> wrappers but keep their children
for match in soup.findAll(['div']):
    match.unwrap()

for match in soup.findAll(['a']):
    match.unwrap()

html = soup.contents[0]
html = str(html)
html = html.splitlines(True)
html = " ".join(html)
html = re.sub(re.compile("\n"), "\\n", html)
html = repr(html) # my current solution works, but unusable

The above is my current solution, but the object representation is no good; I need a string representation. How can I achieve this?
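A likely reason the `re.sub` call in the code above appears to do nothing: when the replacement argument is a string, `re.sub` processes backslash escapes in it, so a replacement of backslash + n is turned back into a real newline. The sketch below uses a made-up sample string to show the effect and two possible ways around it; it is an illustration, not the asker's final code.

```python
import re

sample = "<p>\nhello\n</p>"  # made-up input for illustration

# The replacement "\\n" (backslash + n) is processed by re.sub as an
# escape and becomes a real newline again, so nothing changes.
unchanged = re.sub("\n", "\\n", sample)

# Doubling the backslash in a raw replacement yields a literal backslash + n.
escaped_with_re = re.sub("\n", r"\\n", sample)

# For a fixed substring, str.replace does no escape processing at all.
escaped_with_replace = sample.replace("\n", "\\n")

print(escaped_with_replace)
```

For a fixed substring like this, `str.replace` is probably the simpler choice, since no regular expression is needed.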


2017-01-12 lkdjf0293

Why not just use repr?

a = """this is the first line
this is the second line"""
print(repr(a))

or, if I understand your problem correctly and you want the exact output without the literal quotes:

print(repr(a).strip("'"))

Output (this gives you your HTML as a single string):

'this is the first line\nthis is the second line' 
this is the first line\nthis is the second line
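Stripping the quotes from the `repr()` output works for the string shown, but `repr()` switches to double quotes when the text itself contains a single quote, and `.strip("'")` then leaves those quotes in place. The sketch below contrasts that with a plain replacement; `escape_newlines` is a hypothetical helper name and the sample string is made up.

```python
def escape_newlines(text):
    # Replace each real newline with the two characters backslash + n.
    return text.replace("\n", "\\n")

tricky = "it's line one\nline two"  # made-up sample containing a single quote

# repr() picks double quotes here, so strip("'") removes nothing.
via_repr = repr(tricky).strip("'")

# The plain replacement never adds quotes in the first place.
via_replace = escape_newlines(tricky)

print(via_repr)
print(via_replace)
```

The replacement only touches newlines; other characters such as tabs or backslashes are left as they are, which may or may not be what a given use case needs.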


2017-01-12 15:38:19

This works. Accepting it as the simplest solution. – lkdjf0293

import bs4 

html = '''<section class="prog__container"> 
<span class="prog__sub">Title</span> 
<p>PEP 336 - Make None Callable</p> 
<span class="prog__sub">Description</span> 
<p> 
<p> 
<code> 
     None 
    </code> 
    should be a callable object that when called with any 
arguments has no side effect and returns 
    <code> 
     None 
    </code> 
    . 
    </p> 
</p> 
</section>''' 
soup = bs4.BeautifulSoup(html, 'lxml') 
str(soup)

Out:

'<html><body><section class="prog__container">\n<span class="prog__sub">Title</span>\n<p>PEP 336 - Make None Callable</p>\n<span class="prog__sub">Description</span>\n<p>\n</p><p>\n<code>\n  None\n  </code>\n  should be a callable object that when called with any\n arguments has no side effect and returns\n  <code>\n  None\n  </code>\n  .\n </p>\n</section></body></html>'

A more involved way is to output the HTML code to a document


2017-01-12 15:26:48

Thanks for your answer! I have the same problem here as with the `repr()` function, regarding the single quotes. – lkdjf0293
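The single quotes in the output above come from the interactive interpreter, which displays the `repr()` of the value; they are not part of the string itself. A minimal sketch, assuming the goal is to write the HTML to a file as one physical line: `write_one_line` is a hypothetical helper, the HTML is a shortened sample, and an in-memory `io.StringIO` stands in for a real file.

```python
import io

def write_one_line(text, stream):
    # Replace real newlines with backslash + n, then end with one newline.
    stream.write(text.replace("\n", "\\n") + "\n")

html = '<section class="prog__container">\n<p>PEP 336 - Make None Callable</p>\n</section>'

buffer = io.StringIO()  # stands in for a file opened with open(..., "w")
write_one_line(html, buffer)

# The written text has no surrounding quotes and only the final newline.
print(buffer.getvalue(), end="")
```

Because the string is written directly rather than displayed through `repr()`, no quote characters are added around it.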

from bs4 import BeautifulSoup 
import urllib.request 

r = urllib.request.urlopen('https://www.example.com') 
soup = BeautifulSoup(r.read(), 'html.parser') 
html = str(soup)

This gives you the HTML as a single string, with the lines separated by \n.


2017-01-12 15:53:25 wolfcubman
