BeautifulSoup returns a list for the "class" attribute, but a plain value for other attributes

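BeautifulSoup is handy for HTML parsing in Python, but I ran into one problem: BeautifulSoup returns a list for the "class" attribute, while it returns a plain value for other attributes.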

from bs4 import BeautifulSoup
tr = """
<table>
    <tr class="passed" id="row1"><td>t1</td></tr>
    <tr class="failed" id="row2"><td>t2</td></tr>
</table>
"""
table = BeautifulSoup(tr, "html.parser")
for row in table.findAll("tr"):
    print(row["class"])
    print(row["id"])

Result:

[u'passed'] 
row1 
[u'failed'] 
row2

Why is class returned as a list, while id is returned as a plain value?

This is beautifulsoup4-4.5.0 with Python 2.7.

Source

2016-07-26 Larry Cai

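class is a special multi-valued attribute in BeautifulSoup:

HTML 4 defines a few attributes that can have multiple values. HTML 5 removes a couple of them, but defines a few more. The most common multi-valued attribute is class (that is, a tag can have more than one CSS class).

Sometimes this is problematic to deal with, for instance when you want to apply a regular expression to the class attribute value as a whole: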

BeautifulSoup returns empty list when searching by compound class names

You can turn this behavior off by tweaking the tree builder, but I would not recommend doing it.
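For reference, here is a minimal sketch of what turning it off can look like. It assumes a newer Beautiful Soup release (4.8 or later) than the 4.5.0 in the question, where the constructor accepts a multi_valued_attributes keyword and passes it on to the tree builder. On older releases that keyword is not available. The variable names are only illustrative.

```python
from bs4 import BeautifulSoup

html = '<table><tr class="passed a b" id="row1"><td>t1</td></tr></table>'

# Default behaviour: the class attribute is split into a list of values.
default_row = BeautifulSoup(html, "html.parser").find("tr")
print(default_row["class"])  # ['passed', 'a', 'b'] on Python 3

# Assumption: bs4 >= 4.8. Passing multi_valued_attributes=None disables the
# splitting, so class comes back as one string, like any other attribute.
plain_row = BeautifulSoup(
    html, "html.parser", multi_valued_attributes=None
).find("tr")
print(plain_row["class"])  # passed a b
```

With the splitting disabled, any code that expects row["class"] to be a list (for example a membership test for a single class) has to split the string itself, which is one reason to leave the default alone.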

Source

2016-07-26 14:26:29 alecxe

Because an element can have more than one class. Consider this example:

from bs4 import BeautifulSoup

tr =""" 
<table> 
    <tr class="passed a b c" id="row1"><td>t1</td></tr> 
    <tr class="failed" id="row2"><td>t2</td></tr> 
</table> 
""" 
table = BeautifulSoup(tr,"html.parser") 
for row in table.findAll("tr"):
    print(row["class"])
    print(row["id"])

['passed', 'a', 'b', 'c'] 
row1 
['failed'] 
row2
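In practice the list form is easy to work with. The following is a small sketch, assuming Python 3 and using illustrative names (class_string, passed_ids, matching_ids), that tests for a single class and joins the list back into one string when the whole attribute value is needed, for example for a regular expression:

```python
import re
from bs4 import BeautifulSoup

html = """
<table>
    <tr class="passed a b c" id="row1"><td>t1</td></tr>
    <tr class="failed" id="row2"><td>t2</td></tr>
</table>
"""
table = BeautifulSoup(html, "html.parser")

# Membership test: does the row carry the "passed" class at all?
passed_ids = [
    row["id"] for row in table.find_all("tr")
    if "passed" in row.get("class", [])
]

def class_string(tag):
    """Join the class list back into a single space-separated string."""
    return " ".join(tag.get("class", []))

# Match against the class value as a whole: starts with "passed", ends with "c".
pattern = re.compile(r"^passed\b.*\bc$")
matching_ids = [
    row["id"] for row in table.find_all("tr")
    if pattern.search(class_string(row))
]

print(passed_ids)
print(matching_ids)
```

Using row.get("class", []) instead of row["class"] keeps the loop from raising a KeyError on rows that have no class attribute.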

Source

2016-07-26 14:26:32 DeepSpace

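Thanks for the quick answers. From the accepted answer by @alecxe, I learned that `class` is a special attribute in HTML and BS4.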
