BeautifulSoup nth-of-type NotImplementedError

I am new to Python. I am trying to implement a web scraper to collect some survey data. I am trying to use the nth-of-type CSS selector (since it is the only pseudo-class BeautifulSoup supports) to select all of the Average Score elements (you will see them if you visit the survey page). I tested the selector at http://jsfiddle.net/3Ycu9/, and I am using only nth-of-type and an attribute selector, yet this code throws a NotImplementedError. Can someone help me understand why I am getting this error?

import requests, bs4 
res = requests.get('http://www.eecs.umich.edu/eecs/undergraduate/survey/all_survey.2016.htm') 
res.raise_for_status() 
survey = bs4.BeautifulSoup(res.text, "html.parser") 
classes = survey.select('td[colspan=3]') 

# select the 7th <td> element in every <tr> tag 
difficulty = survey.select('td[style*="border-top:none;border-left:none"]:nth-of-type(7)') 

for cell in difficulty:
    print(cell.get_text())


2016-08-17 adesotopu

The nth-of-type pseudo-class is only partially supported as well. It does not like the additional attribute condition you have applied. This one, for example, goes through:

td:nth-of-type(7)

It makes more sense, though, to check the direct tr->td relationship here:

tr > td:nth-of-type(7)
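As a rough sketch of how the positional selector can be combined with the attribute condition, the style check can be done in Python after select() instead of inside the selector. The markup below is made up for illustration and is not the real survey page; the names `wanted` and `values` are likewise hypothetical. Newer BeautifulSoup releases may accept the combined selector directly, so treat this as a workaround for versions that raise NotImplementedError.

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for the survey page (an assumption, not the real HTML).
html = """
<table>
  <tr>
    <td>Course A</td><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td>
    <td style="border-top:none;border-left:none">2.1</td>
  </tr>
  <tr>
    <td>Course B</td><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td>
    <td style="border-top:none;border-left:none">3.4</td>
  </tr>
  <tr>
    <td>Notes</td><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td>
    <td>unstyled</td>
  </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Positional part only: the seventh <td> directly inside each <tr>.
seventh_cells = soup.select("tr > td:nth-of-type(7)")

# Attribute condition applied in Python rather than in the selector.
wanted = "border-top:none;border-left:none"
values = [
    cell.get_text(strip=True)
    for cell in seventh_cells
    if wanted in cell.get("style", "")
]
print(values)
```

The third row is skipped because its seventh cell has no matching style attribute.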

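The markup of this page is, however, terrible for HTML parsing. A slightly better approach here would be to locate the starting row, the one containing a td element with the Average Score header value. Then we can go through the following tr siblings, collecting the average scores until the end of the "table".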

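# Locate the header cell containing "Average", then step up to its row.
start_row = survey.find(lambda tag: tag and tag.name == "td" and "Average" in tag.get_text(strip=True)).find_parent("tr")

# Walk the following sibling rows and print the seventh cell of each.
for row in start_row.find_next_siblings("tr"):
    cells = row.find_all("td")

    average_score = cells[6].get_text()
    print(average_score)

    # An empty cell marks the end of the "table".
    if not average_score:
        break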

This prints the average score from each row until an empty cell is reached.
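To try the sibling-walking idea without fetching the page, here is a minimal self-contained sketch on made-up markup (the table contents are invented, not taken from the survey). It differs slightly from the snippet above: it collects the scores into a hypothetical `scores` list, and it also stops on rows with fewer than seven cells, which is an added guard rather than part of the original answer.

```python
from bs4 import BeautifulSoup

# Invented markup for illustration; the real survey page is far messier.
html = """
<table>
  <tr><td>Title row</td></tr>
  <tr><td>Course</td><td>Q1</td><td>Q2</td><td>Q3</td><td>Q4</td><td>Q5</td><td>Average Score</td></tr>
  <tr><td>Course A</td><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td><td>2.1</td></tr>
  <tr><td>Course B</td><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td><td>3.4</td></tr>
  <tr><td></td><td></td><td></td><td></td><td></td><td></td><td></td></tr>
  <tr><td>Footer</td><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td><td>9.9</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Find the header cell mentioning "Average", then climb to its row.
header_cell = soup.find(
    lambda tag: tag.name == "td" and "Average" in tag.get_text(strip=True)
)
start_row = header_cell.find_parent("tr")

scores = []
for row in start_row.find_next_siblings("tr"):
    cells = row.find_all("td")
    # Added guard (assumption): a short row is treated as the end of the data.
    if len(cells) < 7:
        break
    score = cells[6].get_text(strip=True)
    # An empty seventh cell marks the end of the "table".
    if not score:
        break
    scores.append(score)

print(scores)
```

Rows after the first empty one, such as the footer row here, are never visited because the loop stops at the blank cell.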


2016-08-17 23:25:27 alecxe
