BS4 select() method

This is my code for crawling and parsing the information I need from wordsinasentence.com, which provides useful context sentences for a given word:

# First, import requests to fetch the HTML from the target page.
# In this case the website is https://wordsinasentence.com

import requests 

target = input("The word you want to search : ") 

res = requests.get("https://wordsinasentence.com/"+ target+"-in-a-sentence/") 

# Also check the response status so that a failed request is flagged
try: 
    res.raise_for_status() 
except Exception as e: 
    print("There's a problem while connecting to a wordsinasentence server:", e)

# The raw response is hard to read, so we need to parse it.
# Use BeautifulSoup to make it readable.

import bs4 
html_soup = bs4.BeautifulSoup(res.text, 'html.parser') 

# Check that it has been parsed correctly.
# Now extract the definition of the target word.

keywords = html_soup.select('Definition')

When I run the select() call above, it keeps returning nothing but an empty list, even though printing out the html_soup variable shows the following:

<p onclick='responsiveVoice.speak("not done for any particular reason; chosen or done at random");' style="font-weight: bold; font-family:Arial; font-size:20px; color:#504A4B;padding-bottom:0px;">Definition of Arbitrary</p> 

[]

What could be the problem?

Source

2017-11-09 Quanlisp

The problem is that you are using the wrong method to find text (select() is made for CSS selectors). You can use find_all with the string keyword and a function to select the tag you are looking for:

def has_text_def(s):  
    return s and s.startswith('Definition of') 

definitions = html_soup.find_all('p', string=has_text_def)
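To make the difference concrete, here is a minimal sketch run against a small hand-written HTML snippet. The snippet is an assumption modelled on the markup quoted in the question, not the real page, and the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet modelled on the markup shown in the question.
html = """
<div>
  <p style="font-weight: bold;">Definition of Arbitrary</p>
  <p>not done for any particular reason; chosen or done at random</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# select() takes a CSS selector, so 'Definition' is read as a tag name,
# and no such tag exists in the document.
by_selector = soup.select("Definition")

# A valid CSS selector matches tags by name, class, id and so on, not by text.
all_paragraphs = soup.select("p")


def has_text_def(s):
    # s is the tag's string, or None when the tag has no single string.
    return s is not None and s.startswith("Definition of")


# find_all() with string= keeps only the <p> tags whose text passes the test.
headings = soup.find_all("p", string=has_text_def)

print(len(by_selector))
print(len(all_paragraphs))
print([p.get_text() for p in headings])
```

With this snippet, select('Definition') finds nothing, select('p') finds both paragraphs, and find_all narrows the result to the heading paragraph only.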

Then you need to get the next element in the tree (with next_sibling) to access the definition:

for p in definitions: 
    print(p.next_sibling.next_sibling.text)
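The reason next_sibling is called twice is probably that the first sibling is the whitespace text node between the two tags. The sketch below illustrates this on an assumed two-paragraph snippet (not the real page) and shows find_next_sibling as an alternative that skips text nodes.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a heading paragraph, a newline, then the definition.
html = (
    "<p>Definition of Arbitrary</p>\n"
    "<p>not done for any particular reason; chosen or done at random</p>"
)

soup = BeautifulSoup(html, "html.parser")
heading = soup.find(
    "p", string=lambda s: s is not None and s.startswith("Definition of")
)

# The first sibling is the newline text node between the two tags.
first = heading.next_sibling
# The second sibling is the <p> tag that holds the definition.
second = first.next_sibling

# find_next_sibling() ignores text nodes and returns the next matching tag.
definition = heading.find_next_sibling("p")

print(repr(str(first)))
print(second.get_text())
print(definition is second)
```

If the real page has a different amount of whitespace or extra tags between the two paragraphs, find_next_sibling('p') is likely the more robust choice, since it does not depend on counting nodes.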

Source

2017-11-09 20:05:32 PRMoureu
