Problem scraping whoscall.in with BeautifulSoup 4

My Python script uses BeautifulSoup, but it can't seem to grab the words inside a div on the page. Is there a particular reason for this? I can grab the profile images and count the number of messages, but not the text itself.

(For reference, this is the page I'm using: http://whoscall.in/1/2392247496/)

import re
import time

import requests
from bs4 import BeautifulSoup

# website, teleWho, worksheet and idx are defined earlier in the script.
if website == "1":
    reqInput = "http://whoscall.in/1/%s/" % teleWho
    print(reqInput)
    time.sleep(1)  # pause between requests
    requestRec = requests.get(reqInput)
    soup = BeautifulSoup(requestRec.content, "lxml")
    noMatch = soup.find(text=re.compile(r"no reports yet on the phone number"))
    print(requestRec.content)  # only if needed
    if noMatch is None:
        # The "no reports" notice is absent, so the number has reports.
        worksheet.write(idx+1, 2, "Got a hit")
        # Each report shows the default avatar, so count those images.
        howMany = soup.find_all('img', {'src': '/default-avatar.gif'})
        howManyAreThere = len(howMany)
        worksheet.write(idx+1, 1, howManyAreThere)
        print(howManyAreThere)
        # This is the line that fails to find anything.
        scamNum = soup.find_all(text=("scam"), recursive=True)
        # ,'scam','Scammer','scammer'
        scamCount = len(scamNum)
        print(scamNum)
        searchTerms = {scamCount: scamCount}
        sentiment = max(searchTerms, key=searchTerms.get)
        worksheet.write(idx+1, 3, sentiment)

I can't seem to pull the text "scam" off the page. I'm not sure why it refuses to find that text, since the rest of the Beautiful Soup code works perfectly.

https://github.com/GarnetSunset/Haircuttery/

Source

2017-04-21 GarnetSunset

Change this line:

scamNum = soup.find_all(text=("scam"),recursive=True)

to:

scamNum = [ div.text for div in soup.find_all('div', {'style':'font-size:14px; margin:10px; overflow:hidden'}) if 'scam' in div.text.lower() ]
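To illustrate why the original line finds nothing, here is a minimal sketch. It assumes a made-up HTML snippet (SAMPLE_HTML is hypothetical, not the real whoscall.in markup) and uses the built-in "html.parser" so it runs without lxml. A plain string filter only matches text nodes that are exactly equal to that string, whereas a compiled regex or a substring test on div.text also matches longer comments. The string= argument used here is the newer name for text= in Beautiful Soup 4.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical markup standing in for the comment divs on a report page.
SAMPLE_HTML = """
<div style="font-size:14px; margin:10px; overflow:hidden">This is a SCAM call</div>
<div style="font-size:14px; margin:10px; overflow:hidden">Just a wrong number</div>
<div style="font-size:14px; margin:10px; overflow:hidden">scam</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# A plain string filter matches only text nodes equal to "scam".
exact = soup.find_all(string="scam")

# A compiled regex is searched inside each text node, so substrings match.
pattern = soup.find_all(string=re.compile("scam", re.IGNORECASE))

# The approach from the answer: select the comment divs, then test their text.
STYLE = "font-size:14px; margin:10px; overflow:hidden"
by_div = [div.text for div in soup.find_all("div", {"style": STYLE})
          if "scam" in div.text.lower()]

print(len(exact), len(pattern), len(by_div))
```

Matching on the exact style attribute is fragile if the site changes its inline styles; the regex filter avoids that dependency but searches every text node on the page, not only the comment divs.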

For multiple words, try this:

words = [ 'word1', 'word2', ... ]
scamNum = [ div.text for div in soup.find_all('div', {'style':'font-size:14px; margin:10px; overflow:hidden'}) if any(word in div.text.lower() for word in words) ]
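If the goal is to record which keyword shows up most often (the question's code builds a searchTerms dict and takes its max), a per-keyword tally may be a better fit than one combined list. The following is only a sketch: count_keywords, the sample comments, and the keyword list are all hypothetical, and the comments would in practice be the div.text values collected above.

```python
from collections import Counter


def count_keywords(comments, words):
    """Count how many comments mention each keyword (case-insensitive)."""
    counts = Counter()
    for comment in comments:
        lowered = comment.lower()
        for word in words:
            if word in lowered:
                counts[word] += 1
    return counts


# Hypothetical comment texts and keywords, for illustration only.
comments = [
    "Total SCAM, do not answer",
    "Scammer asking for gift cards",
    "Wrong number",
]
words = ["scam", "spam", "robocall"]

counts = count_keywords(comments, words)
# Most frequent keyword, or None when no keyword matched at all.
top = counts.most_common(1)[0][0] if counts else None
print(counts, top)
```

Because this is a substring test, "scam" also counts comments that contain "Scammer", which seems to be what the commented-out variants in the question were after.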

Source

2017-04-21 20:12:22

Worked perfectly. You're the best. :) – GarnetSunset

Thanks, glad I could help –

One last thing, I thought you'd like this. https://github.com/GarnetSunset/Haircuttery/commit/93a12ab60eea9df36339a1378966f8f9fd0ecc78 – GarnetSunset
