Python, web scraping: nested loop not working


The nested loop over the variable j is not working. When I step through the code, the debugger skips past the variable before it appears to have been properly initialized.

from urllib.request import Request, urlopen 
# Get beautifulsoup4 with: pip install beautifulsoup4 
import bs4 
import pdb 
import sys 
import json 

site = "http://bgp.he.net/report/world" 
hdr = {'User-Agent': 'Mozilla/5.0'} 
req = Request(site,headers=hdr) 
page = urlopen(req) 
soup = bs4.BeautifulSoup(page, 'html.parser') 

for t in soup.find_all('td', class_='centeralign'): 
    s = str(t.string) 
    if s != "None": 
     print (s.strip()) 
     site2 = "http://bgp.he.net/country/" + s.strip() 
     req = Request(site2,headers=hdr) 
     soup2 = bs4.BeautifulSoup(page, 'html.parser') 

    for j in soup2.find_all('td'): 
     s2 = str(j.string) 
     print (j.strip())

Source

2017-07-28 Jeremy Villa

？ – Gahan

You are trying to parse the same page over and over again. – Gahan

Possible duplicate of [Extracting information from a table except header of the table using bs4](https://stackoverflow.com/questions/37635847/extracting-information-from-a-table-except-header-of-the-table-using-bs4) – stovfl
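The comment about parsing the same page repeatedly can be illustrated without any network access. In the question's code, `page` has already been consumed by the first `BeautifulSoup(page, ...)` call, so handing it to the parser a second time most likely yields an empty document, and `find_all('td')` then has nothing to iterate over. The sketch below uses `io.BytesIO` as a stand-in for the HTTP response object (an assumption made purely for illustration; the sample HTML is invented):

```python
import io

# io.BytesIO stands in for the response returned by urlopen();
# both are file-like objects that remember their read position.
page = io.BytesIO(b"<table><tr><td>US</td></tr></table>")

first = page.read()    # consumes the whole stream
second = page.read()   # nothing is left to read

print(len(first) > 0)  # True
print(second)          # b''
```

A parser fed the second, empty read sees no `td` elements at all, which would explain why the inner loop body appears to be skipped.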

from urllib.request import Request, urlopen 
# Get beautifulsoup4 with: pip install beautifulsoup4 
import bs4 
import pdb 
import sys 
import json 

site = "http://bgp.he.net/report/world" 
hdr = {'User-Agent': 'Mozilla/5.0'} 
req = Request(site,headers=hdr) 
page = urlopen(req) 
soup = bs4.BeautifulSoup(page, 'html.parser') 

for t in soup.find_all('td', class_='centeralign'):
    s = str(t.string)
    if s != "None":
        print(s.strip())
        site2 = "http://bgp.he.net/country/" + s.strip()
        req2 = Request(site2, headers=hdr)  # you missed these two lines
        page2 = urlopen(req2)
        soup2 = bs4.BeautifulSoup(page2, 'html.parser')

        for j in soup2.find_all('td'):
            s2 = str(j.text)
            print(s2.strip())  # wrong variable used by you to strip
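One way to make the "fresh request per country" step harder to forget is to wrap it in a small helper. The following is only a sketch: the names `fetch`, `country_url`, `HDR`, and the `opener` parameter are hypothetical and do not appear in the answer above. The `opener` argument exists solely so the helper can be exercised offline.

```python
from urllib.request import Request, urlopen

# Same User-Agent header as in the answer; the constant name is hypothetical.
HDR = {'User-Agent': 'Mozilla/5.0'}


def fetch(url, opener=urlopen):
    """Issue a fresh request for url and return the response body as bytes."""
    req = Request(url, headers=HDR)
    with opener(req) as resp:
        return resp.read()


def country_url(code):
    """Build the per-country URL the same way the loop above does."""
    return "http://bgp.he.net/country/" + code.strip()
```

Inside the outer loop, the body returned by `fetch(country_url(s))` could then be passed to `bs4.BeautifulSoup(..., 'html.parser')`, so that each country page is downloaded and parsed separately.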

Source

2017-07-28 13:05:20 Gahan

Thanks, I feel like an idiot –
