Python crawler: handling a "Load More" button

I am learning to write crawlers in Python, and I would like to know how to deal with the "Load More" button at the following URL:

https://www.photo.net/search/#//Sort-View-Count/All-Categories/All-Time/Page-1

I am trying to crawl all of the images.

My current code uses BeautifulSoup:

from urllib.request import build_opener, HTTPCookieProcessor
from http.cookiejar import CookieJar
from bs4 import BeautifulSoup

url = 'https://www.photo.net/search/#//Sort-View-Count/All-Categories/All-Time/Page-1'

# Keep cookies across requests
cj = CookieJar()
opener = build_opener(HTTPCookieProcessor(cj))

try:
    p = opener.open(url)
    soup = BeautifulSoup(p, 'html.parser')
except Exception as e:
    print(str(e))

Source

2017-06-05 Harry

Have you tried incrementing the page number at the end of the URL? You could loop through a few pages and scrape what is there. – briansrls

Yes, I tried that, but nothing more loads unless that button is clicked. – Harry
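One likely reason that incrementing the page number does not help with a plain HTTP client: in this URL the page number sits in the fragment (the part after `#`), and a fragment is never sent to the server. The sketch below uses only the standard library; `base` and `request_target` are names made up for illustration.

```python
from urllib.parse import urldefrag

# The search URL from the question, with the page number left as a placeholder
base = 'https://www.photo.net/search/#//Sort-View-Count/All-Categories/All-Time/Page-{}'


def request_target(url):
    """Return the URL without its fragment, i.e. the part an HTTP client can request."""
    return urldefrag(url).url


# Every page number collapses to the same request target
targets = {request_target(base.format(n)) for n in range(1, 4)}
print(targets)
```

Because every page resolves to the same document, the extra results presumably have to be loaded by JavaScript running in a browser.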

Well, I have a solution for you.

Try the Selenium module for Python.

1) Install Selenium via pip
2) Download the Chrome driver

Here is an example of how to use it:

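from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

url = 'https://www.photo.net/search/#//Sort-View-Count/All-Categories/All-Time/Page-1'

# Older Selenium style; newer releases expect the driver path in a Service object instead
browser = webdriver.Chrome('Path to chrome driver')
browser.get(url)
while True:
    try:
        # Wait up to 10 seconds for the "Load More" link to become clickable
        button = WebDriverWait(browser, 10).until(
            EC.element_to_be_clickable((By.LINK_TEXT, 'Load More')))
    except TimeoutException:
        break  # no button appeared in time, so assume everything is loaded
    button.click()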
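Once the button no longer appears, the rendered HTML is available from `browser.page_source` and can be searched for image URLs. The following is a minimal sketch, assuming the photos are ordinary `<img src="...">` tags (the site's real markup was not checked). It uses the standard library's `HTMLParser` rather than BeautifulSoup to stay dependency-free; `ImageCollector`, `extract_image_urls` and the sample markup are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collects the src attribute of every <img> tag, resolved against base_url."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != 'img':
            return
        src = dict(attrs).get('src')
        if src:
            # Turn relative paths into absolute URLs
            self.urls.append(urljoin(self.base_url, src))


def extract_image_urls(html, base_url):
    collector = ImageCollector(base_url)
    collector.feed(html)
    return collector.urls


# Hypothetical markup standing in for browser.page_source
sample = ('<div><img src="/photos/1.jpg">'
          '<img src="https://example.com/2.jpg"><img alt="no src"></div>')
print(extract_image_urls(sample, 'https://www.photo.net/search/'))
```

With the Selenium session from the answer, the call would be `extract_image_urls(browser.page_source, browser.current_url)`.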

Source

2017-09-27 15:26:57
