Extracting data from multiple web pages with Python


Can anyone help me? I want to extract the customer name, customer review, and time from 460 pages of a site into a CSV file.

2017-10-27 Asha Anandan

If the site you want to scrape is always the same, you can use Selenium. It is easier and faster, but you need to know the HTML of the page. It is the better solution when you always have to scrape the same page, for example:

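from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Selenium 4 API; change the chromedriver path as needed
path_to_chromedriver = "C:\\Users\\user\\AppData\\Local\\Programs\\Python\\Python36-32\\chromedriver.exe"
browser = webdriver.Chrome(service=Service(path_to_chromedriver))

url = "http://www.mouthshut.com/mobile-operators/Reliance-Jio-reviews-925812061"
browser.get(url)

# Select every profile block on the page
res = browser.find_elements(By.CSS_SELECTOR, 'div.col-2.profile')
for item in res:
    try:
        # Print the target of the first link inside the block
        user = item.find_element(By.TAG_NAME, "a")
        print(user.get_attribute("href"))
    except NoSuchElementException as e:
        print("ERROR", e)

browser.quit()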

You can also use XPath; see http://selenium-python.readthedocs.io/locating-elements.html
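
The question asks for the name, review, and time from all 460 pages in a CSV file, while the snippet above only prints links from a single page. The remaining steps could be sketched roughly as follows. This is only a sketch under assumptions: the function names are hypothetical, the "-page-N" URL suffix is a guess at the pagination scheme and must be checked against the real site, and `scrape_page` stands for a function you write yourself (for example with Selenium) that returns the (name, review, time) tuples found on one page.

```python
import csv
from typing import Callable, Iterable, Iterator, List, Tuple

Review = Tuple[str, str, str]  # (customer name, review text, time)


def page_urls(base_url: str, pages: int) -> List[str]:
    """Build one URL per page.

    ASSUMPTION: the "-page-N" suffix is a hypothetical pagination scheme;
    check how the real site numbers its pages and adjust.
    """
    return [base_url] + [f"{base_url}-page-{n}" for n in range(2, pages + 1)]


def scrape_all(urls: Iterable[str],
               scrape_page: Callable[[str], Iterable[Review]]) -> Iterator[Review]:
    """Yield the reviews from every page, skipping pages that fail."""
    for url in urls:
        try:
            yield from scrape_page(url)
        except Exception as exc:  # one bad page should not stop the whole run
            print("ERROR", url, exc)


def write_csv(path: str, reviews: Iterable[Review]) -> int:
    """Write the reviews to a CSV file and return the number of rows written."""
    count = 0
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["customer_name", "review", "time"])
        for row in reviews:
            writer.writerow(row)
            count += 1
    return count
```

With a `scrape_page` of your own, the whole job would then be something like `write_csv("reviews.csv", scrape_all(page_urls(url, 460), scrape_page))`. Passing the page scraper in as a function keeps the CSV and paging logic independent of the selectors, which are not given in the question and have to be read from the page's HTML.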


2017-10-27 11:25:40 tascio
