2016-06-13 7 views
1

Unable to get hotel prices from booking.com

I can't understand why an empty list is returned when I search for the class using beautifulsoup4. My code is below.

import webbrowser, requests 
from bs4 import BeautifulSoup 


res = requests.get("http://www.booking.com/searchresults.html?label=gen173nr-1FCAEoggJCAlhYSDNiBW5vcmVmaGyIAQGYATG4AQjIAQzYAQHoAQH4AQKoAgM&sid=c24fad210186ae699e89a0d3cab10039&dcid=4&checkin_monthday=18&checkin_year_month=2016-6&checkout_monthday=19&checkout_year_month=2016-6&class_interval=1&dest_id=-2092511&dest_type=city&group_adults=2&group_children=0&hlrd=0&label_click=undef&nflt=ht_id%3D204%3B&no_rooms=1&review_score_group=empty&room1=A%2CA&sb_price_type=total&sb_travel_purpose=business&score_min=0&src_elem=sb&ss=Kolkata%2C%20West%20Bengal%2C%20India&ss_raw=kolka&ssb=empty&order=score") 
res.status_code 
soup = BeautifulSoup(res.text,"lxml") 
name = [] 
rating = [] 

hotel_name = soup.select('.sr-hotel__name') 
hotel_price = soup.select('tr', class_='roomPrice') 
hotel_rating = soup.select('.js--hp-scorecard-scoreval') 

print hotel_price 
for i in range(0, 10): 
    name.append(hotel_name[i].contents[0]) 
    rating.append(hotel_rating[i].contents[0]) 
    #print name[i] 
    #print rating[i] 

Answers

2

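I had to do two things: 1. add a User-Agent header, and 2. change the selectors, because the source you get when scraping is actually different from what you see when you right-click and choose View Source in the browser. Also, the syntax of your call soup.select('tr', class_='roomPrice') is wrong: select() takes a single CSS selector string, and class_ is a keyword argument of find_all().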

In [7]: head = {"User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36"} 

In [8]: url = "http://www.booking.com/searchresults.html?label=gen173nr-1FCAEoggJCAlhYSDNiBW5vcmVmaGyIAQGYATG4AQjIAQzYAQHoAQH4AQKoAgM&sid=c24fad210186ae699e89a0d3cab10039&dcid=4&checkin_monthday=18&checkin_year_month=2016-6&checkout_monthday=19&checkout_year_month=2016-6&class_interval=1&dest_id=-2092511&dest_type=city&group_adults=2&group_children=0&hlrd=0&label_click=undef&nflt=ht_id%3D204%3B&no_rooms=1&review_score_group=empty&room1=A%2CA&sb_price_type=total&sb_travel_purpose=business&score_min=0&src_elem=sb&ss=Kolkata%2C%20West%20Bengal%2C%20India&ss_raw=kolka&ssb=empty&order=score" 

In [9]: res = requests.get(url, headers=head) 

In [10]: soup = BeautifulSoup(res.text,"html.parser") 

In [11]: hotels = soup.select("#hotellist_inner div.sr_item.sr_item_new") 

In [12]: for hotel in hotels: 
    ....:   name = hotel.select_one("span.sr-hotel__name").text.strip()
    ....:   print(name)
    ....:   score = hotel.select_one("span.average.js--hp-scorecard-scoreval") 
    ....:   print(score.text.strip()) 
    ....:   price = hotel.select_one("table div.sr-prc--num.sr-prc--final") 
    ....:   print(price.text.strip() if price else "Unavailable") 
    ....:  
The Oberoi Grand Kolkata 
9.0 
€ 113 
Taj Bengal 
9.0 
€ 113 
Sapphire Suites 
7.4 
Unavailable 
The Gateway Hotel EM Bypass Kolkata 
8.6 
€ 84 
The Lalit Great Eastern Kolkata 
8.6 
€ 101 
Swissôtel Kolkata 
8.5 
€ 86 
Kenilworth Hotel 
8.5 
€ 78 
The Fern Residency Kolkata 
8.4 
€ 84 
ITC Sonar Kolkata A Luxury Collection Hotel 
8.3 
€ 116 
Hyatt Regency 
8.3 
€ 63 
Treebo Platinum 
8.2 
€ 38 
The Corner Courtyard 
8.2 
€ 73 
Jameson Inn Shiraz 
8.0 
€ 58 
The Sonnet 
7.9 
€ 80 
Hotel Casa Fortuna 
7.9 
€ 56 
Pipal Tree Hotel 
7.9 
€ 77 

For the price rows, the select() call would be soup.select('tr.roomPrice').
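To illustrate the selector syntax mentioned above, here is a minimal sketch that runs against a small inline snippet rather than the live page. The markup is made up for illustration (the real booking.com HTML differs), and the example only shows that a dotted CSS selector passed to select() and the class_ keyword passed to find_all() pick out the same rows.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page is different.
html = """
<table>
  <tr class="roomPrice"><td>€ 113</td></tr>
  <tr class="other"><td>ignored</td></tr>
  <tr class="roomPrice sr_gs"><td>€ 84</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# CSS selector form: tag name and class joined by a dot.
by_select = soup.select("tr.roomPrice")

# find_all form: this is where the class_ keyword is valid.
by_find_all = soup.find_all("tr", class_="roomPrice")

# Collect the visible text of each matching row.
prices = [row.get_text(strip=True) for row in by_select]
print(prices)
print(len(by_select) == len(by_find_all))
```

Both forms match a row that carries additional classes as well, since the class test looks at each class on the element separately.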

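However, if you go to the page, you will see that the output above is not actually ordered by score. What we should do is use the base URL and pass the query parameters as params: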

In [20]: params = {'checkin_year_month':'2016-6', 
    ....: 'checkout_monthday':'19', 
    ....: 'checkout_year_month':'2016-6', 
    ....: 'class_interval':'1', 
    ....: 'dest_id':'-2092511', 
    ....: 'dest_type':'city', 
    ....: 'dtdisc':'0', 
    ....: 'group_adults':'2', 
    ....: 'group_children':'0', 
    ....: 'hlrd':'0', 
    ....: 'hyb_red':'0', 
    ....: 'inac':'0', 
    ....: 'label_click':'undef', 
    ....: 'nflt':'ht_id=204;', 
    ....: 'nha_red':'0', 
    ....: 'no_rooms':'1', 
    ....: 'offset':'0', 
    ....: 'order':'score', 
    ....: 'postcard':'0', 
    ....: 'redirected_from_city':'0', 
    ....: 'redirected_from_landmark':'0', 
    ....: 'redirected_from_region':'0', 
    ....: 'review_score_group':'empty', 
    ....: 'room1':'A,A', 
    ....: 'sb_price_type':'total', 
    ....: 'sb_travel_purpose':'business', 
    ....: 'score_min':'0', 
    ....: 'src_elem':'sb', 
    ....: 'ss':'Kolkata, West Bengal, India', 
    ....: 'ss_all':'0', 
    ....: 'ss_raw':'kolka', 
    ....: 'ssb':'empty', 
    ....: 'sshis':'0'} 

In [21]: head = {"User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36"} 

In [22]: url = "http://www.booking.com/searchresults.html" 

In [23]: res = requests.get(url, params=params, headers=head) 

In [24]: soup = BeautifulSoup(res.text,"html.parser") 

In [25]: hotels = soup.select("#hotellist_inner div.sr_item.sr_item_new") 

In [26]: for hotel in hotels: 
    ....:   name = hotel.select_one("span.sr-hotel__name").text.strip()
    ....:   print(name)
    ....:   score = hotel.select_one("span.average.js--hp-scorecard-scoreval") 
    ....:   print(score.text.strip()) 
    ....:   price = hotel.select_one("table div.sr-prc--num.sr-prc--final") 
    ....:   print(price.text.strip() if price else "Unavailable") 
    ....:  
The Oberoi Grand Kolkata 
9.0 
Unavailable 
Taj Bengal 
9.0 
Unavailable 
The Lalit Great Eastern Kolkata 
8.6 
Unavailable 
The Gateway Hotel EM Bypass Kolkata 
8.6 
Unavailable 
Swissôtel Kolkata 
8.5 
Unavailable 
Kenilworth Hotel 
8.5 
Unavailable 
The Fern Residency Kolkata 
8.4 
Unavailable 
ITC Sonar Kolkata A Luxury Collection Hotel 
8.3 
Unavailable 
Hyatt Regency 
8.3 
Unavailable 
Treebo Platinum 
8.2 
Unavailable 
The Corner Courtyard 
8.2 
Unavailable 
Monovilla Inn 
8.1 
Unavailable 
Jameson Inn Shiraz 
8.0 
Unavailable 
The Sonnet 
7.9 
Unavailable 
Hotel Casa Fortuna 
7.9 
Unavailable 

Here the prices are hidden, so we need to add a bit more logic. I will edit the answer shortly.
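One thing that may be worth checking first: the query string in the question's URL contains checkin_monthday=18, while the params dict above has no checkin_monthday key. Whether that is related to the prices coming back as unavailable is only a guess. A way to avoid dropping a key when converting a long URL by hand is to let the standard library split it. The sketch below uses a shortened copy of the URL from the question, and the helper name split_url is made up for illustration.

```python
from urllib.parse import urlsplit, parse_qsl

# Shortened copy of the search URL from the question (only some parameters).
SEARCH_URL = (
    "http://www.booking.com/searchresults.html"
    "?checkin_monthday=18&checkin_year_month=2016-6"
    "&checkout_monthday=19&checkout_year_month=2016-6"
    "&dest_id=-2092511&dest_type=city"
    "&nflt=ht_id%3D204%3B&room1=A%2CA"
    "&ss=Kolkata%2C%20West%20Bengal%2C%20India&order=score"
)


def split_url(url):
    """Hypothetical helper: return (base_url, params) for a URL with a query string."""
    parts = urlsplit(url)
    base = "{}://{}{}".format(parts.scheme, parts.netloc, parts.path)
    # parse_qsl percent-decodes the values, e.g. 'A%2CA' becomes 'A,A'.
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    return base, params


base_url, params = split_url(SEARCH_URL)
print(base_url)
for key in sorted(params):
    print(key, "=", params[key])
```

The resulting base_url and params can then be passed along as before with requests.get(base_url, params=params, headers=head).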

+0

Yup, that is the problem: it shows the prices as unavailable. Please help with that. – sumitroy
