2016-08-18 18 views
1

http://eastasiaeg.com/en/laptop-in-egypt」 から「Scrapy」を使用して製品のリストを削り取ろうとしています。POSTメソッドでリクエストを作成する

製品の一部が動的にロードされ、Scrapy要求を作成しようとしました。 しかし何かが間違っています。 Plsヘルプ。

# -*- coding: utf-8 -*- 
import scrapy 

from v4.items import Product 


class IntelEGEastasiaegComSpider(scrapy.Spider): 
    name = "intel_eg_eastasiaeg_com_py" 

    start_urls = [ 
      'http://eastasiaeg.com/en/laptop-in-egypt' 
     ] 

    def start_requests(self): 
     request_body = {"categoryId":"3","manufacturerId":"0","vendorId":"0","priceRangeFilterModel7Spikes":{"CategoryId":"3","ManufacturerId":"0","VendorId":"0","SelectedPriceRange":{},"MinPrice":"2400","MaxPrice":"44625"},"specificationFiltersModel7Spikes":{"CategoryId":"3","ManufacturerId":"0","VendorId":"0","SpecificationFilterGroups":[{"Id":"27","FilterItems":[{"Id":"103","FilterItemState":"Unchecked"},{"Id":"104","FilterItemState":"Unchecked"},{"Id":"105","FilterItemState":"Unchecked"},{"Id":"110","FilterItemState":"Unchecked"}]},{"Id":"11","FilterItems":[{"Id":"302","FilterItemState":"Unchecked"},{"Id":"75","FilterItemState":"Unchecked"}]},{"Id":"6","FilterItems":[{"Id":"21","FilterItemState":"Unchecked"},{"Id":"24","FilterItemState":"Unchecked"},{"Id":"25","FilterItemState":"Unchecked"},{"Id":"26","FilterItemState":"Unchecked"}]},{"Id":"5","FilterItems":[{"Id":"1069","FilterItemState":"Unchecked"},{"Id":"1078","FilterItemState":"Unchecked"},{"Id":"1118","FilterItemState":"Unchecked"},{"Id":"1862","FilterItemState":"Unchecked"}]},{"Id":"2","FilterItems":[{"Id":"8","FilterItemState":"Unchecked"},{"Id":"10","FilterItemState":"Unchecked"},{"Id":"1451","FilterItemState":"Unchecked"},{"Id":"1119","FilterItemState":"Unchecked"}]},{"Id":"8","FilterItems":[{"Id":"61","FilterItemState":"Unchecked"},{"Id":"62","FilterItemState":"Unchecked"},{"Id":"63","FilterItemState":"Unchecked"}]},{"Id":"333","FilterItems":[{"Id":"2460","FilterItemState":"Unchecked"}]}]},"attributeFiltersModel7Spikes":"null","manufacturerFiltersModel7Spikes":{"CategoryId":"3","ManufacturerFilterItems":[{"Id":"2","FilterItemState":"Unchecked"},{"Id":"1","FilterItemState":"Unchecked"},{"Id":"3","FilterItemState":"Unchecked"},{"Id":"6","FilterItemState":"Unchecked"}]},"vendorFiltersModel7Spikes":"null","pageNumber":"2","orderby":"10","viewmode":"grid","pagesize":"null","queryString":"","shouldNotStartFromFirstPage":"true","onSaleFilterModel":"null","keyword":"","searchCategoryId":"0","searchManufacturerId":"0","priceFrom":"","priceTo":"","includeSubcategories":"False","searchInProductDescriptions":"False","advancedSearch":"False","isOnSearchPage":"False"} 
     for body in request_body: 
      request_body = body 
      yield scrapy.Request('http://eastasiaeg.com/en/getFilteredProducts', 
           method="POST", 
           body=request_body, 
           callback=self.parse, 
           headers={'Content-type': 'application/json; charset=UTF-8'},) 

    def parse(self, response): 
     print response.body 
+0

'start_urls'は' start_requests() 'をオーバーライドしているため、使用していません。これはFYIの問題の原因ではありません。 – Granitosaurus

答えて

2

フォームデータとともにPOST要求を行う場合は、scrapy.FormRequestを使用する必要があります。

def start_requests(self): 
    form_data = {} # your formdata 
    yield scrapy.FormRequest(url, formdata=form_data) 

あなたのアプローチはうまくいくかもしれませんが、あなたのforループはあまり意味がありません。
for body in request_body:は、request_bodyという辞書のキーを反復処理し、基本的に24個のリクエストを本体に1つのキーで作成します。

def start_requests(self): 
    form_data = {} # your formdata 
    # Request only takes string as body so you need to 
    # convert python dict to string 
    request_body = json.dumps(form_data) 
    yield scrapy.Request('http://eastasiaeg.com/en/getFilteredProducts', 
         method="POST", 
         body=request_body, 
         headers={'Content-Type': 'application/json; charset=UTF-8'},) 
    # Usually Content-Type matters here a lot. 

P.S.:だからscrapy.Request試みでそれを行うには
scrapyリクエストはデフォルトでself.parseコールバックに設定されているため、指定する必要はありません。

+0

tnks。生きてる!!! –

関連する問題