Scrapy pagination is not working

I am trying to learn Scrapy, but pagination is not working in my spider:

# -*- coding: utf-8 -*- 
import scrapy 


class QuotesSpider(scrapy.Spider): 
    name = 'quotes' 
    allowed_domains = ['quotes.toscrape.com/'] 
    start_urls = ['http://quotes.toscrape.com/'] 

    def parse(self, response):
        quotes = response.xpath('//*[@class="quote"]')

        for quote in quotes:
            text = quote.xpath(".//*[@class='text']/text()").extract_first()
            author = quote.xpath("//*[@itemprop='author']/text()").extract_first()
            tags = quote.xpath(".//*[@class='tag']/text()").extract()

            item = {
                'author_name': author,
                'text': text,
                'tags': tags
            }
            yield item

        next_page_url = response.xpath("//*[@class='next']/a/@href").extract_first()
        absolute_next_page_url = response.urljoin(next_page_url)
        yield scrapy.Request(url=absolute_next_page_url, callback=self.parse)

However, Scrapy only parses the first page. What is wrong with this code? I copied it from a YouTube tutorial.

Please help.

Source

2017-12-19 raju

All requests except the first one are being filtered as "offsite". This happens because of the extra / at the end of the allowed_domains value:

allowed_domains = ['quotes.toscrape.com/']
                    # REMOVE THIS SLASH^
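
To see why the trailing slash matters, here is a minimal sketch using only the standard library. The `is_offsite` helper is hypothetical and only approximates the idea of an offsite check (compare the request's hostname against each allowed domain). It is not Scrapy's actual implementation, and the page-2 URL is assumed for illustration. The point is that a hostname never contains a slash, so an entry ending in `/` can never match it.

```python
from urllib.parse import urlparse


def is_offsite(url, allowed_domains):
    """Simplified approximation of an offsite check (not Scrapy's own code).

    A URL counts as on-site when its hostname equals an allowed domain
    or is a subdomain of one.
    """
    host = urlparse(url).hostname or ""
    for domain in allowed_domains:
        if host == domain or host.endswith("." + domain):
            return False
    return True


# Assumed next-page URL, used only for illustration.
next_page = "http://quotes.toscrape.com/page/2/"

# The hostname is just the domain part, without any slash.
print(urlparse(next_page).hostname)                      # quotes.toscrape.com

# With the trailing slash the entry never equals the hostname.
print(is_offsite(next_page, ["quotes.toscrape.com/"]))   # True

# Without the slash the hostname matches and the request is kept.
print(is_offsite(next_page, ["quotes.toscrape.com"]))    # False
```

So in the spider, the attribute should hold a bare domain: allowed_domains = ['quotes.toscrape.com'].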

Source

2017-12-19 02:11:48 alecxe
