How can I get the number of URLs already scraped by Scrapy (request_count)?

Scrapy shows stats like this:

2016-11-18 06:41:38 [scrapy] INFO: Dumping Scrapy stats: 
{'downloader/request_bytes': 656, 
'downloader/request_count': 2, 
'downloader/request_method_count/GET': 2, 
'downloader/response_bytes': 2661, 
'downloader/response_count': 2, 
'downloader/response_status_count/200': 2, 
'finish_reason': 'finished', 
'finish_time': datetime.datetime(2016, 11, 18, 14, 41, 38, 759760), 
'item_scraped_count': 2, 
'log_count/DEBUG': 5, 
'log_count/INFO': 7, 
'response_received_count': 2, 
'scheduler/dequeued': 2, 
'scheduler/dequeued/memory': 2, 
'scheduler/enqueued': 2, 
'scheduler/enqueued/memory': 2, 
'start_time': datetime.datetime(2016, 11, 18, 14, 41, 37, 807590)}

My goal is to access response_count or request_count while the crawl is running, either from process_response or from any method of the spider.

I want to close the spider once it has scraped N URLs.

Source

2016-11-18 Umair

If you want to close the spider based on the number of requests made, I would recommend using [CLOSESPIDER_PAGECOUNT](https://doc.scrapy.org/en/latest/topics/extensions.html#closespider-pagecount) in settings.py:

settings.py

CLOSESPIDER_PAGECOUNT = 20  # stop after 20 pages have been crawled

If you still want to access the Scrapy stats from inside the spider, you can do it like this:

self.crawler.stats.get_value('downloader/request_count')  # or 'downloader/response_count'
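For the process_response case mentioned in the question, the same stats collector can also be read from a downloader middleware. The following is a minimal sketch, not a definitive implementation. The class name `StopAfterNResponses` and the `MAX_RESPONSES` setting are hypothetical names chosen for illustration. It assumes the middleware is registered in `DOWNLOADER_MIDDLEWARES` with a priority number below that of Scrapy's built-in `DownloaderStats` middleware (850 by default), so that the counter has already been updated when `process_response` runs.

```python
class StopAfterNResponses:
    """Downloader middleware sketch: close the spider after N responses.

    The class name and the MAX_RESPONSES setting are hypothetical.
    """

    def __init__(self, stats, limit):
        self.stats = stats      # the crawler's stats collector
        self.limit = limit      # number of responses after which to stop
        self.closing = False    # ensures close_spider is requested only once

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this hook to build the middleware; it gives access
        # to the stats collector and to the project settings.
        limit = crawler.settings.getint("MAX_RESPONSES", 20)
        return cls(crawler.stats, limit)

    def process_response(self, request, response, spider):
        # Read the counter maintained by the built-in DownloaderStats middleware.
        count = self.stats.get_value("downloader/response_count", 0)
        if count >= self.limit and not self.closing:
            self.closing = True
            # Ask the engine to shut the spider down gracefully.
            spider.crawler.engine.close_spider(spider, reason="response_limit_reached")
        # Always pass the response on so it is still processed normally.
        return response
```

Requests that are already in flight when the limit is reached may still complete, so the final count can end up slightly above N. For a plain page limit, CLOSESPIDER_PAGECOUNT remains the simpler option.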

Source

2016-11-19 10:52:39 eLRuLL
