Does importing items break my spider?
My problem is that, for some reason, the framework does not seem to find my spider. I have this spider:
import scrapy
from scrapy.http import Request
from propreties.items import htmltableitem

class SymbolspiderSpider(scrapy.Spider):
    name = "symbolspider"

    def start_requests(self):
        for i in range(0,10):
            yield Request('https://www.google.com/finance?q=%27&restype=company&noIL=1&num=50&ei=VPBjWJHKK9S7U6_dmvgM&start='+str(i))

    def parse(self, response):
        l=ItemLoader(item=htmltableitem(), response=response)
        l.add_xpath('htmltable', ".//*[@id='gf-viewc']/div/div[2]/form/table/tbody/child::*")
        return l.load_item()
When I run scrapy crawl symbolspider -o output.csv, I get this error:
Traceback (most recent call last):
  File "/usr/bin/scrapy", line 11, in <module>
    sys.exit(execute())
  File "/usr/lib/python3.5/site-packages/scrapy/cmdline.py", line 142, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/usr/lib/python3.5/site-packages/scrapy/cmdline.py", line 88, in _run_print_help
    func(*a, **kw)
  File "/usr/lib/python3.5/site-packages/scrapy/cmdline.py", line 149, in _run_command
    cmd.run(args, opts)
  File "/usr/lib/python3.5/site-packages/scrapy/commands/crawl.py", line 57, in run
    self.crawler_process.crawl(spname, **opts.spargs)
  File "/usr/lib/python3.5/site-packages/scrapy/crawler.py", line 162, in crawl
    crawler = self.create_crawler(crawler_or_spidercls)
  File "/usr/lib/python3.5/site-packages/scrapy/crawler.py", line 190, in create_crawler
    return self._create_crawler(crawler_or_spidercls)
  File "/usr/lib/python3.5/site-packages/scrapy/crawler.py", line 194, in _create_crawler
    spidercls = self.spider_loader.load(spidercls)
  File "/usr/lib/python3.5/site-packages/scrapy/spiderloader.py", line 51, in load
    raise KeyError("Spider not found: {}".format(spider_name))
KeyError: 'Spider not found: symbolspider'
The odd thing is that if I remove the line from propreties.items import htmltableitem, the spider is detected, but then an error is raised simply because the item being called is unknown. What is going on?
Edit: scrapy list returns:
/usr/lib/python3.5/site-packages/scrapy/spiderloader.py:37: RuntimeWarning:
Traceback (most recent call last):
  File "/usr/lib/python3.5/site-packages/scrapy/spiderloader.py", line 31, in _load_all_spiders
    for module in walk_modules(name):
  File "/usr/lib/python3.5/site-packages/scrapy/utils/misc.py", line 71, in walk_modules
    submod = import_module(fullpath)
  File "/usr/lib/python3.5/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 986, in _gcd_import
  File "<frozen importlib._bootstrap>", line 969, in _find_and_load
  File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 673, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 665, in exec_module
  File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
  File "/home/volt/projects/scrapy/googlefinance/googlefinance/spiders/symbolspider.py", line 4, in <module>
    from propreties.items import htmltableitem
ImportError: No module named 'propreties'
Could not load spiders from module 'googlefinance.spiders'. Check SPIDER_MODULES setting
  warnings.warn(msg, RuntimeWarning)
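The warning above suggests that the loader imports each spider module, and when one import fails it warns and skips that module instead of crashing, which would explain why the later lookup reports "Spider not found". A minimal sketch of that pattern, not Scrapy's actual code: `load_modules` is a hypothetical helper, and the sketch assumes no package named `propreties` is importable on the machine running it.

```python
import importlib
import warnings


def load_modules(names):
    """Hypothetical stand-in for a spider loader: import each module,
    and on ImportError emit a warning and skip it rather than abort."""
    loaded, failed = [], []
    for name in names:
        try:
            loaded.append(importlib.import_module(name))
        except ImportError as exc:
            # The failing module is only reported, never registered.
            warnings.warn("Could not load {!r}: {}".format(name, exc), RuntimeWarning)
            failed.append(name)
    return loaded, failed


# Record warnings so the skipped module can be inspected afterwards.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # "json" exists in the standard library; "propreties.items" is assumed missing.
    loaded, failed = load_modules(["json", "propreties.items"])

print([m.__name__ for m in loaded])
print(failed)
print(len(caught))
```

Under that assumption, anything defined in the skipped module is never registered, so looking it up by name later fails even though the file is on disk.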
And tree:
├── googlefinance
│   ├── __init__.py
│   ├── items.py
│   ├── middlewares.py
│   ├── pipelines.py
│   ├── __pycache__
│   │   ├── __init__.cpython-35.pyc
│   │   └── settings.cpython-35.pyc
│   ├── settings.py
│   └── spiders
│       ├── dataspider.py
│       ├── __init__.py
│       ├── __pycache__
│       │   ├── dataspider.cpython-35.pyc
│       │   ├── __init__.cpython-35.pyc
│       │   └── symbolspider.cpython-35.pyc
│       └── symbolspider.py
├── logs
├── output
│   └── htmltables.csv
└── scrapy.cfg
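In this tree the only items.py sits inside the googlefinance package, so the dotted path that matches the layout would presumably be googlefinance.items. The sketch below rebuilds just that part of the tree in a temporary directory and imports it. The contents of items.py here are a hypothetical placeholder, since the real file is not shown in the question.

```python
import importlib
import os
import sys
import tempfile

# Recreate only googlefinance/__init__.py and googlefinance/items.py.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "googlefinance")
os.makedirs(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "items.py"), "w") as f:
    # Placeholder class; the real htmltableitem definition is an assumption.
    f.write("class htmltableitem(dict):\n    pass\n")

# Put the temporary project root first on the import path,
# much as running from the directory holding the package would.
sys.path.insert(0, root)
importlib.invalidate_caches()

items = importlib.import_module("googlefinance.items")
item_cls = items.htmltableitem
print(item_cls.__module__)
```

If the item class really is defined in googlefinance/items.py, the import in the spider would likely read from googlefinance.items import htmltableitem.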
What does the 'scrapy list' command return? Could you also post the source of 'htmltableLoader', as well as the project directory tree? – Granitosaurus
There is no 'propreties.py' file in the directory. –
@CarlosPeña What is 'propreties.py'? I generated the project with 'startproject' and generated the two spiders with 'genspider'. – ChiseledAbs