2017-03-16

Cannot use an API with a username and password in Scrapy

This curl request works:

https://user:[email protected]/v1/convert_from.json/?from=1000000&to=SGD&amount=AED,AUD,BDT&inverse=True 

However, this Scrapy request does not work:

yield scrapy.Request("https://justanalyticspteltd65986537:[email protected]/v1/convert_from.json/?from=1000000&to=SGD&amount=AED,AUD,BDT&inverse=True") 

It returns this error: 

Traceback (most recent call last): 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\twisted\internet\defer.py", line 1297, in _inlineCallbacks 
    result = result.throwExceptionIntoGenerator(g) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\twisted\python\failure.py", line 389, in throwExceptionIntoGenerator 
    return g.throw(self.type, self.value, self.tb) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\scrapy\core\downloader\middleware.py", line 43, in process_request 
    defer.returnValue((yield download_func(request=request,spider=spider))) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\scrapy\utils\defer.py", line 45, in mustbe_deferred 
    result = f(*args, **kw) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 65, in download_request 
    return handler.download_request(request, spider) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 61, in download_request 
    return agent.download_request(request) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 286, in download_request 
    method, to_bytes(url, encoding='ascii'), headers, bodyproducer) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\twisted\web\client.py", line 1596, in request 
    endpoint = self._getEndpoint(parsedURI) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\twisted\web\client.py", line 1580, in _getEndpoint 
    return self._endpointFactory.endpointForURI(uri) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\twisted\web\client.py", line 1456, in endpointForURI 
    uri.port) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\scrapy\core\downloader\contextfactory.py", line 59, in creatorForNetloc 
    return ScrapyClientTLSOptions(hostname.decode("ascii"), self.getContext()) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\twisted\internet\_sslverify.py", line 1201, in __init__ 
    self._hostnameBytes = _idnaBytes(hostname) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\twisted\internet\_sslverify.py", line 87, in _idnaBytes 
    return idna.encode(text) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\idna\core.py", line 355, in encode 
    result.append(alabel(label)) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\idna\core.py", line 276, in alabel 
    check_label(label) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\idna\core.py", line 253, in check_label 
    raise InvalidCodepoint('Codepoint {0} at position {1} of {2} not allowed'.format(_unot(cp_value), pos+1, repr(label))) 
InvalidCodepoint: Codepoint U+003A at position 28 of u'xxxxxxxxxxxxxxxxxxxxxxxxxxxx:[email protected]' not allowed 

You left your credentials in the second URL. A 500 status code means the server encountered an error while processing the request, so something went wrong on its side. – Granitosaurus


I have updated my question. Previously, I had not disabled the crawler.

Answers


Scrapy does not support HTTP authentication via credentials embedded in the URL. You need to use HttpAuthMiddleware instead.

In settings.py:

DOWNLOADER_MIDDLEWARES = {
    'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware': 811,
}

In the spider:

from scrapy.spiders import CrawlSpider 

class SomeIntranetSiteSpider(CrawlSpider): 

    http_user = 'someuser' 
    http_pass = 'somepass' 
    name = 'intranet.example.com' 

    # .. rest of the spider code omitted ... 
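
If the credentials only ever arrive as part of a URL, one possible workaround is to strip them out before building the request and send them as a Basic `Authorization` header. The sketch below uses only the Python standard library; the helper name `split_credentials` and the host `api.example.com` are hypothetical and not part of Scrapy or of the original question.

```python
import base64
from urllib.parse import urlsplit, urlunsplit, unquote


def split_credentials(url):
    """Return (clean_url, headers), moving any user:pass@ part of the URL
    into an HTTP Basic Authorization header. Hypothetical helper."""
    parts = urlsplit(url)
    if parts.username is None:
        # No credentials embedded in the URL: nothing to do.
        return url, {}

    # Rebuild the network location without the "user:pass@" prefix.
    host = parts.hostname or ""
    if ":" in host:
        # IPv6 literals need their brackets restored.
        host = "[" + host + "]"
    if parts.port is not None:
        host = "{}:{}".format(host, parts.port)
    clean_url = urlunsplit(
        (parts.scheme, host, parts.path, parts.query, parts.fragment)
    )

    # Percent-encoded characters in the userinfo are decoded before encoding.
    user = unquote(parts.username)
    password = unquote(parts.password or "")
    token = base64.b64encode(
        "{}:{}".format(user, password).encode("utf-8")
    ).decode("ascii")
    return clean_url, {"Authorization": "Basic " + token}


clean_url, headers = split_credentials(
    "https://someuser:somepass@api.example.com/v1/convert?to=SGD"
)
```

Inside a spider, the result could then be passed on as `scrapy.Request(clean_url, headers=headers)`, so the hostname handed to the TLS layer no longer contains the colon that triggers `InvalidCodepoint`.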

There is an open pull request that implements reading credentials from the URL: https://github.com/scrapy/scrapy/pull/1466