2012-03-13 5 views
1
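Python Mechanize hangs on br.open(url) with a self._sleep(pause) traceback

I have written a Python function that rates a website against some parameters (a set of words). The function uses Python Mechanize and works fine most of the time.

However, for some websites it just hangs until I press CTRL+C in the terminal. I think this is some sort of JavaScript-related problem. Is there a way to build a timeout function around this?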

This is my function:

import mechanize

def rateSite(site_url, comparisonWords):
    # open the site
    localBrowser = mechanize.Browser()
    localBrowser.addheaders = [('User-agent', 'Mozilla/5.1 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/9.0.1')]
    localBrowser.set_handle_robots(False)
    site = localBrowser.open(site_url, timeout=5000)
    html = site.read()

    # rate the site
    rating = 0  # placeholder start value; the original does not show how rating is initialised
    for i in comparisonWords.split():
        pass  # do some rating math (omitted in the original)

    return rating

This is the traceback I get when I press CTRL+C:

site=localBrowser.open(site_url,timeout=5000) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 209, in open 
    return self._mech_open(url, data, timeout=timeout) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 236, in _mech_open 
    response = UserAgentBase.open(self, request, data) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_opener.py", line 202, in open 
    response = meth(req, response) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_http.py", line 612, in http_response 
    "http", request, response, code, msg, hdrs) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_opener.py", line 219, in error 
    result = apply(self._call_chain, args) 
    File "/usr/lib/python2.7/urllib2.py", line 372, in _call_chain 
    result = func(*args) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_http.py", line 146, in http_error_302 
    return self.parent.open(new) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 209, in open 
    return self._mech_open(url, data, timeout=timeout) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 236, in _mech_open 
    response = UserAgentBase.open(self, request, data) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_opener.py", line 202, in open 
    response = meth(req, response) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_http.py", line 612, in http_response 
    "http", request, response, code, msg, hdrs) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_opener.py", line 219, in error 
    result = apply(self._call_chain, args) 
    File "/usr/lib/python2.7/urllib2.py", line 372, in _call_chain 
    result = func(*args) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_http.py", line 146, in http_error_302 
    return self.parent.open(new) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 209, in open 
    return self._mech_open(url, data, timeout=timeout) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 236, in _mech_open 
    response = UserAgentBase.open(self, request, data) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_opener.py", line 202, in open 
    response = meth(req, response) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_http.py", line 612, in http_response 
    "http", request, response, code, msg, hdrs) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_opener.py", line 219, in error 
    result = apply(self._call_chain, args) 
    File "/usr/lib/python2.7/urllib2.py", line 372, in _call_chain 
    result = func(*args) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_http.py", line 146, in http_error_302 
    return self.parent.open(new) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 209, in open 
    return self._mech_open(url, data, timeout=timeout) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 236, in _mech_open 
    response = UserAgentBase.open(self, request, data) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_opener.py", line 202, in open 
    response = meth(req, response) 
    File "/usr/lib/python2.7/dist-packages/mechanize/_http.py", line 578, in http_response 
    self._sleep(pause) 
KeyboardInterrupt 

Any help on how to solve this, or on how to build a timeout, would be greatly appreciated.

Thank you!

Answers

1

The timeout is in seconds, so timeout=5000 is more than an hour (about 83 minutes). You probably meant timeout=5.

By default, mechanize follows at most 10 redirects before giving up; see HTTPRedirectHandler.max_redirections.
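
As a rough sketch of the fix, the snippet below passes the timeout in seconds. It also caps how long mechanize will wait on a Refresh header, because the traceback in the question ends in self._sleep(pause), which looks like that handler pausing. The helper names make_browser and fetch_html and the 1-second cap are made up for illustration; set_handle_refresh with max_time is assumed to behave as in the mechanize version you have installed.

```python
def make_browser(max_refresh_wait=1.0):
    """Build a mechanize.Browser that will not sleep long on Refresh headers."""
    # Third-party package; imported lazily so fetch_html can be used without it.
    import mechanize

    browser = mechanize.Browser()
    browser.set_handle_robots(False)
    # Assumption: only Refresh delays of at most max_refresh_wait seconds are followed.
    browser.set_handle_refresh(True, max_time=max_refresh_wait)
    return browser


def fetch_html(browser, url, timeout_s=5.0):
    """Open url with a socket timeout given in SECONDS and return the body."""
    response = browser.open(url, timeout=timeout_s)
    return response.read()
```

Inside rateSite this would replace the open/read pair, for example html = fetch_html(make_browser(), site_url).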

+0

Thank you. This seems to work. I thought the timeout value was in milliseconds. – dtrujillo
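
On the original request for a timeout function around the whole call: the socket timeout applies to each blocking network operation, not to the total time spent across redirects and pauses. One possible way to bound the total time is a wall-clock deadline built on the standard signal module. This is only a sketch; the names deadline and DeadlineExceeded are invented here, and it works only on Unix and only in the main thread.

```python
import signal
from contextlib import contextmanager


class DeadlineExceeded(Exception):
    """Raised when the wrapped block runs longer than allowed."""


@contextmanager
def deadline(seconds):
    """Abort the enclosed block after `seconds` of wall-clock time."""
    def _on_alarm(signum, frame):
        raise DeadlineExceeded("gave up after %s seconds" % seconds)

    # Install our handler and remember the previous one so it can be restored.
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)
```

It could wrap the function from the question, for example with deadline(30): rating = rateSite(site_url, comparisonWords), catching DeadlineExceeded to skip sites that take too long.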