How do I use Selenium WebDriver findElement(By.xpath())?
I am trying to get information from the public website of the Brazilian Congress: the list of votes, plus the voting session, the date, and the name of the table, from this site: http://www2.camara.leg.br/atividade-legislativa/plenario/chamadaExterna.html?link=http://www.camara.gov.br/internet/votacao/mostraVotacao.asp?ideVotacao=6706&tipo=partido
I am using Python 3 with Selenium WebDriver and PhantomJS:
from selenium import webdriver
path_to_phantomjs = '/Users/George/Documents/phantomjs/phantomjs-2.1.1-windows/bin/phantomjs'
browser = webdriver.PhantomJS(executable_path = path_to_phantomjs)
browser.get("http://www2.camara.leg.br/atividade-legislativa/plenario/chamadaExterna.html?link=http://www.camara.gov.br/internet/votacao/mostraVotacao.asp?ideVotacao=6706&tipo=partido")
nome_votacao = browser.find_element_by_xpath("//[@id='corpoVotacao']/p[3]/text()")
data_votacao = browser.find_element_by_xpath("//[@id='corpoVotacao']/div[1]/div[1]/div/div/p[1]/text()[1]")
list_deputados = browser.find_elements_by_xpath(".//table[@class='tabela-2']")
However, I seem to be selecting the wrong location.
For nome_votacao, I get this error message:
---------------------------------------------------------------------------
InvalidSelectorException Traceback (most recent call last)
<ipython-input-11-e67933637ae0> in <module>()
----> 1 nome_votacao = browser.find_element_by_xpath("//[@id='corpoVotacao']/p[3]/text()")
c:\users\george\appdata\local\programs\python\python36-32\code\votos\lib\site-packages\selenium\webdriver\remote\webdriver.py in find_element_by_xpath(self, xpath)
363 driver.find_element_by_xpath('//div/td[1]')
364 """
--> 365 return self.find_element(by=By.XPATH, value=xpath)
366
367 def find_elements_by_xpath(self, xpath):
c:\users\george\appdata\local\programs\python\python36-32\code\votos\lib\site-packages\selenium\webdriver\remote\webdriver.py in find_element(self, by, value)
841 return self.execute(Command.FIND_ELEMENT, {
842 'using': by,
--> 843 'value': value})['value']
844
845 def find_elements(self, by=By.ID, value=None):
c:\users\george\appdata\local\programs\python\python36-32\code\votos\lib\site-packages\selenium\webdriver\remote\webdriver.py in execute(self, driver_command, params)
306 response = self.command_executor.execute(driver_command, params)
307 if response:
--> 308 self.error_handler.check_response(response)
309 response['value'] = self._unwrap_value(
310 response.get('value', None))
c:\users\george\appdata\local\programs\python\python36-32\code\votos\lib\site-packages\selenium\webdriver\remote\errorhandler.py in check_response(self, response)
192 elif exception_class == UnexpectedAlertPresentException and 'alert' in value:
193 raise exception_class(message, screen, stacktrace, value['alert'].get('text'))
--> 194 raise exception_class(message, screen, stacktrace)
195
196 def _value_or_default(self, obj, key, default):
InvalidSelectorException: Message: {"errorMessage":"Unable to locate an element with the xpath expression //[@id='corpoVotacao']/p[3]/text() because of the following error:\nError: INVALID_EXPRESSION_ERR: DOM XPath Exception 51","request":{"headers":{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"118","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:55799","User-Agent":"Python http auth"},"httpVersion":"1.1","method":"POST","post":"{\"using\": \"xpath\", \"value\": \"//[@id='corpoVotacao']/p[3]/text()\", \"sessionId\": \"366665f0-be24-11e7-aa25-75da268b98e2\"}","url":"/element","urlParsed":{"anchor":"","query":"","file":"element","directory":"/","path":"/element","relative":"/element","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/element","queryKey":{},"chunks":["element"]},"urlOriginal":"/session/366665f0-be24-11e7-aa25-75da268b98e2/element"}}
Screenshot: available via screen
For data_votacao, which was supposed to contain "October 11, 2015, 08:15 PM", I get this error message:
---------------------------------------------------------------------------
InvalidSelectorException Traceback (most recent call last)
<ipython-input-12-24c43341f310> in <module>()
----> 1 data_votacao = browser.find_element_by_xpath("//[@id='corpoVotacao']/div[1]/div[1]/div/div/p[1]/text()[1]")
c:\users\george\appdata\local\programs\python\python36-32\code\votos\lib\site-packages\selenium\webdriver\remote\webdriver.py in find_element_by_xpath(self, xpath)
363 driver.find_element_by_xpath('//div/td[1]')
364 """
--> 365 return self.find_element(by=By.XPATH, value=xpath)
366
367 def find_elements_by_xpath(self, xpath):
c:\users\george\appdata\local\programs\python\python36-32\code\votos\lib\site-packages\selenium\webdriver\remote\webdriver.py in find_element(self, by, value)
841 return self.execute(Command.FIND_ELEMENT, {
842 'using': by,
--> 843 'value': value})['value']
844
845 def find_elements(self, by=By.ID, value=None):
c:\users\george\appdata\local\programs\python\python36-32\code\votos\lib\site-packages\selenium\webdriver\remote\webdriver.py in execute(self, driver_command, params)
306 response = self.command_executor.execute(driver_command, params)
307 if response:
--> 308 self.error_handler.check_response(response)
309 response['value'] = self._unwrap_value(
310 response.get('value', None))
c:\users\george\appdata\local\programs\python\python36-32\code\votos\lib\site-packages\selenium\webdriver\remote\errorhandler.py in check_response(self, response)
192 elif exception_class == UnexpectedAlertPresentException and 'alert' in value:
193 raise exception_class(message, screen, stacktrace, value['alert'].get('text'))
--> 194 raise exception_class(message, screen, stacktrace)
195
196 def _value_or_default(self, obj, key, default):
InvalidSelectorException: Message: {"errorMessage":"Unable to locate an element with the xpath expression //[@id='corpoVotacao']/div[1]/div[1]/div/div/p[1]/text()[1] because of the following error:\nError: INVALID_EXPRESSION_ERR: DOM XPath Exception 51","request":{"headers":{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"143","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:55799","User-Agent":"Python http auth"},"httpVersion":"1.1","method":"POST","post":"{\"using\": \"xpath\", \"value\": \"//[@id='corpoVotacao']/div[1]/div[1]/div/div/p[1]/text()[1]\", \"sessionId\": \"366665f0-be24-11e7-aa25-75da268b98e2\"}","url":"/element","urlParsed":{"anchor":"","query":"","file":"element","directory":"/","path":"/element","relative":"/element","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/element","queryKey":{},"chunks":["element"]},"urlOriginal":"/session/366665f0-be24-11e7-aa25-75da268b98e2/element"}}
Screenshot: available via screen
And list_deputados comes back as an empty list.
data_votacao was supposed to hold the date and time of the vote. In the inspector, its XPath is: //*[@id="corpoVotacao"]/p[2]/text()[1]
nome_votacao was supposed to have this content: "MPV Nº 688/2015 - PROJETO DE LEI DE CONVERSÃO - Nominal Eletrônica".
And list_deputados was supposed to hold the complete table of votes (name, UF, and vote), starting from the row "Parlamentar". In the inspector it has this XPath: //*[@id="listagem"]/table and class="tabela-2".
Does anyone know the correct form of these commands? Or a tutorial?
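Two things stand out in the expressions above. First, `//[@id='corpoVotacao']` has no node test after `//`, whereas the inspector's version reads `//*[@id=...]`; without the `*` the expression is not valid XPath. Second, Selenium's find-element calls return elements, so an expression ending in `text()` cannot be returned as a WebElement. The usual pattern is to select the element itself and then read its text (in Selenium, through the element's `.text` attribute). The sketch below shows that pattern with the standard library's `xml.etree.ElementTree` instead of Selenium so that it runs on its own. The markup in `SAMPLE` and the helper `element_text` are hypothetical placeholders, not the real page or its data.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the page markup.
# All text values are placeholders, not data from the real site.
SAMPLE = """
<html><body>
  <div id="corpoVotacao">
    <p>Placeholder heading</p>
    <p>Placeholder date</p>
    <p>Placeholder vote title</p>
    <div id="listagem">
      <table class="tabela-2">
        <tr><td>Parlamentar</td><td>UF</td><td>Voto</td></tr>
        <tr><td>Placeholder name</td><td>XX</td><td>Placeholder vote</td></tr>
      </table>
    </div>
  </div>
</body></html>
"""

root = ET.fromstring(SAMPLE)


def element_text(node):
    """Join all text inside an element and trim surrounding whitespace."""
    return "".join(node.itertext()).strip()


# Select the <p> element itself (note the '*' node test and no trailing text()).
title_el = root.find(".//*[@id='corpoVotacao']/p[3]")
title = element_text(title_el)

# Select the table rows, then read each cell's text.
rows = [
    [element_text(cell) for cell in tr.findall("td")]
    for tr in root.findall(".//table[@class='tabela-2']/tr")
]

print(title)
print(rows)
```

With Selenium the same idea would be to pass `//*[@id='corpoVotacao']/p[3]` to the find call and read `.text` on the result. Whether `p[3]` or `p[2]` is the right index depends on the live page, which this sketch does not check.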
Thank you. I am testing on a machine running Ubuntu. When I run this, I get an error:
path_to_phantomjs = '/home/reinaldo/Documentos/phanthomjs/phantomjs-2.1.1-linux-x86_64'
driver = webdriver.PhantomJS(executable_path = path_to_phantomjs)
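The error text is not shown in the comment, so the cause can only be guessed at. One difference is visible, though: the Windows path in the question ends at `bin/phantomjs`, while the Ubuntu path ends at the `phantomjs-2.1.1-linux-x86_64` directory. Assuming `executable_path` must name the binary itself, a quick check like the one below may help rule that out. The helper name `looks_like_executable` is hypothetical, and the sketch uses only the standard library.

```python
import os
import tempfile


def looks_like_executable(path):
    """Return True when path is an existing regular file with execute permission."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


# A directory is not an executable file, so this check fails for it.
print(looks_like_executable(tempfile.gettempdir()))
```

If the check returns False for the value of `path_to_phantomjs`, pointing the variable at the binary inside the extracted folder would be the next thing to try.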