Getting the next-page link with Scrapy

I am having trouble getting Scrapy to recognize the next-page link. When I use the XPath //a, the link does not show up. I have tried

response.xpath("//*[@id='nextpage']/a").extract()

along with several other permutations, with no luck. Here is the markup from which I am trying to extract the href="pdetail.php?instnum=2016230702&year=2016" link:

<div class=""><br> 
 
<table width="95%" align="center"> 
 
    <tbody><tr> 
 
     <td class=""></td> 
 
     <td align="center" class=""> 
 
      <h3 style="" class="Header"> 
 
       Detail Information For Instrument # 2016230701 In Year 2016   </h3> 
 
     </td> 
 

 
     <td class=""></td> 
 
    </tr> 
 
<tr> 
 
    <td class=""><div style="float:left;margin-left:30px;" id="previouspage" class=""><a href="pdetail.php?instnum=2016230700&amp;year=2016"><button style="font-size:18px;font-family: arial" type="button" class="">Previous Page</button></a> </div></td> 
 
    <td class=""></td> 
 
    <td class=""><div style="float:right;" id="nextpage" class=""><a href="pdetail.php?instnum=2016230702&amp;year=2016"><button style="font-size:18px;font-family: arial" type="button" class="">Next Page</button></a></div></td> 
 
</tr> 
 
</tbody></table>

When I run through the XPath permutations, I end up in the following loop, where the request calls back to the page itself:

2016-09-24 18:26:03 [scrapy] DEBUG: Crawled (200) <GET http://search.jeffersondeeds.com/pdetail.php?instnum=2016230701&year=2016&db=0&cnum=20> (referer: http://search.jeffersondeeds.com/pdetail.php?instnum=2016230701&year=2016&db=0&cnum=20)

Source

2016-09-24 Marcus Streips

See http://stackoverflow.com/questions/36281413/scrapy-getting-href-out-of-div. You might also want to go through the w3schools XPath tutorial: http://www.w3schools.com/xsl/xpath_intro.asp

Try this XPath:

string(//*[@id="nextpage"]/a/@href)
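To sanity-check the selector outside a spider, here is a minimal sketch that uses only Python's standard library instead of Scrapy. It assumes a trimmed, well-formed copy of the navigation row from the question (the real page is not valid XML, so this approach would not work on the full document). ElementTree supports only a subset of XPath and cannot select `@href` or evaluate `string()`, so the attribute is read with `.get()` instead.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the navigation row from the question (styling attributes removed).
FRAGMENT = """
<tr>
  <td><div id="previouspage"><a href="pdetail.php?instnum=2016230700&amp;year=2016"><button type="button">Previous Page</button></a></div></td>
  <td></td>
  <td><div id="nextpage"><a href="pdetail.php?instnum=2016230702&amp;year=2016"><button type="button">Next Page</button></a></div></td>
</tr>
""".strip()

root = ET.fromstring(FRAGMENT)

# Same idea as //*[@id="nextpage"]/a, written in ElementTree's XPath subset.
link = root.find(".//*[@id='nextpage']/a")

# The parser decodes &amp; to &, so the href is ready to use as a URL part.
next_href = link.get("href") if link is not None else None
print(next_href)
```

In Scrapy itself, `response.xpath('//*[@id="nextpage"]/a/@href').extract_first()` should give the same string, assuming the `nextpage` div is present in the response Scrapy received. Without the `/@href` step, `.extract()` returns the serialized `<a>` elements rather than the link target.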

Source

2016-09-24 21:13:03

In my script, it pulls up the current URL and loops. In the Scrapy shell it shows nothing. This is what I see on the command line: 2016-09-24 17:57:55 [scrapy] DEBUG: Crawled (200) (referer: http://search.jeffersondeeds.com/pdetail.php?instnum=2016230701&year=2016&db=0&cnum=20)

You may have been trying to click the button that comes right after the 'a' link.

Thanks Gilles - that was the correct XPath according to my XPath helper extension. Now I just need my script to recognize it!
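Once the href is extracted, it still has to be turned into an absolute URL before it can be requested. The sketch below shows that step with the standard library's `urljoin`, using the page URL from the log and the relative href from the markup in the question. The variable names are only illustrative, and this is not necessarily the cause of the loop described above.

```python
from urllib.parse import urljoin

# URL of the page currently being parsed (taken from the log in the question).
current_url = (
    "http://search.jeffersondeeds.com/pdetail.php"
    "?instnum=2016230701&year=2016&db=0&cnum=20"
)

# Relative href extracted from the "nextpage" link.
next_href = "pdetail.php?instnum=2016230702&year=2016"

# Resolve the relative link against the current page.
# The query string of the current URL is replaced, not merged.
next_url = urljoin(current_url, next_href)
print(next_url)

# If this were False, the spider would keep requesting the page it is already on.
moves_forward = next_url != current_url
print(moves_forward)
```

Inside a Scrapy callback, `response.urljoin(next_href)` performs the same resolution against the response URL, and the result can be passed to `scrapy.Request`. Logging the extracted href and the joined URL is a quick way to confirm that the next request really points at a different page.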
