2017-02-21 2 views
http://www.kif.re.kr/kif2/publication/pub_list.aspx?menuid=17 

I am building a crawler, but I cannot get it to move to the next page. Pagination does not work while crawling.

<a class="pagebutton" href="javascript:__doPostBack('ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl12','')">2</a> 

This is the HTML for the link to page 2.
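In ASP.NET Web Forms, a `javascript:__doPostBack(target, argument)` link does not navigate to a new URL. It writes its two arguments into the hidden `__EVENTTARGET` and `__EVENTARGUMENT` form fields and submits the page's form with POST. As a minimal sketch, the target can be pulled out of such an `href` with a regular expression; the helper name `parse_postback` is made up for illustration, and the pattern assumes single-quoted arguments as in the link above.

```python
import re

# Matches __doPostBack('target','argument') with single-quoted arguments.
POSTBACK_RE = re.compile(r"__doPostBack\('([^']*)','([^']*)'\)")


def parse_postback(href):
    """Return (event_target, event_argument), or None if href is not a postback link."""
    match = POSTBACK_RE.search(href)
    return match.groups() if match else None


href = ("javascript:__doPostBack("
        "'ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl12','')")
target, argument = parse_postback(href)
print(target)
```

The extracted target is the value a crawler would send as `__EVENTTARGET` when imitating a click on that link.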

According to the browser's developer tools, the request is sent with the POST method.

Request URL:http://www.kif.re.kr/kif2/publication/pub_list.aspx?menuid=17 

Below is the form data shown in the developer tools.

__EVENTTARGET:ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl12 
__EVENTARGUMENT: 
__VIEWSTATE:/wEPDwUKMTg4Nzc2Nzc3OA9kFgJmD2QWAgIED2QWAgIDD2QWAmYPZBYEZg9kFgQCAQ8QZBAVBwbsoJzrqqkG7KCA7J6QDOuwnOqwhOyXsOyblAbqtoztmLgJ7J2Y66Kw7LKYBuuqqeywqAbsmpTslb0VBwgyNDAvMjQxLwgyNDAvMjQyLwgyNDAvMjU5LwgyNDAvMjYyLwgyNDAvMzM3LwgyNDAvMjYzLwgyNDAvMjY2LxQrAwdnZ2dnZ2dnZGQCAw8PZBYCHgpvbmtleXByZXNzBV5pZiAoZXZlbnQua2V5Q29kZSA9PSAxMykge19fZG9Qb3N0QmFjaygnY3RsMDAkQ29udGVudFBsYWNlSG9sZGVyMSRkYXRhX2xpc3QxJGlidFNlYXJjaCcsJycpfTsgZAIDDw9kFgIeBWFsaWduBQZjZW50ZXIWAgIDD2QWAmYPZBYCZg9kFhYCBA8PFggeBFRleHQFATEeCENzc0NsYXNzBQdjdXJyZW50HgRfIVNCAgIeB1Zpc2libGVnZGQCBg8PFggfAgUBMh8DBQpwYWdlYnV0dG9uHwQCAh8FZ2RkAggPDxYIHwIFATMfAwUKcGFnZWJ1dHRvbh8EAgIfBWdkZAIKDw8WCB8CBQE0HwMFCnBhZ2VidXR0b24fBAICHwVnZGQCDA8PFggfAgUBNR8DBQpwYWdlYnV0dG9uHwQCAh8FZ2RkAg4PDxYIHwIFATYfAwUKcGFnZWJ1dHRvbh8EAgIfBWdkZAIQDw8WCB8CBQE3HwMFCnBhZ2VidXR0b24fBAICHwVnZGQCEg8PFggfAgUBOB8DBQpwYWdlYnV0dG9uHwQCAh8FZ2RkAhQPDxYIHwIFATkfAwUKcGFnZWJ1dHRvbh8EAgIfBWdkZAIWDw8WCB8CBQIxMB8DBQpwYWdlYnV0dG9uHwQCAh8FZ2RkAhsPDxYCHwIFCyZuYnNwWzEvODNdZGQYAQUeX19Db250cm9sc1JlcXVpcmVQb3N0QmFja0tleV9fFgcFGWN0bDAwJG1lbnVfbmF2MSRpYnRTZWFyY2gFLmN0bDAwJENvbnRlbnRQbGFjZUhvbGRlcjEkZGF0YV9saXN0MSRpYnRTZWFyY2gFMWN0bDAwJENvbnRlbnRQbGFjZUhvbGRlcjEkZGF0YV9saXN0MSRpYnRTZWFyY2hBbGwFPmN0bDAwJENvbnRlbnRQbGFjZUhvbGRlcjEkZGF0YV9saXN0MSRXZWJQYWdlTmF2aWdhdG9yVjIxJGN0bDA2BT5jdGwwMCRDb250ZW50UGxhY2VIb2xkZXIxJGRhdGFfbGlzdDEkV2ViUGFnZU5hdmlnYXRvclYyMSRjdGwwOAU+Y3RsMDAkQ29udGVudFBsYWNlSG9sZGVyMSRkYXRhX2xpc3QxJFdlYlBhZ2VOYXZpZ2F0b3JWMjEkY3RsMzAFPmN0bDAwJENvbnRlbnRQbGFjZUhvbGRlcjEkZGF0YV9saXN0MSRXZWJQYWdlTmF2aWdhdG9yVjIxJGN0bDMyuFjgj5nepdWXkOAwNYww+divJYtYSrYgHZpTcewu9Ds= 
__VIEWSTATEGENERATOR:E95FE49A 
__EVENTVALIDATION:/wEdACHcOKX2MiW8o3JKug67fnRBm/LuJNf32p7npb2HQkdSHj2jQIPNrpQqFhY2rmhcQzOr90YGqna/Dtr3eCnJKH/FRrctoJJXOcc5nzwqquFEKe/f6ybfmfBBwP5V9TZX05svUiuWBMoi40eiFXgXu/HvnPjbm91I+Oz3HACj/rejcfKu91e/rwNa3qahKk8QP//P3Ctl3lcnXTxti+MHToVFJ4X5e7akN9M5YNbryOCPFUzWTSqkhEUajNOJze2BA47TqM8vDP0IP5ki4KWYQixH1ITUrNZx490LfBrUZBBPZp6DDFbb0FBaxN5KpyeciB3wOyFRvNC7wvyrzR4zZIFKvsDwEoIoZw4QpAfkYvtGlm/erM6tYMUIO2Y+EofXRtI5fpcvmMZwp9oWz1DjjMQ7kMX3NKB1EbRuWhW/PUV26RCgECz38VETCqQlHmY2JJfazoydmTWb206Gy1R0dPzbnPz5BKeIBWlSOZDH/jTFFrzBKTtWpKGoPFsObJHPJ/aat3bwhGesAEcXWRHlLMcB7+Yj6K/9RPZv/XJ9M8z/IAbi3aAtkyVcWc7DpsPsia8+XWZOcmYS4tf4O30N13XKSyM1xB3zywxlTxuxx1lP5+GDugiF+Yf+KojuR7Az4t0LDho3RsEd/ZN7ejUxBtxfh6oqlZNMy4/Raz+OSUeRTRVfoUMGNPEUTwp88pek/ycTkyMA26w5UfW8JGdFRvrmOA59JlLF9OIGGWESn/RCnw== 
ctl00$agentPlatform:1 
ctl00$menu_nav1$tbxSearchWord: 
ctl00$ContentPlaceHolder1$data_list1$ddlSearchItem:240/241/ 
ctl00$ContentPlaceHolder1$data_list1$tbxSearch: 
ctl00$ContentPlaceHolder1$data_list1$hdnSearchText: 
ctl00$ContentPlaceHolder1$data_list1$hdnSearchPath:240/241/ 
ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl00:0 
ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl01:1 
ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl02:821 

Below is my code. It runs without errors, but the value of r.text is not what I want.

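     # Excerpt from the body of the crawler function.
     # Requires: import requests; from bs4 import BeautifulSoup
     url = 'http://www.kif.re.kr/kif2/publication/pub_list.aspx?menuid=17' 
     source_code = requests.get(url) 
     plain_text = source_code.text 
     soup = BeautifulSoup(plain_text, 'lxml') 

     # Take the links inside the first centered <td>.
     pageTag = soup.findAll('td', align='center') 

     inputTag = pageTag[0].findAll('a') 

     for link in inputTag: 
      print(link['href']) 
      # Form data copied by hand from the browser's developer tools.
      payload = {'__EVENTTARGET' : 'ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl12', 
         '__EVENTARGUMENT' : '', 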
         '__VIEWSTATE' 
         : '/wEPDwUKMTg4Nzc2Nzc3OA9kFgJmD2QWAgIED2QWAgIDD2QWAmYPZBYEZg9kFgQCAQ8QZBAVBwbsoJzrqqkG7KCA7J6QDOuwnOqwhOyXsOyblAbqtoztmLgJ7J2Y66Kw7LKYBuuqqeywqAbsmpTslb0VBwgyNDAvMjQxLwgyNDAvMjQyLwgyNDAvMjU5LwgyNDAvMjYyLwgyNDAvMzM3LwgyNDAvMjYzLwgyNDAvMjY2LxQrAwdnZ2dnZ2dnZGQCAw8PZBYCHgpvbmtleXByZXNzBV5pZiAoZXZlbnQua2V5Q29kZSA9PSAxMykge19fZG9Qb3N0QmFjaygnY3RsMDAkQ29udGVudFBsYWNlSG9sZGVyMSRkYXRhX2xpc3QxJGlidFNlYXJjaCcsJycpfTsgZAIDDw9kFgIeBWFsaWduBQZjZW50ZXIWAgIDD2QWAmYPZBYCZg9kFhYCBA8PFggeBFRleHQFATEeCENzc0NsYXNzBQdjdXJyZW50HgRfIVNCAgIeB1Zpc2libGVnZGQCBg8PFggfAgUBMh8DBQpwYWdlYnV0dG9uHwQCAh8FZ2RkAggPDxYIHwIFATMfAwUKcGFnZWJ1dHRvbh8EAgIfBWdkZAIKDw8WCB8CBQE0HwMFCnBhZ2VidXR0b24fBAICHwVnZGQCDA8PFggfAgUBNR8DBQpwYWdlYnV0dG9uHwQCAh8FZ2RkAg4PDxYIHwIFATYfAwUKcGFnZWJ1dHRvbh8EAgIfBWdkZAIQDw8WCB8CBQE3HwMFCnBhZ2VidXR0b24fBAICHwVnZGQCEg8PFggfAgUBOB8DBQpwYWdlYnV0dG9uHwQCAh8FZ2RkAhQPDxYIHwIFATkfAwUKcGFnZWJ1dHRvbh8EAgIfBWdkZAIWDw8WCB8CBQIxMB8DBQpwYWdlYnV0dG9uHwQCAh8FZ2RkAhsPDxYCHwIFCyZuYnNwWzEvODNdZGQYAQUeX19Db250cm9sc1JlcXVpcmVQb3N0QmFja0tleV9fFgcFGWN0bDAwJG1lbnVfbmF2MSRpYnRTZWFyY2gFLmN0bDAwJENvbnRlbnRQbGFjZUhvbGRlcjEkZGF0YV9saXN0MSRpYnRTZWFyY2gFMWN0bDAwJENvbnRlbnRQbGFjZUhvbGRlcjEkZGF0YV9saXN0MSRpYnRTZWFyY2hBbGwFPmN0bDAwJENvbnRlbnRQbGFjZUhvbGRlcjEkZGF0YV9saXN0MSRXZWJQYWdlTmF2aWdhdG9yVjIxJGN0bDA2BT5jdGwwMCRDb250ZW50UGxhY2VIb2xkZXIxJGRhdGFfbGlzdDEkV2ViUGFnZU5hdmlnYXRvclYyMSRjdGwwOAU+Y3RsMDAkQ29udGVudFBsYWNlSG9sZGVyMSRkYXRhX2xpc3QxJFdlYlBhZ2VOYXZpZ2F0b3JWMjEkY3RsMzAFPmN0bDAwJENvbnRlbnRQbGFjZUhvbGRlcjEkZGF0YV9saXN0MSRXZWJQYWdlTmF2aWdhdG9yVjIxJGN0bDMyuFjgj5nepdWXkOAwNYww+divJYtYSrYgHZpTcewu9Ds=', 
         '__VIEWSTATEGENERATOR' : 'E95FE49A', 
         '__EVENTVALIDATION' : '/wEdACHcOKX2MiW8o3JKug67fnRBm/LuJNf32p7npb2HQkdSHj2jQIPNrpQqFhY2rmhcQzOr90YGqna/Dtr3eCnJKH/FRrctoJJXOcc5nzwqquFEKe/f6ybfmfBBwP5V9TZX05svUiuWBMoi40eiFXgXu/HvnPjbm91I+Oz3HACj/rejcfKu91e/rwNa3qahKk8QP//P3Ctl3lcnXTxti+MHToVFJ4X5e7akN9M5YNbryOCPFUzWTSqkhEUajNOJze2BA47TqM8vDP0IP5ki4KWYQixH1ITUrNZx490LfBrUZBBPZp6DDFbb0FBaxN5KpyeciB3wOyFRvNC7wvyrzR4zZIFKvsDwEoIoZw4QpAfkYvtGlm/erM6tYMUIO2Y+EofXRtI5fpcvmMZwp9oWz1DjjMQ7kMX3NKB1EbRuWhW/PUV26RCgECz38VETCqQlHmY2JJfazoydmTWb206Gy1R0dPzbnPz5BKeIBWlSOZDH/jTFFrzBKTtWpKGoPFsObJHPJ/aat3bwhGesAEcXWRHlLMcB7+Yj6K/9RPZv/XJ9M8z/IAbi3aAtkyVcWc7DpsPsia8+XWZOcmYS4tf4O30N13XKSyM1xB3zywxlTxuxx1lP5+GDugiF+Yf+KojuR7Az4t0LDho3RsEd/ZN7ejUxBtxfh6oqlZNMy4/Raz+OSUeRTRVfoUMGNPEUTwp88pek/ycTkyMA26w5UfW8JGdFRvrmOA59JlLF9OIGGWESn/RCnw==', 
         'ctl00$agentPlatform' : '1', 
         'ctl00$menu_nav1$tbxSearchWord' : '', 
         'ctl00$ContentPlaceHolder1$data_list1$ddlSearchItem' : '240/241/', 
         'ctl00$ContentPlaceHolder1$data_list1$tbxSearch' : '', 
         'ctl00$ContentPlaceHolder1$data_list1$hdnSearchText':'', 
         'ctl00$ContentPlaceHolder1$data_list1$hdnSearchPath' : '240/241/', 
         'ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl00' : '0', 
         'ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl01' : '1', 
         'ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl02' : '821' 
         } 
      r = requests.post('http://www.kif.re.kr/kif2/publication/pub_list.aspx?menuid=17', data=payload) 

      print(r.text) 
      return 

How can I move to the next page?

Answers

'ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl01' : '1' 

This field controls the page number. It is zero-based, so to go to page 2 you set it to 1. To request other pages, make the value a variable:

for i in range(page_number): 
    .... 
    'ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl01' : i 
    .... 
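The loop above can be sketched as follows. This is an untested sketch under several assumptions: that the server accepts the page index in the field named in this answer, that a single `event_target` value works for every page, and that the hidden `__VIEWSTATE` and `__EVENTVALIDATION` values must be re-read from each response instead of being pasted in once, since these values are generated by the server for each response. The names `FormFieldCollector`, `build_payload`, `fetch` and `crawl` are made up for illustration, and the sketch uses only the standard library instead of `requests` and BeautifulSoup.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request, urlopen

URL = 'http://www.kif.re.kr/kif2/publication/pub_list.aspx?menuid=17'
PAGE_FIELD = 'ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl01'


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs of hidden and text <input> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'input':
            return
        attr = dict(attrs)
        kind = (attr.get('type') or 'text').lower()
        # Image and submit buttons are skipped: a browser only sends the
        # one that was clicked. <select> elements are not collected here.
        if kind in ('hidden', 'text') and attr.get('name'):
            self.fields[attr['name']] = attr.get('value') or ''


def build_payload(html, event_target, page_index):
    """Build POST data from the current page's own form fields."""
    collector = FormFieldCollector()
    collector.feed(html)
    payload = dict(collector.fields)
    payload['__EVENTTARGET'] = event_target
    payload['__EVENTARGUMENT'] = ''
    payload[PAGE_FIELD] = str(page_index)  # zero-based, per this answer
    return payload


def fetch(url, payload=None):
    """GET the url, or POST it when a payload is given; return decoded text."""
    data = urlencode(payload).encode('utf-8') if payload is not None else None
    with urlopen(Request(url, data=data)) as response:
        charset = response.headers.get_content_charset() or 'utf-8'
        return response.read().decode(charset, errors='replace')


def crawl(event_target, last_page_index):
    """Return the HTML of pages 0..last_page_index (zero-based indexes)."""
    html = fetch(URL)
    pages = [html]
    for page_index in range(1, last_page_index + 1):
        # Each request reuses the hidden state from the previous response.
        html = fetch(URL, build_payload(html, event_target, page_index))
        pages.append(html)
    return pages
```

Calling `crawl(target, 2)` with the `__EVENTTARGET` value from the question would request the first three pages. If the server ignores the posted index, the returned pages will all look like page 1, which is worth checking before relying on this approach.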
I solved it. In the end, I had to use Selenium to click through the pages. – StackQ
