2016-07-02 3 views

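Missing code when decoding bytes (HTML) in Python

I am a beginner in Python. I am trying to get the source code of a web page so that I can work with its HTML elements.

However, when I convert the bytes to utf-8, part of the HTML code disappears. Here is my code: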

import urllib.request 

req = urllib.request.Request('http://avast.softonic.com/') 
response = urllib.request.urlopen(req) 
the_page = response.read() 

For example, in "the_page", the content of the DIV with the id "review_data" is:

\n\n\t\t\t\t\t\t\t\t\t\t<div id="review_data" class="track_links">\n\t\t\t\t\t\t\t\t\t\t\t\t<p><!--[lead]-->Los expertos en soluciones antivirus gratuitas conocen bien el Avast Free Antivirus 2016, y probablemente ya lo hayan instalado alguna vez. Este software es <strong>uno de los l\xc3\xadderes en su campo</strong>, proporcionando un s\xc3\xb3lido conjunto de defensas contra virus y malware, as\xc3\xad como algunas otras herramientas \xc3\xbatiles que ni se imagina. Mejor a\xc3\xban, <strong>Avast es uno de los antivirus menos intrusivos</strong>, quiz\xc3\xa1 no tanto en los \xc3\xbaltimos a\xc3\xb1os, pero sigue siendo un sistema mucho menos acaparador que los dos grandes antivirus.\r<br /><!--[/lead]--></p>\r<p><!--[features]--><!--[subfeatures]--><h3>Lleno de caracter\xc3\xadsticas.</h3><!--[/subfeatures]--></p>\r<p>Una gran ventaja del Avast Free Antivirus 2016 es su conjunto de caracter\xc3\xadsticas. Aunque estas caracter\xc3\xadsticas han provocado que el tama\xc3\xb1o de instalaci\xc3\xb3n sea mayor (se recomienda hasta 2 GB de espacio de disco duro disponible), no deber\xc3\xada resultar un problema para la mayor\xc3\xada de los discos duros modernos, adem\xc3\xa1s incluye gran cantidad de herramientas de forma gratuita. Aparte de la exploraci\xc3\xb3n antivirus est\xc3\xa1ndar, que se mantiene firme con<strong> actualizaciones peri\xc3\xb3dicas</strong>, la \xc3\xbaltima versi\xc3\xb3n de Avast tiene la seguridad de red dom\xc3\xa9stica que detecta vulnerabilidades para todos los dispositivos conectados a la red. <strong>La \xc3\xbaltima versi\xc3\xb3n, la actualizaci\xc3\xb3n \'Nitro\', tambi\xc3\xa9n a\xc3\xb1ade un navegador dedicado llamado Avast SafeZone</strong>. Aclamado como el navegador m\xc3\xa1s seguro del mundo, es a la vez un software inflado con car\xc3\xa1cter gratuito. Para aquellos a los que les importa la seguridad, especialmente en lo que se refiere a cuestiones bancarias, el programa resulta ser una bendici\xc3\xb3n. 
El <strong>bloqueador de anuncios incorporado</strong> puede ser un regalo del cielo a la hora de visitar ciertos sitios. Otra nueva caracter\xc3\xadstica es Cybercapture, lo que pone en cuarentena los archivos entrantes sospechosos. Las v\xc3\xadctimas de los virus sabr\xc3\xa1n la importancia de este buffer.\r<br /><!--[/features]--></p>\r<p><!--[usability]--><!--[subusability]--><h3>Una interfaz sencilla y eficaz</h3><!--[/subusability]--></p>\r<p>Avast ha cambiado varias veces a lo largo de los a\xc3\xb1os y la actualizaci\xc3\xb3n Nitro no es una excepci\xc3\xb3n, pero por suerte su dise\xc3\xb1o parece haber permanecido constante. El programa es <strong>simple y f\xc3\xa1cil de usar, con botones definidos y textos claros</strong> en colores agradables. Avast Free Antivirus 2016 se asentar\xc3\xa1 en la bandeja del sistema hasta que se necesite, al igual que la mayor\xc3\xada del software antivirus, se expande cuando se abre en una ventana peque\xc3\xb1a sin fronteras con apariencia elegante y coincide con el esquema de dise\xc3\xb1o de Windows 10. La mayor\xc3\xada de las secciones de este programa son bastante f\xc3\xa1ciles de seguir, con un gran conjunto de botones para las herramientas e iconos est\xc3\xa1ndar, como una rueda dentada para acceder a la configuraci\xc3\xb3n. Por supuesto, siempre puedes actualizar pulsando el bot\xc3\xb3n premium, anim\xc3\xa1ndole a descargar y pagar por Avast Premier. Sin embargo, esto no es obligatorio. Cada una de las principales caracter\xc3\xadsticas de Avast tiene su propia secci\xc3\xb3n, tales como la seguridad de Internet, el navegador SafeZone y la exploraci\xc3\xb3n inteligente, as\xc3\xad que realmente nada puede ir mal.\r<br /><!--[/usability]--></p>\r<p><!--[conclusion]--><!--[subconclusion]--><h3>Las mejores cosas de la vida son gratis</h3><!--[/subconclusion]--></p>\r<p>Para un programa gratuito, <strong>Avast es realmente excelente</strong>. 
S\xc3\xad, se ha perdido algo de su sensaci\xc3\xb3n m\xc3\xa1s independiente de ediciones pasadas, pero eso es solo un peque\xc3\xb1o precio para un software libre de estas caracter\xc3\xadsticas. Avast Free Antivirus 2016 es menos intrusivo en su navegaci\xc3\xb3n diaria y es muy sencillo de utilizar, por lo que sigue siendo una de las principales soluciones gratuitas.\r<br /><!--[/conclusion]--></p>\n\t\t\t\t\t\t\t\t\t\t\t</div> 

However, when I try to run any of the following:

import urllib.request 

req = urllib.request.Request('http://avast.softonic.com/') 
response = urllib.request.urlopen(req) 
the_page = response.read() 
html_missing_elements = the_page.decode('utf-8') 

or:

import requests 

r = requests.get('http://avast.softonic.com/') 
html_missing_elements = r.text 

or:

import urllib.request 
from bs4 import BeautifulSoup 

req = urllib.request.Request('http://avast.softonic.com/') 
response = urllib.request.urlopen(req) 
the_page = response.read() 
html_missing_elements = BeautifulSoup(the_page) 

I cannot get the full original HTML code of the page. Following the example, the DIV with the id "review_data" contains only:

<div id="review_data" class="track_links"><br /><!--[/conclusion]--></p></div> 

There is code missing, and I would like to know why.

Thank you.

Answers

There are some carriage returns, i.e. \r, embedded in the HTML, for example:

\r<br /><!--[/lead]--></p>\r 
>\r<p>A big plus point for Avast Free Antivirus 2016 

and more.

If you remove them, everything should work fine in your IDE, and you can see the content of the tag when you print it:

soup = BeautifulSoup(r.content.replace(b"\r",b"")) 
print(soup.select_one("#review_data")) 
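
As a self-contained illustration that needs no network access, the sketch below applies the same replacement to a small hard-coded byte string. The `sample` bytes and the `strip_carriage_returns` helper are made up for this example; they are not taken from the page or from any library.

```python
def strip_carriage_returns(raw: bytes) -> bytes:
    """Return raw with every carriage return byte (0x0D) removed."""
    return raw.replace(b"\r", b"")


# Hypothetical sample imitating the kind of markup described above.
sample = b'<div id="review_data">\r<p>l\xc3\xadderes</p>\r<br /></div>'

cleaned = strip_carriage_returns(sample)
text = cleaned.decode("utf-8")

print(sample.count(b"\r"))  # how many carriage returns the raw bytes held
print("\r" in text)         # whether any survive after cleaning
print(text)
```

Decoding itself does not drop anything here; the only bytes that go missing are the carriage returns removed on purpose.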

The data is actually there; your IDE is just not showing it because of the carriage returns:

soup = BeautifulSoup(r.content,"lxml") 
print(soup.select_one("#review_data")) 

Using PyCharm, this will output:

<div class="track_links" id="review_data"> 
<br/><!--[/conclusion]--></p> 
</div> 

But using:

print(soup.select_one("#review_data").text) 

will output:

\nConnoisseurs of free antivirus solutions will already know of Avast Free Antivirus 2016 and have probably installed it at some point or another. This software is one of the leaders in its field, providing a robust suite of defences against viruses and malware, as well as some other useful tools that you might not expect. Better still, Avast is one of the less intrusive antivirus programs- perhaps less so in recent years, but still a lot less system-hogging than the big two.\r Brimming with features A big plus point for Avast Free Antivirus 2016 is its suite of features. Although these features have caused its install size to increase (up to 2GB hard drive space is recommended!), it shouldn’t prove an issue for most modern hard drives and you do get a lot of tools for free. Aside from the standard antivirus scanning, which is kept sharp with constant updates, the latest version of Avast has home network security which detects vulnerabilities for all devices connected to your network. The latest version, the ‘Nitro’ update, also adds a dedicated Avast browser called SafeZone. Heralded as the world’s safest browser, this could equally be argued as bloatware and a great free feature. For those who are security conscious, especially regarding banking, it should be seen as beneficial. The in-built ad blocker can be a godsend when visiting certain sites. Another new feature is CyberCapture, which quarantines any suspicious incoming files. Victims of viruses will know the importance of this buffer.\r A simple and effective interface Avast has changed a few times over the years and the Nitro update is no different, but thankfully their design approach seems to have remained constant. The program is simple and straightforward to use, with bold buttons and clear text in friendly colours. 
Avast Free Antivirus 2016 will sit in the system tray until needed, like most antivirus software, then expand when opened into a small borderless window that looks sleek matching the Windows 10 design scheme. Most sections of this are easy enough to follow, with a large set of buttons for the tools and standard icons like a cog for accessing settings. Of course, you’re also never far away from a premium upgrade button, encouraging you to download and pay for Avast Premier. However, this is not forced upon you. Each of the main features of Avast has its own section, such as internet security, the SafeZone browser and Smart Scan, so you really can’t go wrong.\r The best things in life are free For a free program, Avast is pretty impressive. Yes, it has lost some of its independent feel as the years have gone by, but that’s a small price for a great bit of free software. Avast Free Antivirus 2016 will interfere with your everyday browsing less than the bigger names in software. It’s very simple to use, therefore remains one of the top free solutions.\r\n' 

If you ran the same code in IPython, you would see the correct output using just soup = BeautifulSoup(r.content,"lxml"):

In [5]: soup = BeautifulSoup(r.content,"lxml") 

In [6]: soup.select_one("#review_data") 
Out[6]: 
<div class="track_links" id="review_data"> 
<p><!--[lead]-->Connoisseurs of free antivirus solutions will already know of Avast Free Antivirus 2016 and have probably installed it at some point or another. This software is one of the leaders in its field, providing a <strong>robust suite of defences against viruses and malware</strong>, as well as some other useful tools that you might not expect. Better still, Avast is one of the less intrusive antivirus ` 
<br/><!--[/lead]--></p> <p><!--[features]--><!--[subfeatures]--></p><h3>Brimming with features</h3><!--[/subfeatures]--> <p>A big plus point for Avast Free Antivirus 2016 is its suite of features. Although these features have caused its install size to increase (up to 2GB hard drive space is recommended!), it shouldn’t prove an issue for most modern hard drives and you do get a lot of tools for free.</p> <p>Aside from the standard antivirus scanning, which is kept sharp with constant updates, the latest version of Avast has <strong>home network security</strong> which detects vulnerabilities for all devices connected to your network.</p> <p>The latest version, the ‘Nitro’ update, also adds a dedicated Avast browser called <strong>SafeZone</strong>. Heralded as the world’s safest browser, this could equally be argued as bloatware and a great free feature. For those who are security conscious, especially regarding banking, it should be seen as beneficial. The in-built ad blocker can be a godsend when visiting certain sites. Another new feature is <strong>CyberCapture</strong>, which quarantines any suspicious incoming files. Victims of viruses will know the importance of this buffer. 
<br/><!--[/features]--></p> <p><!--[usability]--><!--[subusability]--></p><h3>A simple and effective interface</h3><!--[/subusability]--> <p>Avast has changed a few times over the years and the <strong>Nitro update</strong> is no different, but thankfully their design approach seems to have remained constant. The program is <strong>simple and straightforward</strong> to use, with bold buttons and clear text in friendly colours.</p> <p>Avast Free Antivirus 2016 will sit in the system tray until needed, like most antivirus software, then expand when opened into a small borderless window that looks sleek matching the Windows 10 design scheme. Most sections of this are easy enough to follow, with a large set of buttons for the tools and standard icons like a cog for accessing settings.</p> <p>Of course, you’re also never far away from a premium upgrade button, encouraging you to download and pay for <a href="http://avast-premier-antivirus.en.softonic.com" title="Avast Premier">Avast Premier</a>. However, this is not forced upon you.</p> <p>Each of the main features of Avast has its own section, such as <strong>internet security</strong>, the SafeZone browser and <strong>Smart Scan</strong>, so you really can’t go wrong. 
<br/><!--[/usability]--></p> <p><!--[conclusion]--><!--[subconclusion]--></p><h3>The best things in life are free</h3><!--[/subconclusion]--> <p>For a free program, Avast is pretty impressive. Yes, it has lost some of its independent feel as the years have gone by, but that’s a small price for a great bit of free software. Avast Free Antivirus 2016 will interfere with your everyday browsing less than the bigger names in software. It’s very simple to use, therefore remains <strong>one of the top free solutions</strong>. 
<br/><!--[/conclusion]--></p> 
</div> 
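
One way to confirm that the content is present, regardless of how a particular console displays it, is to inspect the `repr()` of the string or to test for a known substring instead of relying on the printed form. A minimal sketch, using a made-up `html` string rather than the real page:

```python
# Hypothetical decoded markup containing carriage returns.
html = '<div id="review_data">\r<p>Some review text</p>\r<br /></div>'

# repr() escapes control characters, so each carriage return appears
# as the two visible characters backslash and r instead of moving the cursor.
print(repr(html))

# Checks that do not depend on how a console renders the string.
print("Some review text" in html)
print(html.count("\r"))
```

If the substring test succeeds, the text was never lost; only its display was affected.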

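It has nothing to do with the encoding; it is the carriage returns that mess up the output, depending on where you run the code. You can see how the output is affected by running the simple example below: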

In [14]: s = "foo\bar" 

In [15]: print(s) 
foar 
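
To make the effect easier to reason about, here is a rough sketch of how a simple terminal treats backspace (`\b`, as in the example above) and carriage return (`\r`): neither is drawn as a character; they only move the cursor, so later characters overwrite earlier ones. The function `render_like_terminal` is a hypothetical helper written for this illustration, and real consoles (including IDE consoles) may behave differently, for example by clearing the whole line on `\r`.

```python
def render_like_terminal(text: str) -> str:
    """Approximate what a basic terminal would display for text.

    "\\r" moves the cursor to the start of the line, "\\b" moves it one
    column left, and any other character overwrites the cell under the cursor.
    """
    rendered_lines = []
    for line in text.split("\n"):
        cells = []  # characters currently visible on this line
        col = 0     # cursor column
        for ch in line:
            if ch == "\r":
                col = 0
            elif ch == "\b":
                col = max(col - 1, 0)
            else:
                if col < len(cells):
                    cells[col] = ch  # overwrite an existing cell
                else:
                    cells.append(ch)
                col += 1
        rendered_lines.append("".join(cells))
    return "\n".join(rendered_lines)


print(render_like_terminal("foo\bar"))    # foar
print(render_like_terminal("first\rXY"))  # XYrst
```

Under this simplified model, text that follows a carriage return replaces what was already on the line, which is consistent with content appearing to vanish even though it is still in the string.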