How do I remove an element from web-scraped data in Python 2.7?
I used BeautifulSoup and soup.findAll to get at the relevant information, but I want to remove one value inside the <TR> tags (between <TR>...</TR>).
How can I do that?
.
.
.
soup = BeautifulSoup(x, 'lxml')
tab6col = soup.findAll('table', { "class" : "tab6col" })
Here is my HTML:
[<table border="0" class="tab6col" id="pm">\n<tr><td>\xa0</td><td align="right" class="contentword"><b>2015. \xe9v</b></td><td align="right" class="contentword"><b>2014. \xe9v</b></td><td align="right" class="contentword"><b>2013. \xe9v</b></td><td align="right" class="contentword"><b>2012. \xe9v</b></td><td align="right" class="contentword"><b>2011. \xe9v</b></td></tr><tr><td class="contentword"><b>Besz\xe1mol\xe1si id\xf5szak</b></td><td align="right" class="contentword"><span class="pm_idoszak">2015.01.01. - 2015.12.31.</span></td><td align="right" class="contentword"><span class="pm_idoszak">2014.01.01. - 2014.12.31.</span></td><td align="right" class="contentword"><span class="pm_idoszak">2013.12.30. - 2013.12.31.</span></td><td align="right" class="contentword"><span class="pm_idoszak">Nincs adat.</span></td><td align="right" class="contentword"><span class="pm_idoszak">Nincs adat.</span></td></tr><tr><td>\xa0</td><td align="right" class="contentword">eFt</td><td align="right" class="contentword">eFt</td><td align="right" class="contentword">eFt</td><td align="right" class="contentword">eFt</td><td align="right" class="contentword">eFt</td></tr><tr><td class="contentword">\xc9rt\xe9kes\xedt\xe9s nett\xf3 \xe1rbev\xe9tele</td><td align="right" class="numberc"></td><td align="right" class="numberc"></td><td align="right" class="numberc"></td><td align="right" class="numberc">Nincs adat.</td><td align="right" class="numberc">Nincs adat.</td></tr><tr><td class="contentword">Bev\xe9telek</td><td align="right" class="numberc">2 873 821</td><td align="right" class="numberc">3 162 742</td><td align="right" class="numberc">9 194</td><td align="right" class="numberc"></td><td align="right" class="numberc"></td></tr><tr><td class="contentword">\xdczemi eredm\xe9ny</td><td align="right" class="numberc">81 937</td><td align="right" class="numberc">-181 850</td><td align="right" class="numberc">1 755</td><td align="right" class="numberc">Nincs adat.</td><td align="right" 
class="numberc">Nincs adat.</td></tr><tr><td class="contentword">Ad\xf3z\xe1s el\xf5tti eredm\xe9ny</td><td align="right" class="numberc">-192 778</td><td align="right" class="numberc">-169 476</td><td align="right" class="numberc">1 755</td><td align="right" class="numberc">Nincs adat.</td><td align="right" class="numberc">Nincs adat.</td></tr><tr><td class="contentword">M\xe9rleg szerinti eredm\xe9ny</td><td align="right" class="numberc">-124 099</td><td align="right" class="numberc">0</td><td align="right" class="numberc">1 421</td><td align="right" class="numberc">Nincs adat.</td><td align="right" class="numberc">Nincs adat.</td></tr><tr><td class="contentword">Ad\xf3zott eredm\xe9ny</td><td align="right" class="numberc">-192 778</td><td align="right" class="numberc">-169 476</td><td align="right" class="numberc">1 579</td><td align="right" class="numberc">Nincs adat.</td><td align="right" class="numberc">Nincs adat.</td></tr><tr><td class="contentword">Eszk\xf6z\xf6k \xf6sszesen</td><td align="right" class="numberc">37 820 881</td><td align="right" class="numberc">40 695 842</td><td align="right" class="numberc">36 992 091</td><td align="right" class="numberc">Nincs adat.</td><td align="right" class="numberc">Nincs adat.</td></tr><tr><td class="contentword">Befektetett eszk\xf6z\xf6k</td><td align="right" class="numberc">18 668 826</td><td align="right" class="numberc">18 525 063</td><td align="right" class="numberc">16 925 711</td><td align="right" class="numberc">Nincs adat.</td><td align="right" class="numberc">Nincs adat.</td></tr><tr><td class="contentword">Forg\xf3eszk\xf6z\xf6k</td><td align="right" class="numberc">19 008 587</td><td align="right" class="numberc">21 877 275</td><td align="right" class="numberc">19 792 420</td><td align="right" class="numberc">Nincs adat.</td><td align="right" class="numberc">Nincs adat.</td></tr><tr><td class="contentword">P\xe9nzeszk\xf6z\xf6k</td><td align="right" class="numberc">947 015</td><td align="right" 
class="numberc">1 056 101</td><td align="right" class="numberc">1 307 515</td><td align="right" class="numberc">Nincs adat.</td><td align="right" class="numberc">Nincs adat.</td></tr><tr><td class="contentword">Akt\xedv id\xf5beli elhat\xe1rol\xe1sok</td><td align="right" class="numberc">143 468</td><td align="right" class="numberc">293 504</td><td align="right" class="numberc">273 960</td><td align="right" class="numberc">Nincs adat.</td><td align="right" class="numberc">Nincs adat.</td></tr><tr><td class="contentword">Saj\xe1t t\xf5ke</td><td align="right" class="numberc">2 141 319</td><td align="right" class="numberc">2 184 079</td><td align="right" class="numberc">2 353 554</td><td align="right" class="numberc">Nincs adat.</td><td align="right" class="numberc">Nincs adat.</td></tr><tr><td class="contentword">C\xe9ltartal\xe9kok</td><td align="right" class="numberc">29 656</td><td align="right" class="numberc">148 652</td><td align="right" class="numberc">18 960</td><td align="right" class="numberc">Nincs adat.</td><td align="right" class="numberc">Nincs adat.</td></tr><tr><td class="contentword">K\xf6telezetts\xe9gek</td><td align="right" class="numberc">35 541 531</td><td align="right" class="numberc">38 059 399</td><td align="right" class="numberc">34 233 518</td><td align="right" class="numberc">Nincs adat.</td><td align="right" class="numberc">Nincs adat.</td></tr><tr><td class="contentword">R\xf6vid lej\xe1rat\xfa k\xf6telezetts\xe9gek</td><td align="right" class="numberc">30 519 491</td><td align="right" class="numberc">30 426 014</td><td align="right" class="numberc">26 394 088</td><td align="right" class="numberc">Nincs adat.</td><td align="right" class="numberc">Nincs adat.</td></tr><tr><td class="contentword">Hossz\xfa lej\xe1rat\xfa k\xf6telezetts\xe9gek</td><td align="right" class="numberc">5 022 040</td><td align="right" class="numberc">7 633 385</td><td align="right" class="numberc">7 839 430</td><td align="right" class="numberc">Nincs 
adat.</td><td align="right" class="numberc">Nincs adat.</td></tr><tr><td class="contentword">Passz\xedv id\xf5beli elhat\xe1rol\xe1sok</td><td align="right" class="numberc">108 375</td><td align="right" class="numberc">303 712</td><td align="right" class="numberc">386 059</td><td align="right" class="numberc">Nincs adat.</td><td align="right" class="numberc">Nincs adat.</td></tr><tr><td class="contentword" colspan="6"><b>P\xe9nz\xfcgyi mutat\xf3k</b></td></tr><tr><td class="contentword">Elad\xf3sodotts\xe1g foka <span onmouseout="remove_hint();" onmouseover="show_hint(this, '<span style="color: red; font-weight: bold;">Elad\xf3sodotts\xe1g foka</span> (K\xf6telezetts\xe9gek/Eszk\xf6z\xf6k \xf6sszesen)<br><i>Megmutatja, hogy az eszk\xf6z \xe1llom\xe1ny milyen m\xe9rt\xe9kben van megterhelve k\xf6telezetts\xe9gv\xe1llal\xe1ssal. Min\xe9l kisebb a mutat\xf3 \xe9rt\xe9ke, ann\xe1l jobb a c\xe9g meg\xedt\xe9l\xe9se.</i>');" style="cursor: pointer; color: red; font-family: InformationLogo, Webdings;">i</span></td><td align="right" class="numberc"></td><td align="right" class="numberc"></td><td align="right" class="numberc"></td><td align="right" class="numberc">Nincs adat.</td><td align="right" class="numberc">Nincs adat.</td></tr><tr><td class="contentword">Elad\xf3sodotts\xe1g m\xe9rt\xe9ke - Bonit\xe1s <span onmouseout="remove_hint();" onmouseover="show_hint(this, '<span style="color: red; font-weight: bold;">Elad\xf3sodotts\xe1g m\xe9rt\xe9ke - Bonit\xe1s</span> (K\xf6telezetts\xe9gek/Saj\xe1t t\xf5ke)<br><i>Azt mutatja, hogy a saj\xe1t forr\xe1sok a k\xf6telezetts\xe9gek h\xe1ny sz\xe1zal\xe9k\xe1t fedezik. 
Pozit\xedv a c\xe9g meg\xedt\xe9l\xe9se, ha a mutat\xf3 \xe9rt\xe9ke tart\xf3san (j\xf3val) 1 alatt van.</i>');" style="cursor: pointer; color: red; font-family: InformationLogo, Webdings;">i</span></td><td align="right" class="numberc"></td><td align="right" class="numberc"></td><td align="right" class="numberc"></td><td align="right" class="numberc">Nincs adat.</td><td align="right" class="numberc">Nincs adat.</td></tr><tr><td class="contentword">\xc1rbev\xe9tel ar\xe1nyos eredm\xe9ny % <span onmouseout="remove_hint();" onmouseover="show_hint(this, '<span style="color: red; font-weight: bold;">\xc1rbev\xe9tel ar\xe1nyos eredm\xe9ny %</span> (Ad\xf3zott eredm\xe9ny/ Nett\xf3 \xe1rbev\xe9tel)\xd7100<br><i>A mutat\xf3 az \xe1rbev\xe9tel hat\xe9konys\xe1g\xe1t fejezi ki \xfagy, hogy az \xe1rbev\xe9tel nyeres\xe9gtartalm\xe1t sz\xe1zal\xe9kban szeml\xe9lteti. A c\xe9g meg\xedt\xe9l\xe9se ann\xe1l pozit\xedvabb, min\xe9l magasabb a sz\xe1zal\xe9k.</i>');" style="cursor: pointer; color: red; font-family: InformationLogo, Webdings;">i</span></td><td align="right" class="numberc"></td><td align="right" class="numberc"></td><td align="right" class="numberc"></td><td align="right" class="numberc">Nincs adat.</td><td align="right" class="numberc">Nincs adat.</td></tr><tr><td class="contentword">Likvidit\xe1si gyorsr\xe1ta <span onmouseout="remove_hint();" onmouseover="show_hint(this, '<span style="color: red; font-weight: bold;">Likvidit\xe1si gyorsr\xe1ta</span> ((Forg\xf3eszk\xf6z\xf6k-K\xe9szletek)/R\xf6vid lej.k\xf6telezetts\xe9gek)<br><i>Azt fejezi ki, hogy az egy \xe9v alatt p\xe9nzz\xe9 tehet\xf5 k\xe9szletek n\xe9lk\xfcli forg\xf3eszk\xf6z\xf6k milyen ar\xe1nyban k\xe9pesek az egy \xe9ven bel\xfcl esed\xe9kes k\xf6telezetts\xe9gek fedez\xe9s\xe9re, azaz milyen a c\xe9g r\xf6vid t\xe1v\xfa fizet\xf5k\xe9pess\xe9ge.<br>A c\xe9g meg\xedt\xe9l\xe9se akkor pozit\xedv, ha ez az ar\xe1ny egyre n\xf6vekv\xf5, ami az azonnali fizet\xf5k\xe9pess\xe9g javul\xe1s\xe1t 
jelzi.</i>');" style="cursor: pointer; color: red; font-family: InformationLogo, Webdings;">i</span></td><td align="right" class="numberc"></td><td align="right" class="numberc"></td><td align="right" class="numberc"></td><td align="right" class="numberc">Nincs adat.</td><td align="right" class="numberc">Nincs adat.</td></tr><tr><td class="contentword">Saj\xe1t t\xf5ke ar\xe1nya <span onmouseout="remove_hint();" onmouseover="show_hint(this, '<span style="color: red; font-weight: bold;">Saj\xe1t t\xf5ke ar\xe1nya </span> (Saj\xe1t t\xf5ke/Forr\xe1sok)');" style="cursor: pointer; color: red; font-family: InformationLogo, Webdings;">i</span></td><td align="right" class="numberc">0,06</td><td align="right" class="numberc">0,05</td><td align="right" class="numberc">0,06</td><td align="right" class="numberc"></td><td align="right" class="numberc"></td></tr><tr><td class="contentword">Eszk\xf6zar\xe1nyos nyeres\xe9g <span onmouseout="remove_hint();" onmouseover="show_hint(this, '<span style="color: red; font-weight: bold;">Eszk\xf6zar\xe1nyos nyeres\xe9g </span> (Ad\xf3zott eredm\xe9ny/Eszk\xf6z\xf6k)');" style="cursor: pointer; color: red; font-family: InformationLogo, Webdings;">i</span></td><td align="right" class="numberc">-0,01</td><td align="right" class="numberc">0,00</td><td align="right" class="numberc">0,00</td><td align="right" class="numberc"></td><td align="right" class="numberc"></td></tr><tr><td class="contentword">Bev\xe9telar\xe1nyos eredm\xe9ny <span onmouseout="remove_hint();" onmouseover="show_hint(this, '<span style="color: red; font-weight: bold;">Bev\xe9telar\xe1nyos eredm\xe9ny </span> (Ad\xf3zott eredm\xe9ny/Bev\xe9telek)');" style="cursor: pointer; color: red; font-family: InformationLogo, Webdings;">i</span></td><td align="right" class="numberc">-0,07</td><td align="right" class="numberc">-0,05</td><td align="right" class="numberc">0,17</td><td align="right" class="numberc"></td><td align="right" class="numberc"></td></tr><tr><td 
class="contentword">Saj\xe1t t\xf5ke ar\xe1nyos nyeres\xe9g <span onmouseout="remove_hint();" onmouseover="show_hint(this, '<span style="color: red; font-weight: bold;">Saj\xe1t t\xf5ke ar\xe1nyos nyeres\xe9g </span> (Ad\xf3zott eredm\xe9ny/Saj\xe1t t\xf5ke)');" style="cursor: pointer; color: red; font-family: InformationLogo, Webdings;">i</span></td><td align="right" class="numberc">-0,09</td><td align="right" class="numberc">-0,08</td><td align="right" class="numberc">0,00</td><td align="right" class="numberc"></td><td align="right" class="numberc"></td></tr><tr><td class="contentword" colspan="6"><b>L\xe9tsz\xe1m:</b> \xa0 136 f\xf5</td>\n</tr></table>]
and I want to remove this row from the table:
<tr><td class="contentword" colspan="6"><b>P\xe9nz\xfcgyi mutat\xf3k</b></td></tr>
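One possible way to drop that row is to remove it from the parsed tree itself, before any cell text is collected. The sketch below is only an illustration under stated assumptions: the HTML is a shortened stand-in for the real table, `remove_rows_with_text` is a hypothetical helper name, and `html.parser` is used instead of `lxml` so the example has no extra dependency. It relies on `Tag.decompose()`, which removes a tag and its contents from the tree.

```python
# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup

# Shortened stand-in for the real table (assumption: same row structure).
html = u"""
<table class="tab6col" id="pm">
<tr><td class="contentword">Bev\xe9telek</td><td class="numberc">2 873 821</td></tr>
<tr><td class="contentword" colspan="6"><b>P\xe9nz\xfcgyi mutat\xf3k</b></td></tr>
<tr><td class="contentword">Saj\xe1t t\xf5ke</td><td class="numberc">2 141 319</td></tr>
</table>
"""

soup = BeautifulSoup(html, 'html.parser')
table = soup.find('table', {'class': 'tab6col'})


def remove_rows_with_text(table, label):
    """Hypothetical helper: delete every <tr> whose full text equals label."""
    removed = 0
    # find_all returns a list built up front, so removing rows while
    # looping over it is safe.
    for row in table.find_all('tr'):
        if row.get_text(strip=True) == label:
            row.decompose()  # drops the <tr> and everything inside it
            removed += 1
    return removed


removed = remove_rows_with_text(table, u'P\xe9nz\xfcgyi mutat\xf3k')

# Text of the rows that are left, cells joined with '|'.
remaining = [row.get_text(u'|', strip=True) for row in table.find_all('tr')]
```

After this runs, any later loop over `table.find_all('tr')` no longer sees the heading row. Matching on the exact label is a deliberate choice here: the table's last row also uses `colspan="6"`, so filtering on that attribute alone would remove it as well.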
My full code:
import urllib2
import unicodecsv as csv
import os
import sys
import io
import time
import datetime
import pandas as pd
from bs4 import BeautifulSoup
import MySQLdb
def to_2d(l,n):
    return [l[i:i+n] for i in range(0, len(l), n)]
filename=r'output.csv'
resultcsv=open(filename,"wb")
output=csv.writer(resultcsv, delimiter=';',quotechar = '"', quoting=csv.QUOTE_NONNUMERIC, encoding='latin-1')
f = open('opten2.txt', 'r')
x = f.read()
soup = BeautifulSoup(x, 'lxml')
tab6col = soup.find('table', { "class" : "tab6col" })
datatable=[]
for record in tab6col.findAll('tr'):
    for data in record.findAll('td'):
        datatable.append(data.text.encode('latin-1'))
td = datatable.find("td", text="P\xe9nz\xfcgyi mutat\xf3k")
td.decompose()
maindatatable = to_2d(datatable, 6)
print maindatatable
output.writerows(maindatatable)
resultcsv.close()
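A note on the code above: `datatable` is a plain Python list of cell strings, not a BeautifulSoup object, so it has no tag-aware `find` and its items have no `decompose`. A minimal sketch of one alternative is to filter the unwanted value out of the flat list before splitting it into rows. The sample cells are hypothetical and only three columns wide for brevity, and the sketch assumes cells are kept as unicode text, with any `latin-1` encoding applied later at write time, so the comparison with the label is reliable.

```python
# -*- coding: utf-8 -*-


def to_2d(cells, n):
    """Split a flat list into consecutive rows of n items."""
    return [cells[i:i + n] for i in range(0, len(cells), n)]


# Hypothetical flattened cells (assumption: unicode text, not yet encoded).
datatable = [
    u'Bev\xe9telek', u'2 873 821', u'3 162 742',
    u'P\xe9nz\xfcgyi mutat\xf3k',  # single-cell heading row to drop
    u'Saj\xe1t t\xf5ke', u'2 141 319', u'2 184 079',
]

label = u'P\xe9nz\xfcgyi mutat\xf3k'

# Keep every cell except the heading, then rebuild the rows.
cleaned = [cell for cell in datatable if cell.strip() != label]
rows = to_2d(cleaned, 3)
```

Dropping the single-cell heading first also keeps the fixed-width chunking aligned; left in place, that one extra cell would shift every later row by one column.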
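Sorry, what exactly do you want to remove? The table? – obskyr
Show a sample, or state the input and the desired output. –
I used bsoup to get the table, but I want to remove the value between these TR tags. I have updated my question and inserted the full HTML to make it easier to understand. – tardos93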