How do I change a file extension?

I'm trying to scrape an '.xlsx' file from the Tax Foundation website. Unfortunately, I get this error message: Excel cannot open the file '2017-FF-For-Website-7-10-2017.xlsx' because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file. I did some research, and what I found says the fix is to change the file extension to '.xls' instead of '.xlsx'. Can anyone help? How do I change the file extension?

from bs4 import BeautifulSoup 
import urllib.request 
import os 

url = urllib.request.urlopen("https://taxfoundation.org/facts-figures-2017/") 

soup = BeautifulSoup(url, from_encoding=url.info().get_param('charset')) 

FHFA = os.chdir('C:/US_Census/Directory') 

seen = set() 
for link in soup.find_all('a', href=True):
    href = link.get('href')
    if not any(href.endswith(x) for x in ['.xlsx']):
        continue

    file = href.split('/')[-1]
    filename = file.rsplit('.', 1)[0]
    if filename not in seen:  # only retrieve file if it has not been seen before
        seen.add(filename)  # add the file to the set
        url = urllib.request.urlretrieve('https://taxfoundation.org/' + href, file)
    print(filename)

print(' ') 
print("All files successfully downloaded.")

P.S. I know the file can be downloaded by hand, but I'm scraping the web to automate a specific process.
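Before renaming anything, it may help to check what was actually saved to disk. An .xlsx file is a ZIP container, so a file that is not a ZIP archive cannot be a valid workbook whatever its extension says. The sketch below is only a rough diagnostic: looks_like_xlsx is a hypothetical helper name, the two demo files are made up for illustration, and passing the check is necessary but not sufficient for a file to be a real workbook.

```python
import os
import tempfile
import zipfile


def looks_like_xlsx(path):
    """Return True if the file at `path` is a ZIP archive, as every real .xlsx is."""
    return zipfile.is_zipfile(path)


# Demo with two hypothetical files written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    # An HTML page saved under an .xlsx name, as happens when the wrong URL is fetched.
    fake = os.path.join(tmp, "fake.xlsx")
    with open(fake, "wb") as f:
        f.write(b"<!DOCTYPE html><html><body>Not found</body></html>")
    fake_is_zip = looks_like_xlsx(fake)

    # A minimal ZIP archive standing in for a real workbook.
    real = os.path.join(tmp, "real.xlsx")
    with zipfile.ZipFile(real, "w") as z:
        z.writestr("[Content_Types].xml", "<Types/>")
    real_is_zip = looks_like_xlsx(real)

print(fake_is_zip, real_is_zip)
```

If the check fails on the downloaded file, changing the extension is unlikely to help, and the download step itself is the place to look.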

Asked 2017-08-04 by bhammer

Which version of Python are you using? – TheDetective

In the statement if not any(href.endswith(x) for x in ['.xlsx']), the loop runs over a single item, so it is equivalent to checking href.endswith('.xlsx') once. Basically it can be simplified to if not href.endswith('.xlsx'), which is easier to read. – Vinny
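The equivalence described in this comment can be checked with a short sketch. The href value is the link quoted elsewhere in this thread. The tuple form on the last check is an addition not mentioned in the comment, shown in case more than one extension ever needs to be matched.

```python
href = "https://files.taxfoundation.org/20170710170238/2017-FF-For-Website-7-10-2017.xlsx"

# Original form: any() over a one-element list.
original = not any(href.endswith(x) for x in ['.xlsx'])

# Simplified form suggested in the comment.
simplified = not href.endswith('.xlsx')

# str.endswith also accepts a tuple of suffixes.
several = not href.endswith(('.xlsx', '.xls'))

print(original == simplified, several)
```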

I'm using Python 3.6 @TheDetective – bhammer

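The problem was in the line url = urllib.request.urlretrieve('https://taxfoundation.org/' + href, file). If you go to the website and hover over the Excel download button, you will see a much longer link, https://files.taxfoundation.org/20170710170238/2017-FF-For-Website-7-10-2017.xlsx (notice the 2017....238?). The href is already a complete URL on a different host, so prepending 'https://taxfoundation.org/' produced an address that does not point at the file, and you were never actually downloading the Excel file. Here is the correct line:

url = urllib.request.urlretrieve(href, file)

Everything else was working fine.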
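If a page mixes absolute and relative links, one option is to let urllib.parse.urljoin build the download URL instead of concatenating strings. This is a minimal sketch rather than part of the fix above: resolve_download_url is a hypothetical helper name, and the relative path in the second call is invented for illustration.

```python
from urllib.parse import urljoin

PAGE_URL = "https://taxfoundation.org/facts-figures-2017/"


def resolve_download_url(href, page_url=PAGE_URL):
    """Return an absolute URL for `href`, whether it is relative or already absolute."""
    return urljoin(page_url, href)


# The absolute link from the page is returned unchanged.
absolute = "https://files.taxfoundation.org/20170710170238/2017-FF-For-Website-7-10-2017.xlsx"
print(resolve_download_url(absolute))

# A hypothetical relative link is resolved against the page URL.
print(resolve_download_url("/files/example.xlsx"))
```

The result could then be passed to urllib.request.urlretrieve in place of the bare href.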

Answered 2017-08-04 13:53:06 by TheDetective

Great, thank you! – bhammer

You're welcome! I also used Vinny's suggestion, and it still worked. – TheDetective
