Python: parse XML with unusual tag names (atom:link)

I am trying to parse the href from the XML below. There are multiple workspace tags; I am showing just one.

<workspaces> 
    <workspace> 
    <name>practice</name> 
    <atom:link xmlns:atom="http://www.w3.org/2005/Atom" rel="alternate" href="https://www.my-geoserver.com/geoserver/rest/workspaces/practice.xml" type="application/xml"/> 
    </workspace> 
</workspaces>

The above comes from a requests.get call using the requests library:

import requests

myUrl = 'https://www.my-geoserver.com/geoserver/rest/workspaces'
headers = {'Accept': 'text/xml'}
resp = requests.get(myUrl, auth=('admin', 'password'), headers=headers)

If I search for 'workspace', I get the objects returned:

lst = tree.findall('workspace') 
print(lst)

This results in:

[<Element 'workspace' at 0x039E70F0>, <Element 'workspace' at 0x039E71B0>, <Element 'workspace' at 0x039E7240>]

That is fine, but how do I get the href text out as a string? I have tried:

lst = tree.findall('atom') 
lst = tree.findall('atom:link') 
lst = tree.findall('workspace/atom:link')

But none of them isolate the tag; in fact, the last one raises an error:

SyntaxError: prefix 'atom' not found in prefix map

How can I get all instances of href with these tag names?


2017-06-20 Single Entity

The part before the colon (in this case atom) is known as a namespace prefix, and it is what causes the problem here. The solution is fairly simple:

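import requests
from bs4 import BeautifulSoup

myUrl = 'https://www.my-geoserver.com/geoserver/rest/workspaces'
headers = {'Accept': 'text/xml'}
resp = requests.get(myUrl, auth=('admin', 'my_password'), headers=headers)
stuff = resp.text
# The "xml" parser (requires lxml) keeps the namespace prefix in the tag name
to_parse = BeautifulSoup(stuff, "xml")

for item in to_parse.find_all("atom:link"):
    # item["href"] gives just the URL
    print(item)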

Thank you for pointing me toward the BeautifulSoup library. The key was passing "xml" as the argument to the BeautifulSoup function. With "lxml", the namespace is ignored instead of being parsed properly.
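For comparison, the SyntaxError from the question ("prefix 'atom' not found in prefix map") can probably also be avoided without BeautifulSoup, by handing ElementTree a prefix map. This is only a minimal sketch, assuming the response body looks like the sample XML in the question; SAMPLE_XML, ATOM_NS and workspace_hrefs are illustrative names that do not appear in the original posts.

```python
import xml.etree.ElementTree as ET

# Sample document copied from the question (a single workspace shown).
SAMPLE_XML = """<workspaces>
    <workspace>
    <name>practice</name>
    <atom:link xmlns:atom="http://www.w3.org/2005/Atom" rel="alternate" href="https://www.my-geoserver.com/geoserver/rest/workspaces/practice.xml" type="application/xml"/>
    </workspace>
</workspaces>"""

# Prefix map: tells ElementTree which namespace URI the "atom" prefix stands for.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def workspace_hrefs(xml_text):
    """Return the href of every atom:link found directly under a workspace."""
    root = ET.fromstring(xml_text)
    # Passing the prefix map as the second argument resolves "atom:".
    links = root.findall("workspace/atom:link", ATOM_NS)
    return [link.get("href") for link in links]


if __name__ == "__main__":
    for href in workspace_hrefs(SAMPLE_XML):
        print(href)
```

With a live response, workspace_hrefs(resp.text) would be the equivalent call, assuming the server returns the same structure.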


2017-06-20 19:14:18

A simple solution, for others who find this question:

>>> y=BeautifulSoup(x) 
>>> y 
<workspaces> 
<workspace> 
<name>practice</name> 
<atom:link xmlns:atom="http://www.w3.org/2005/Atom" rel="alternate" href="https://www.my-geoserver.com/geoserver/rest/workspaces/practice.xml" type="application/xml"> 
</atom:link></workspace> 
</workspaces> 
>>> c = y.workspaces.workspace.findAll("atom:link") 
>>> c 
[<atom:link xmlns:atom="http://www.w3.org/2005/Atom" rel="alternate" href="https://www.my-geoserver.com/geoserver/rest/workspaces/practice.xml" type="application/xml"> 
</atom:link>] 
>>>


2017-06-20 18:20:29

I got a plain [] as output. It may have something to do with the format of resp.text, but as far as I can tell it is text. Using y.workspaces.findAll("workspace") works, but that is not what I am after.
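When a lookup by prefixed name comes back empty like this, one option that does not depend on which BeautifulSoup parser is installed is to match on the fully qualified tag name with the standard library. The sketch below is an assumption-laden illustration, not the method used in the posts above: the second workspace ("demo") is invented for the example, and TWO_WORKSPACES, ATOM_LINK and hrefs_by_workspace are hypothetical names.

```python
import xml.etree.ElementTree as ET

# ElementTree stores namespaced tags as "{namespace-uri}localname".
ATOM_LINK = "{http://www.w3.org/2005/Atom}link"

# Illustrative input: "practice" is from the question, "demo" is made up.
TWO_WORKSPACES = """<workspaces>
  <workspace>
    <name>practice</name>
    <atom:link xmlns:atom="http://www.w3.org/2005/Atom" rel="alternate" href="https://www.my-geoserver.com/geoserver/rest/workspaces/practice.xml" type="application/xml"/>
  </workspace>
  <workspace>
    <name>demo</name>
    <atom:link xmlns:atom="http://www.w3.org/2005/Atom" rel="alternate" href="https://www.my-geoserver.com/geoserver/rest/workspaces/demo.xml" type="application/xml"/>
  </workspace>
</workspaces>"""


def hrefs_by_workspace(xml_text):
    """Map each workspace name to the href of its atom:link child."""
    root = ET.fromstring(xml_text)
    result = {}
    for workspace in root.iter("workspace"):
        name = workspace.findtext("name")
        link = workspace.find(ATOM_LINK)
        if link is not None:  # skip workspaces that have no link element
            result[name] = link.get("href")
    return result


if __name__ == "__main__":
    for name, href in hrefs_by_workspace(TWO_WORKSPACES).items():
        print(name, href)
```

Matching on the full "{uri}link" name avoids the prefix entirely, so it keeps working even if the server were to declare the Atom namespace under a different prefix.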
