Python 3: extracting XML element text based on node attributes

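I am trying to extract various pieces of information from an XML file. I can gather some of it, but walking the element tree is difficult because several nodes share the same name and I need to extract them based on an attribute name within the child nodes.

Here is the XML snippet: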

<parent_element> 
    <test_case name="00001d" run="1"> 
     <properties> 
     <item name="ID">001029d</item> 
     <item name="Test Description">Usefull Information</item> 
     </properties> 
     <runner> 
     <iteration name="First Iter"/> 
     <iteration name="1"> 
      <inputs> 
      <input name="FirstInput">005546</input> 
      </inputs> 
      <step name="Setup"/> 
      <step name="1"> 
      <action actual="0x00 01 1E" msg="INTERESTING.TAG" type="SET"/> 
      <action actual="9" msg="TAG.LENGTH" type="SET"/> 
      <action actual="10 10 01" msg="SOMETHING.RESULT" type="SET"/> 
      <action actual="11 10 01" msg="OTHER.TYPE" type="SET"/> 
      </step> 
      <step name="ENDING"/> 
     </iteration> 
     <iteration name="TEST_END"/> 
     </runner> 
    </test_case> 
</parent_element>

And here is my code so far:

import lxml.etree as et

doc = et.parse('C:/temp/report.xml')

testTags = doc.xpath('//test_case')
actionTags = doc.xpath('//test_case/runner/iteration/step')

for test in testTags:
    propertyTags = test.xpath('properties')
    for prop in propertyTags:
        print(prop[0].text)
        print(prop[1].text)
    for action in actionTags:
        #print(action[0].text)
        #print(action[1].text)

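I can easily grab the property text ('001029d' and 'Usefull Information'), but I have commented where I am stuck in the code above.

I need to reference the 'msg=' attribute (e.g. "INTERESTING.TAG") and extract the 'actual=' attribute text (e.g. 0x00 01 1E). Or is there a much more efficient way?

Thanks for your help.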


2016-06-20 MikG

Try this:

import lxml.etree as et

doc = et.parse('C:/temp/report.xml')

testTags = doc.xpath('//test_case')

for test in testTags:
    propertyTags = test.xpath('properties')
    # Relative path (no leading '/'), so the search starts at this test_case
    actionTags = test.xpath('runner/iteration/step/action')
    for prop in propertyTags:
        print(prop[0].text)
        print(prop[1].text)
    for action in actionTags:
        msg = action.attrib.get('msg')
        actual = action.attrib.get('actual')
        if msg == "INTERESTING.TAG" or msg == "SOMETHING.RESULT":
            print(msg)
            print(actual)

Warning: I don't know Python, so apologies if the syntax isn't correct.

Ideally you would also change the XPath so that it filters actionTags down to the nodes you need, instead of checking the 'msg=' attribute inside the loop:

    # Inside the "for test in testTags:" loop; relative path, filtered by msg
    actionTags = test.xpath('runner/iteration/step/action[@msg="INTERESTING.TAG" or @msg="SOMETHING.RESULT"]')
    for prop in propertyTags:
        print(prop[0].text)
        print(prop[1].text)
    for action in actionTags:
        msg = action.attrib.get('msg')
        actual = action.attrib.get('actual')
        print(msg)
        print(actual)
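
The attribute filter can be tried without the report file. The following is only a minimal sketch: it parses a cut-down copy of the question's XML from a string, and it assumes the standard library's xml.etree.ElementTree rather than lxml. ElementTree's limited XPath subset has no `or`, so the sketch runs one `[@msg='...']` predicate per wanted value. The names SAMPLE, WANTED and find_actuals are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the XML from the question (illustrative sample data).
SAMPLE = """
<parent_element>
  <test_case name="00001d" run="1">
    <runner>
      <iteration name="1">
        <step name="1">
          <action actual="0x00 01 1E" msg="INTERESTING.TAG" type="SET"/>
          <action actual="9" msg="TAG.LENGTH" type="SET"/>
          <action actual="10 10 01" msg="SOMETHING.RESULT" type="SET"/>
          <action actual="11 10 01" msg="OTHER.TYPE" type="SET"/>
        </step>
      </iteration>
    </runner>
  </test_case>
</parent_element>
"""

# Hypothetical constant: the msg values the asker is interested in.
WANTED = ("INTERESTING.TAG", "SOMETHING.RESULT")


def find_actuals(test_case, wanted=WANTED):
    """Return (msg, actual) pairs for the wanted msg values, in `wanted` order."""
    pairs = []
    for msg in wanted:
        # Relative path: the search starts at this test_case element.
        path = "runner/iteration/step/action[@msg='{}']".format(msg)
        for action in test_case.findall(path):
            pairs.append((msg, action.get("actual")))
    return pairs


root = ET.fromstring(SAMPLE)
for test in root.findall("test_case"):
    for msg, actual in find_actuals(test):
        print(msg, actual)
```

With lxml, the same selection can stay in a single XPath expression using the `or` predicate shown in the snippet above.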


2016-06-20 14:07:50

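Hi Phil B, a step can contain 50+ action nodes from which I need to extract the 'actual=' attribute text, and furthermore the action node is not always in the same 'action[x]' position. – MikG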

Are you always looking for the same constant msg values? –

Yes, I look for both msg="INTERESTING.TAG" and msg="SOMETHING.RESULT" in each 'test_case'. In the example above, I would want '0x00 01 1E' and '10 10 01'. – MikG
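
Given that requirement (both values for every test_case, with action nodes at no fixed position), one option is to collect the results into a dictionary per test case. This is a rough sketch under a few assumptions: it uses the standard library's xml.etree.ElementTree instead of lxml, works on a cut-down copy of the question's XML, and the names SAMPLE, WANTED and summarize are invented for the example.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the XML from the question (illustrative sample data).
SAMPLE = """
<parent_element>
  <test_case name="00001d" run="1">
    <properties>
      <item name="ID">001029d</item>
      <item name="Test Description">Usefull Information</item>
    </properties>
    <runner>
      <iteration name="First Iter"/>
      <iteration name="1">
        <step name="Setup"/>
        <step name="1">
          <action actual="0x00 01 1E" msg="INTERESTING.TAG" type="SET"/>
          <action actual="9" msg="TAG.LENGTH" type="SET"/>
          <action actual="10 10 01" msg="SOMETHING.RESULT" type="SET"/>
          <action actual="11 10 01" msg="OTHER.TYPE" type="SET"/>
        </step>
      </iteration>
    </runner>
  </test_case>
</parent_element>
"""

# Hypothetical constant: the msg values wanted for every test_case.
WANTED = {"INTERESTING.TAG", "SOMETHING.RESULT"}


def summarize(root, wanted=WANTED):
    """Map each test_case name to its ID and its wanted msg -> actual values."""
    report = {}
    for test in root.iter("test_case"):
        # Look the item up by its name attribute rather than by position.
        test_id = test.findtext("properties/item[@name='ID']")
        actuals = {}
        # iter() walks every descendant, so the position of an action
        # inside its step (or iteration) does not matter.
        for action in test.iter("action"):
            msg = action.get("msg")
            if msg in wanted:
                actuals[msg] = action.get("actual")
        report[test.get("name")] = {"id": test_id, "actuals": actuals}
    return report


report = summarize(ET.fromstring(SAMPLE))
for name, info in report.items():
    print(name, info["id"], info["actuals"])
```

Note that if the same msg value occurs more than once within a test case, this sketch keeps only the last 'actual' it sees; storing a list per msg would preserve all of them.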
