Parsing an HTTP XML response with regex in Java

I am making an API call and need to pull specific data out of the response. I need the document ID for the entry whose Description is "Invoice", which in the response below is 110107. I have already written a method that gets the data from a single tag:

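public synchronized String getTagFromHTTPResponseAsString(String tag, String body) throws IOException {

    // Quote the tag name so any regex metacharacters in it are matched literally.
    final String quotedTag = Pattern.quote(tag);
    final Pattern pattern = Pattern.compile("<" + quotedTag + ">(.+?)</" + quotedTag + ">");
    final Matcher matcher = pattern.matcher(body);

    // find() returns false when the tag is absent; calling group(1) after a
    // failed find() would throw IllegalStateException, so return null instead.
    return matcher.find() ? matcher.group(1) : null;

} // end getTagFromHTTPResponseAsString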

My problem, however, is that this result set contains multiple fields with the same tag, and I need one specific field. Here is the response:

<?xml version="1.0" encoding="utf-8"?> 
<Order TrackingID="351535" TrackingNumber="TEST-843245" xmlns=""> 
    <ErrorMessage /> 
    <StatusDocuments> 
    <StatusDocument NUM="1"> 
     <DocumentDate>7/14/2017 6:52:00 AM</DocumentDate> 
     <FileName>4215.pdf</FileName> 
     <Type>Sales Contract</Type> 
     <Description>Uploaded Document</Description> 
     <DocumentID>110098</DocumentID> 
     <DocumentPlaceHolder /> 
    </StatusDocument> 
    <StatusDocument NUM="2"> 
     <DocumentDate>7/14/2017 6:52:00 AM</DocumentDate> 
     <FileName>Apex_Shortcuts.pdf</FileName> 
     <Type>Other</Type> 
     <Description>Uploaded Document</Description> 
     <DocumentID>110100</DocumentID> 
     <DocumentPlaceHolder /> 
    </StatusDocument> 
    <StatusDocument NUM="3"> 
     <DocumentDate>7/14/2017 6:52:00 AM</DocumentDate> 
     <FileName>CRAddend.pdf</FileName> 
     <Type>Other</Type> 
     <Description>Uploaded Document</Description> 
     <DocumentID>110104</DocumentID> 
     <DocumentPlaceHolder /> 
    </StatusDocument> 
    <StatusDocument NUM="4"> 
     <DocumentDate>7/14/2017 6:52:00 AM</DocumentDate> 
     <FileName>test.pdf</FileName> 
     <Type>Other</Type> 
     <Description>Uploaded Document</Description> 
     <DocumentID>110102</DocumentID> 
     <DocumentPlaceHolder /> 
    </StatusDocument> 
    <StatusDocument NUM="5"> 
     <DocumentDate>7/14/2017 6:55:00 AM</DocumentDate> 
     <FileName>Invoice.pdf</FileName> 
     <Type>Invoice</Type> 
     <Description>Invoice</Description> 
     <DocumentID>110107</DocumentID> 
     <DocumentPlaceHolder /> 
    </StatusDocument> 
    </StatusDocuments> 
</Order>

I built and tested my regex on https://regex101.com/ and got it working there, but I cannot get it to translate correctly into my Java code.

<Description>Invoice<\/Description> 
     <DocumentID>(.*?)<\/DocumentID>
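One possible reason the pattern above fails in Java is that the literal line break and indentation between the two tags must match the response character for character, and the sample response appears to have trailing spaces before each line break. The following is a minimal sketch, assuming the response body is already held in a String. The class name `InvoiceRegexSketch` and the method `findInvoiceDocumentId` are hypothetical names used only for illustration. It replaces the literal whitespace with `\\s*`:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InvoiceRegexSketch {

    // \s* tolerates any whitespace (spaces, tabs, line breaks) between the two tags;
    // (.*?) is reluctant, so it stops at the first closing </DocumentID>.
    private static final Pattern INVOICE_ID = Pattern.compile(
            "<Description>Invoice</Description>\\s*<DocumentID>(.*?)</DocumentID>");

    // Hypothetical helper: returns the DocumentID that follows the Invoice description, if any.
    public static Optional<String> findInvoiceDocumentId(String body) {
        Matcher matcher = INVOICE_ID.matcher(body);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String body = "<StatusDocument NUM=\"5\">\n"
                + "     <Description>Invoice</Description> \n"
                + "     <DocumentID>110107</DocumentID> \n"
                + "</StatusDocument>";
        System.out.println(findInvoiceDocumentId(body).orElse("not found"));
    }
}
```

Note that `/` has no special meaning in Java regular expressions, so the `\/` escapes that regex101 requires can be dropped.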

Source

2017-07-14 Dustin N.

Don't parse XML with regex. Use an XML parser. – jsheeran

Regex is for string matching, not for XML parsing. I would recommend using one of the many XML parsing libraries. Also, in my experience, regex can be cumbersome to use and maintain. – MartinByers
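As one way to follow the advice in these comments without adding a dependency, here is a minimal sketch using the XPath API that ships with the JDK (`javax.xml.xpath`). The class name `InvoiceXPathSketch` and the method `findInvoiceDocumentId` are hypothetical, and the XML in `main` is a shortened version of the response from the question. The expression selects the `DocumentID` of the `StatusDocument` whose `Description` is exactly `Invoice`:

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class InvoiceXPathSketch {

    // Hypothetical helper: evaluates an XPath expression directly against the XML text.
    // Returns an empty string when no StatusDocument has the Description "Invoice".
    public static String findInvoiceDocumentId(String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        String expression = "//StatusDocument[Description='Invoice']/DocumentID";
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml))).trim();
        } catch (XPathExpressionException e) {
            // Raised for malformed XML or an invalid expression.
            throw new IllegalArgumentException("Could not evaluate XPath on response", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
                + "<Order TrackingID=\"351535\" xmlns=\"\"><StatusDocuments>"
                + "<StatusDocument NUM=\"1\"><Description>Uploaded Document</Description>"
                + "<DocumentID>110098</DocumentID></StatusDocument>"
                + "<StatusDocument NUM=\"5\"><Description>Invoice</Description>"
                + "<DocumentID>110107</DocumentID></StatusDocument>"
                + "</StatusDocuments></Order>";
        System.out.println(findInvoiceDocumentId(xml));
    }
}
```

Because the parser works on the document structure, whitespace and line breaks between the tags no longer matter.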

Try it with Jsoup. Example:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class InvoiceDocumentId {
    public static void main(String[] args) throws Exception {
        String xml = "yourXML"; // replace with the XML response body
        Document doc = Jsoup.parse(xml);
        // Select every <StatusDocument> element in the response.
        Elements statusDocuments = doc.select("StatusDocument");
        for (Element e : statusDocuments) {
            // Print the DocumentID of the entry whose Description is exactly "Invoice".
            if (e.select("Description").text().equals("Invoice")) {
                System.out.println(e.select("DocumentID").text());
            }
        }
    }
}

Source

2017-07-14 12:55:06 Eritrean

This is probably not the best way to go about it:

// Create the pattern and matcher. \\s* allows whitespace between the two tags,
// and the reluctant (.*?) stops at the first closing </DocumentID>.
Pattern p = Pattern.compile("<Description>Invoice</Description>\\s*<DocumentID>(.*?)</DocumentID>");
Matcher m = p.matcher(responseText);

// If an occurrence of the pattern was found in the given string...
if (m.find()) {
    // ...then the group() methods can be used.
    System.out.println("group0 = " + m.group(0)); // whole matched expression
    System.out.println("group1 = " + m.group(1)); // first capturing group: the document ID

    // Set the documentID for the Invoice (group(1) is only valid after a successful find())
    documentID = m.group(1);
}

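What I did to solve this was use a StringBuilder to convert the response into a single string and then use the code above to get the document ID. It is working for now. I will come back and try to clean this up with a more precise solution based on the suggestions given here.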
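The StringBuilder step is not shown in the post, so the following is only a sketch of what it might look like. It assumes the response is available as a `java.io.Reader` and that each line is trimmed before being appended, so that adjacent tags end up with no whitespace between them. The class name `ResponseBodySketch` and the method `readBody` are hypothetical:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ResponseBodySketch {

    // Hypothetical helper: collapses a multi-line response into a single string.
    // Each line is trimmed, so indentation and trailing spaces are dropped.
    public static String readBody(Reader source) {
        StringBuilder body = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line.trim());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Reader response = new StringReader(
                "     <Description>Invoice</Description> \n"
              + "     <DocumentID>110107</DocumentID> \n");
        System.out.println(readBody(response));
    }
}
```

Trimming like this also removes leading and trailing whitespace inside multi-line text content, which is one more reason a parser-based approach tends to be safer.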

Source

2017-07-14 13:15:03

@Eritrean's answer works great and is clean. I have implemented that solution. –
