Java XML JDOM2 XPath - Read text value from XML attribute and element using XPath expression

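The program has to be able to read from XML files using XPath expressions. I have already started the project with JDOM2, and switching to another API is not desirable. The difficulty is that the program does not know in advance whether it has to read an element or an attribute. Does the API provide a function that returns the content (as a string) when given just the XPath expression? From what I know about XPath in JDOM2, it uses objects of different types to evaluate XPath expressions that point to attributes or elements. I am only interested in the content of the attribute/element that the XPath expression points to.

Here is an example XML file: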

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
    <book category="COOKING">
        <title lang="en">Everyday Italian</title>
        <author>Giada De Laurentiis</author>
        <year>2005</year>
        <price>30.00</price>
    </book>
    <book category="CHILDREN">
        <title lang="en">Harry Potter</title>
        <author>J K. Rowling</author>
        <year>2005</year>
        <price>29.99</price>
    </book>
    <book category="WEB">
        <title lang="en">XQuery Kick Start</title>
        <author>James McGovern</author>
        <author>Per Bothner</author>
        <author>Kurt Cagle</author>
        <author>James Linn</author>
        <author>Vaidyanathan Nagarajan</author>
        <year>2003</year>
        <price>49.99</price>
    </book>
    <book category="WEB">
        <title lang="en">Learning XML</title>
        <author>Erik T. Ray</author>
        <year>2003</year>
        <price>39.95</price>
    </book>
</bookstore>

This is what my program looks like:

package exampleprojectgroup; 

import java.io.IOException; 
import java.util.LinkedList; 
import java.util.List; 
import org.jdom2.Attribute; 
import org.jdom2.Document; 
import org.jdom2.Element; 
import org.jdom2.JDOMException; 
import org.jdom2.filter.Filters; 
import org.jdom2.input.SAXBuilder; 
import org.jdom2.input.sax.XMLReaders; 
import org.jdom2.xpath.XPathExpression; 
import org.jdom2.xpath.XPathFactory; 


public class ElementAttribute2String 
{ 
    ElementAttribute2String() 
    { 
     run(); 
    } 

    public void run() 
    { 
     final String PATH_TO_FILE = "c:\\readme.xml"; 
     /* It is essential that the program has to work with a variable amount of XPath expressions. */ 
     LinkedList<String> xPathExpressions = new LinkedList<>(); 
     /* Simulate user input. 
     * First XPath expression points to attribute, 
     * second one points to element. 
     * Many more expressions follow in a real situation. 
     */ 
     xPathExpressions.add("/bookstore/book/@category"); 
     xPathExpressions.add("/bookstore/book/price"); 

     /* One list should be sufficient to store the result. */ 
     List<Element> elementsResult = null; 
     List<Attribute> attributesResult = null; 
     List<Object> objectsResult = null; 
     try 
     { 
      SAXBuilder saxBuilder = new SAXBuilder(XMLReaders.NONVALIDATING); 
      Document document = saxBuilder.build(PATH_TO_FILE); 
      XPathFactory xPathFactory = XPathFactory.instance(); 
      int i = 0; 
      for (String string : xPathExpressions) 
      { 
       /* Works only for elements, uncomment to give it a try. */ 
//    XPathExpression<Element> xPathToElement = xPathFactory.compile(xPathExpressions.get(i), Filters.element()); 
//    elementsResult = xPathToElement.evaluate(document); 
//    for (Element element : elementsResult) 
//    { 
//     System.out.println("Content of " + string + ": " + element.getText()); 
//    } 

       /* Works only for attributes, uncomment to give it a try. */ 
//    XPathExpression<Attribute> xPathToAttribute = xPathFactory.compile(xPathExpressions.get(i), Filters.attribute()); 
//    attributesResult = xPathToAttribute.evaluate(document); 
//    for (Attribute attribute : attributesResult) 
//    { 
//     System.out.println("Content of " + string + ": " + attribute.getValue()); 
//    } 

       /* I want to receive the content of the XPath expression as a string 
       * without having to know if it is an attribute or element beforehand. 
       */ 
       XPathExpression<Object> xPathExpression = xPathFactory.compile(xPathExpressions.get(i)); 
       objectsResult = xPathExpression.evaluate(document); 
       for (Object object : objectsResult) 
       { 
        if (object instanceof Attribute) 
        { 
         System.out.println("Content of " + string + ": " + ((Attribute)object).getValue()); 
        } 
        else if (object instanceof Element) 
        { 
         System.out.println("Content of " + string + ": " + ((Element)object).getText()); 
        } 
       } 
       i++; 
      } 
     } 
     catch (IOException ioException) 
     { 
      ioException.printStackTrace(); 
     } 
     catch (JDOMException jdomException) 
     { 
      jdomException.printStackTrace(); 
     } 
    } 
}

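Another idea is to search the XPath expression for the '@' character to determine whether it points to an attribute or an element. This gives me the desired result, but I was hoping for a more elegant solution. Does the JDOM2 API provide anything useful for this problem? Could the code be redesigned to meet my requirements?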

Thank you!

Source

2016-10-20 Stefan

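XPath expressions are hard to type/cast because they need to be compiled in a system that is sensitive to the return type of the XPath functions/values in the expression. JDOM relies on third-party code to do that, and that third-party code has no mechanism to correlate those types at the time your JDOM code is compiled. Note that XPath expressions can return a number of different kinds of content, including String, boolean, Number, and Node-List-like content.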
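To make the point about result types concrete, here is a minimal sketch. It deliberately uses the JDK's built-in javax.xml.xpath API instead of JDOM2, purely so that it runs without extra libraries; the class name XPathResultTypes, the helper eval, and the trimmed-down XML string are my own assumptions for illustration, not part of the question or of JDOM2.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathResultTypes {
    // Trimmed-down version of the bookstore document (hypothetical sample data).
    static final String XML =
          "<bookstore>"
        + "<book category=\"COOKING\"><price>30.00</price></book>"
        + "<book category=\"WEB\"><price>39.95</price></book>"
        + "</bookstore>";

    // Evaluates an expression and asks the engine for one specific result type.
    static Object eval(String expression, QName resultType) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(XML)), resultType);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // The same engine hands back four different Java types.
        NodeList categories =
                (NodeList) eval("/bookstore/book/@category", XPathConstants.NODESET);
        System.out.println("node-set size: " + categories.getLength());
        System.out.println("number:  " + eval("count(/bookstore/book)", XPathConstants.NUMBER));
        System.out.println("boolean: " + eval("/bookstore/book/price < 35", XPathConstants.BOOLEAN));
        System.out.println("string:  " + eval("string(/bookstore/book/price)", XPathConstants.STRING));
    }
}
```

With this API the caller has to name the expected type up front, which is the same "you need to know what comes back" problem seen from another angle.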

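In most cases, the return type of the XPath expression is known before the expression is evaluated, and the programmer has the "right" casts/expectations for processing the results.

In your case, you don't, and the expression is more dynamic.

I recommend that you declare a helper function to process the content: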

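// Requires: import org.jdom2.Attribute; import org.jdom2.Content;
private static String extractValue(Object source) {
    if (source instanceof Attribute) {
        // Attribute is not a Content subclass, so it needs its own branch.
        return ((Attribute) source).getValue();
    }
    if (source instanceof Content) {
        // Covers Element, Text, CDATA and the other Content subclasses.
        return ((Content) source).getValue();
    }
    // Strings, numbers and booleans returned by XPath functions.
    return String.valueOf(source);
}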

This will at least neaten up your code, and if you use Java 8 streams, it can be quite compact:

List<String> values = xPathExpression.evaluate(document) 
         .stream() 
         .map(o -> extractValue(o)) 
         .collect(Collectors.toList());

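Note that, according to the XPath specification, the string-value of an element node is the concatenation of the element's own text() content and the content of all its child elements. Thus, in the following XML snippet: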

<a>bilbo <b>samwise</b> frodo</a>

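calling getValue() on the a element returns "bilbo samwise frodo", whereas getText() returns only the text held directly by the element, "bilbo  frodo" (the space before and the space after the b element are both kept). Choose the mechanism you use for value extraction carefully.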
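The difference is easy to reproduce without JDOM2. The sketch below uses the JDK's DOM API as a stand-in: getTextContent() plays the role of the XPath string-value, and a hand-written loop over the direct text children mimics the "own text only" behaviour described for getText(). The class and method names (StringValueDemo, deepText, directText) are assumptions made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class StringValueDemo {
    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Text of the element plus all of its descendants (the XPath string-value).
    static String deepText(Element element) {
        return element.getTextContent();
    }

    // Text held directly by the element; text inside child elements is skipped.
    static String directText(Element element) {
        StringBuilder text = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE
                    || child.getNodeType() == Node.CDATA_SECTION_NODE) {
                text.append(child.getNodeValue());
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        Element a = parseRoot("<a>bilbo <b>samwise</b> frodo</a>");
        System.out.println("[" + deepText(a) + "]");   // [bilbo samwise frodo]
        System.out.println("[" + directText(a) + "]"); // [bilbo  frodo]
    }
}
```

Note the double space in the second result: dropping the b element does not drop the whitespace around it.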

Source

2016-10-20 13:25:32 rolfl

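Is JDOM2's 'Attribute' a subclass of 'Content'? According to http://www.jdom.org/docs/apidocs/org/jdom2/Attribute.html it is not, so I doubt that the 'XPathExpression xPathExpression = xPathFactory.compile(xPathExpressions.get(i), Filters.content());' in your answer handles both elements and attributes.

Ah... ugh. I forgot that Attribute is not Content. It has a 'getValue()' method, and I just assumed. Let me think about this for a bit. – rolfl

I can't think of a better way than inspecting the ambiguous XPath results. If Element and Attribute nodes shared a common ancestor, JDOM could make this somewhat easier, but there are other reasons why that is not feasible. I have edited the answer to recommend a function extraction to neaten the code, rather than changing the basic mechanism described in the OP. – rolfl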

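I had exactly the same problem and took the approach of recognizing when an attribute is the focus of the XPath. I solved it with two functions. The first compiles the XPathExpression for later use: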

// xpfac (XPathFactory), xpath (String) and myNSpace (Namespace) are defined elsewhere.
XPathExpression<?> xpExpression;
if (xpath.matches(".*/@[\\w]++$")) {
    // The last step is an attribute, so an attribute value is what we're after.
    xpExpression = xpfac.compile(xpath, Filters.attribute(), null, myNSpace);
} else {
    xpExpression = xpfac.compile(xpath, Filters.element(), null, myNSpace);
}

The second evaluates it and returns the value:

Object target = xpExpression.evaluateFirst(baseEl);
if (target != null) {
    String value = null;
    if (target instanceof Element) {
        // Element: take its own text with whitespace normalized.
        Element targetEl = (Element) target;
        value = targetEl.getTextNormalize();
    } else if (target instanceof Attribute) {
        Attribute targetAt = (Attribute) target;
        value = targetAt.getValue();
    }
    // value now holds the text, or is still null for any other node type.
}

I suspect it is a matter of coding style whether you prefer the helper function suggested in the previous answer or this approach. Either works.
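One thing worth checking before relying on the pattern-based approach is which expressions the regular expression actually classifies as attribute paths. The following is a small, JDK-only sketch that runs the same pattern against a few sample expressions; the class name AttributePathCheck, the method looksLikeAttribute and the sample paths are assumptions made up for this illustration.

```java
import java.util.regex.Pattern;

public class AttributePathCheck {
    // Same pattern as in the snippet above: the final step must be
    // "/@" followed only by word characters.
    private static final Pattern ATTRIBUTE_PATH = Pattern.compile(".*/@[\\w]++$");

    static boolean looksLikeAttribute(String xpath) {
        return ATTRIBUTE_PATH.matcher(xpath).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "/bookstore/book/@category",               // plain attribute step
            "/bookstore/book/price",                   // element step
            "/bookstore/book[@category='WEB']/price",  // '@' only inside a predicate
            "/bookstore/book/title/@xml:lang",         // prefixed attribute name
            "/bookstore/book/@*"                       // attribute wildcard
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + looksLikeAttribute(sample));
        }
    }
}
```

The first three samples are classified as one would hope (true, false, false). The last two also come out false, because ':' and '*' are not word characters, so under the scheme above they would be compiled with the element filter even though they select attributes. If such expressions can occur in your input, the pattern needs widening, or the instanceof-based helper may be the safer choice.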

Source

2017-01-18 21:55:37
