2016-11-23 12 views

Jsoup: join some nodes and wrap them in an element

I am new to Jsoup. I want to join loose inline nodes and wrap them in an element, so that the result looks like this:

<div>
    <p>text that <strong>need</strong> to be <strong>wrapped</strong></p>
    <p>a text that has to be ignored</p>
    <p>another text that <strong>need</strong> to be <strong>wrapped</strong></p>
</div>

The input I am starting from is:

<div>
    text that <strong>need</strong> to be <strong>wrapped</strong>
    <p>a text that has to be ignored</p>
    another text that <strong>need</strong> to be <strong>wrapped</strong>
</div>

I need to wrap all text that is not inside a <p> in a <p>. I have been trying something like this:

Document doc = Jsoup.parse(html);
doc.body().traverse(new NodeVisitor() {
    @Override
    public void head(Node node, int depth) {
        if (node instanceof TextNode && Arrays.asList("div", "body").contains(node.parentNode().nodeName())) {
            Node auxNode = node;
            node.replaceWith(pNode);
            node.childNodes();

            while (auxNode.nextSibling() != null && Arrays.asList("em", "strong").contains(auxNode.nextSibling().nodeName())) {
                node.after(auxNode);
                auxNode.remove();
                auxNode = node.nextSibling();
            }
            node.wrap("<p></p>");
        }
    }

    @Override
    public void tail(Node node, int depth) { }
});

However, all I get is a NullPointerException in the while condition.

Thanks in advance, everyone.

java.lang.NullPointerException 
    at HTMLToArticleParser$1.head(HTMLToArticleParser.java:52) 
    at org.jsoup.select.NodeTraversor.traverse(NodeTraversor.java:31) 
    at org.jsoup.nodes.Node.traverse(Node.java:536) 
    at HTMLToArticleParser.parse(HTMLToArticleParser.java:47) 
    at HTMLToArticleParser_Tests.jTest(HTMLToArticleParser_Tests.java:188) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:498) 
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) 
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) 
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) 
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) 
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) 
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) 
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) 
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) 
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) 
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) 
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) 
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) 
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309) 
    at org.junit.runner.JUnitCore.run(JUnitCore.java:160) 
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:117) 
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:42) 
    at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:262) 
    at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:84) 
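
Judging only from the snippet as posted, one plausible cause is that the loop reassigns `auxNode` from `node.nextSibling()`, which can be null, and the condition then calls `auxNode.nextSibling()` without checking `auxNode` itself. The sketch below shows a null-safe sibling walk. It is an illustration under assumptions: it uses the JDK's built-in `org.w3c.dom` API as a stand-in for jsoup so that it runs without extra dependencies, and the class and method names (`SiblingWalk`, `followingInlineNames`, `firstChildOf`) are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class SiblingWalk {
    // Assumption: the same inline tags the question's loop looks for.
    private static final List<String> INLINE_TAGS = Arrays.asList("em", "strong");

    /** Names of the inline element siblings that directly follow start. */
    public static List<String> followingInlineNames(Node start) {
        List<String> names = new ArrayList<>();
        // The cursor itself is null-checked before any method is called on it.
        for (Node cursor = start.getNextSibling();
             cursor != null && INLINE_TAGS.contains(cursor.getNodeName());
             cursor = cursor.getNextSibling()) {
            names.add(cursor.getNodeName());
        }
        return names;
    }

    /** Parses a small well-formed snippet and returns the root's first child. */
    public static Node firstChildOf(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement().getFirstChild();
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML: " + xml, e);
        }
    }

    public static void main(String[] args) {
        Node text = firstChildOf("<div>text<strong>a</strong><em>b</em><p>x</p></div>");
        System.out.println(followingInlineNames(text)); // prints [strong, em]

        // A node with no next sibling yields an empty list instead of an exception.
        Node only = firstChildOf("<div>just text</div>");
        System.out.println(followingInlineNames(only)); // prints []
    }
}
```

The same guard carries over to jsoup, where `nextSibling()` also returns null when there is no next sibling: test the node for null first, then inspect it.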

It would be easier to help if the stack trace included the error message. Please add this information.


Which line is line 52? – Hypino

Answers


Thanks. I was able to solve it as follows.

Class NewNode:

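import java.util.List;

import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.parser.Tag;

public class NewNode {

    // The <p> element that will hold the joined nodes.
    private Element newElement = new Element(Tag.valueOf("p"), "");
    private List<Node> childs;

    public NewNode(List<Node> childs) {
        this.childs = childs;
    }

    // Appends a copy of each collected node to the <p>; the originals stay where they are.
    // Call this once per instance: every call appends the copies again.
    public Node getNewNode() {
        childs.forEach(child -> newElement.appendChild(child.clone()));
        return newElement;
    }

}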

Class NodesToProcess:

import java.util.List;

import org.jsoup.nodes.Node;

// Bundles one pending replacement: the text node to replace, the <p> that
// replaces it, and the following siblings that were copied into that <p>.
public class NodesToProcess {

    private Node oldNode;
    private NewNode newNode;
    private List<Node> toRemove;

    public NodesToProcess(Node oldNode, NewNode newNode, List<Node> toRemove) {
        this.oldNode = oldNode;
        this.newNode = newNode;
        this.toRemove = toRemove;
    }

    public Node getOldNode() {
        return oldNode;
    }

    public Node getNewNode() {
        return newNode.getNewNode();
    }

    public List<Node> getToRemove() {
        return toRemove;
    }

}

and with this method, which wraps any text that is not already wrapped in a <p>:

private void wrapUnwrappedTextInTagP(Element element) {
    List<NodesToProcess> nodesToProcesses = new ArrayList<>();
    List<Node> nodeAlreadyUsed = new ArrayList<>();

    // First pass: only collect the runs; the tree is not modified while iterating.
    element.childNodes().forEach(node -> {
        if (node instanceof TextNode && !nodeAlreadyUsed.contains(node)) {
            List<Node> newChilds = new ArrayList<>();
            List<Node> toRemove = new ArrayList<>();

            newChilds.add(node);
            nodeAlreadyUsed.add(node);
            Node auxNode = node.nextSibling();

            // auxNode is null-checked before use, so the end of the sibling list is safe.
            // parentIsBodyAndIsAnTextElement is a helper that is not shown in this answer.
            while (auxNode != null && parentIsBodyAndIsAnTextElement(auxNode)) {
                newChilds.add(auxNode);
                nodeAlreadyUsed.add(auxNode);
                toRemove.add(auxNode);
                auxNode = auxNode.nextSibling();
            }
            nodesToProcesses.add(new NodesToProcess(node, new NewNode(newChilds), toRemove));
        }
    });

    // Second pass: swap each leading text node for its <p> and drop the copied siblings.
    nodesToProcesses.forEach(nodesToProcess -> {
        nodesToProcess.getOldNode().replaceWith(nodesToProcess.getNewNode());
        nodesToProcess.getToRemove().forEach(node -> node.remove());
    });
}

Then, in the main method:

Document doc = Jsoup.parse(html); 
wrapUnwrappedTextInTagP(doc.body()); 
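
The `parentIsBodyAndIsAnTextElement` helper called above is not shown, so the grouping rule has to be guessed. The sketch below is one possible reading, not the answer's actual code: it treats text nodes and `em`/`strong` elements as inline and moves each run of adjacent inline children into a new `<p>` in a single pass, without clones. To stay runnable without extra dependencies it uses the JDK's `org.w3c.dom` API instead of jsoup, and every name in it (`LooseTextWrapper`, `isInline`, `wrapLooseText`, `wrapAndSummarize`) is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class LooseTextWrapper {
    // Assumption: which tags count as inline is a guess; adjust to your markup.
    private static final List<String> INLINE_TAGS = Arrays.asList("em", "strong");

    /** Hypothetical stand-in for the helper the answer calls but does not show. */
    public static boolean isInline(Node node) {
        return node.getNodeType() == Node.TEXT_NODE
                || INLINE_TAGS.contains(node.getNodeName());
    }

    /** Moves every run of adjacent inline children of parent into a new p element. */
    public static void wrapLooseText(Element parent) {
        Document doc = parent.getOwnerDocument();
        // Snapshot the children first, because the loop below moves nodes around.
        List<Node> children = new ArrayList<>();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            children.add(c);
        }
        Element currentP = null;
        for (Node child : children) {
            if (isInline(child)) {
                if (currentP == null) {
                    currentP = doc.createElement("p");
                    parent.insertBefore(currentP, child);
                }
                currentP.appendChild(child); // appendChild moves an attached node
            } else {
                currentP = null; // any other node ends the current run
            }
        }
    }

    /** Parses xml, wraps loose text under the root, and lists "name:text" per child. */
    public static List<String> wrapAndSummarize(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML: " + xml, e);
        }
        wrapLooseText(root);
        List<String> summary = new ArrayList<>();
        for (Node c = root.getFirstChild(); c != null; c = c.getNextSibling()) {
            summary.add(c.getNodeName() + ":" + c.getTextContent());
        }
        return summary;
    }

    public static void main(String[] args) {
        wrapAndSummarize("<div>text that <strong>need</strong> to be <strong>wrapped</strong>"
                + "<p>a text that has to be ignored</p>another text</div>")
                .forEach(System.out::println);
    }
}
```

For this input, every child of the `<div>` ends up being a `<p>`. Note that the sketch also wraps whitespace-only text nodes, so pretty-printed input would produce empty paragraphs unless such nodes are skipped first.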