How to get node contents from JDOM

Question

I'm writing a Java application that uses import org.jdom.*;

My XML is valid, but sometimes it contains HTML tags. For example, something like this:

  <program-title>Anatomy &amp; Physiology</program-title>
  <overview>
       <content>
              For more info click <a href="page.html">here</a>
              <p>Learn more about the human body.  Choose from a variety of Physiology (A&amp;P) designed for complementary therapies.&amp;#160; Online studies options are available.</p>
       </content>
  </overview>
  <key-information>
     <category>Health &amp; Human Services</category>

So my problem is with the <p> tags inside the overview.content node.

I was hoping that this code would work:

        Element overview = sds.getChild("overview");
        Element content = overview.getChild("content");

        System.out.println(content.getText());

But it returns blank.

How do I return all the text (nested tags and all) from the overview.content node?

Thanks

Recommended answer

content.getText() returns only the element's immediate text, so it is only useful for leaf elements that contain text content.
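To make the "immediate text" idea concrete, here is a minimal sketch that uses the JDK's built-in W3C DOM API as a dependency-free stand-in. It is not JDOM, and the class and helper names (ImmediateTextDemo, immediateText, parseRoot) are made up for illustration. It assumes a simplified version of the content element from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ImmediateTextDemo {

    // Concatenates only the text nodes that are direct children of the element.
    // Text inside nested elements such as <a> or <p> is skipped.
    public static String immediateText(Element element) {
        StringBuilder sb = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                sb.append(child.getNodeValue());
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: parses an XML string and returns its root element.
    public static Element parseRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element content = parseRoot(
                "<content>For more info click <a href=\"page.html\">here</a>"
                + "<p>Learn more.</p></content>");

        // Direct text children only: prints "[For more info click ]"
        System.out.println("[" + immediateText(content) + "]");

        // Text of all descendants, tags dropped: prints "[For more info click hereLearn more.]"
        System.out.println("[" + content.getTextContent() + "]");
    }
}
```

Neither call keeps the nested tags themselves, which is why the answer turns to a serializer instead of a text accessor.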

The trick is to use org.jdom.output.XMLOutputter (with the CompactFormat text mode):

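import java.io.StringWriter;

import org.jdom.Document;
import org.jdom.Element;
import org.jdom.input.SAXBuilder;
import org.jdom.output.Format;
import org.jdom.output.XMLOutputter;

public class NodeContentDemo {
    public static void main(String[] args) throws Exception {
        // Parse the XML file (a.xml is the sample file name used here).
        SAXBuilder builder = new SAXBuilder();
        String xmlFileName = "a.xml";
        Document doc = builder.build(xmlFileName);

        // Navigate to overview/content.
        Element root = doc.getRootElement();
        Element overview = root.getChild("overview");
        Element content = overview.getChild("content");

        XMLOutputter outp = new XMLOutputter();

        // Compact format normalizes whitespace; the alternatives are left for reference.
        outp.setFormat(Format.getCompactFormat());
        //outp.setFormat(Format.getRawFormat());
        //outp.setFormat(Format.getPrettyFormat());
        //outp.getFormat().setTextMode(Format.TextMode.PRESERVE);

        // Serialize the child nodes of content (text and nested elements) to a string.
        StringWriter sw = new StringWriter();
        outp.output(content.getContent(), sw);
        System.out.println(sw.toString());
    }
}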

Output:

For more info click<a href="page.html">here</a><p>Learn more about the human body. Choose from a variety of Physiology (A&amp;P) designed for complementary therapies.&amp;#160; Online studies options are available.</p>

Explore the other Format options and adapt the code above to your needs.
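One detail worth noticing in the compact-format output above: the space between "click" and the <a> tag is gone. That appears to be a side effect of whitespace normalization, which trims each text run and collapses internal whitespace. The sketch below approximates that behavior in plain Java. It is an illustration only, not JDOM's actual implementation, and the class and method names (WhitespaceModes, normalize) are hypothetical.

```java
public class WhitespaceModes {

    // Approximates whitespace normalization: trim both ends, then collapse
    // every internal run of whitespace to a single space.
    public static String normalize(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? "" : trimmed.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        // The text run that precedes <a> in the sample XML, with its indentation.
        String raw = "\n              For more info click ";

        // Prints "[For more info click]": the trailing space is trimmed away.
        System.out.println("[" + normalize(raw) + "]");
    }
}
```

If that spacing matters, the commented-out getRawFormat() or TextMode.PRESERVE alternatives in the answer's code are the settings to try, since they are the ones described as leaving whitespace unchanged.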

"Class to encapsulate XMLOutputter format options. Typical users can use the standard format configurations obtained by getRawFormat() (no whitespace changes), getPrettyFormat() (whitespace beautification), and getCompactFormat() (whitespace normalization)."
