如何从 JDOM 获取节点内容

2023-01-13Java开发问题
26

本文介绍了如何从 JDOM 获取节点内容的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着跟版网的小编来一起学习吧!

问题描述

我正在使用 import org.jdom.* 编写一个 java 应用程序;

I'm writing an application in java using import org.jdom.*;

我的 XML 是有效的,但有时它包含 HTML 标记.例如,像这样:

My XML is valid,but sometimes it contains HTML tags. For example, something like this:

  <program-title>Anatomy &amp; Physiology</program-title>
  <overview>
       <content>
              For more info click <a href="page.html">here</a>
              <p>Learn more about the human body.  Choose from a variety of Physiology (A&amp;P) designed for complementary therapies.&amp;#160; Online studies options are available.</p>
       </content>
  </overview>
  <key-information>
     <category>Health &amp; Human Services</category>

所以我的问题在于 <p > overview.content 节点内的标签.

So my problem is with the < p > tags inside the overview.content node.

我希望这段代码可以工作:

I was hoping that this code would work :

        Element overview = sds.getChild("overview");
        Element content = overview.getChild("content");

        System.out.println(content.getText());

但它返回空白.

如何从 overview.content 节点返回所有文本(嵌套标签和所有)?

How do I return all the text ( nested tags and all ) from the overview.content node ?

谢谢

推荐答案

content.getText() 提供即时文本,该文本仅对带有文本内容的叶子元素有用.

content.getText() gives immediate text which is only useful fine with the leaf elements with text content.

技巧是使用 org.jdom.output.XMLOutputter (带文本模式 CompactFormat )

Trick is to use org.jdom.output.XMLOutputter ( with text mode CompactFormat )

public static void main(String[] args) throws Exception {
    SAXBuilder builder = new SAXBuilder();
    String xmlFileName = "a.xml";
    Document doc = builder.build(xmlFileName);

    Element root = doc.getRootElement();
    Element overview = root.getChild("overview");
    Element content = overview.getChild("content");

    XMLOutputter outp = new XMLOutputter();

    outp.setFormat(Format.getCompactFormat());
    //outp.setFormat(Format.getRawFormat());
    //outp.setFormat(Format.getPrettyFormat());
    //outp.getFormat().setTextMode(Format.TextMode.PRESERVE);

    StringWriter sw = new StringWriter();
    outp.output(content.getContent(), sw);
    StringBuffer sb = sw.getBuffer();
    System.out.println(sb.toString());
}

输出

For more info click<a href="page.html">here</a><p>Learn more about the human body. Choose from a variety of Physiology (A&amp;P) designed for complementary therapies.&amp;#160; Online studies options are available.</p>

请探索其他 格式化 选项并在上面进行修改根据您的需要编写代码.

Do explore other formatting options and modify above code to your need.

封装XMLOutputter格式选项的类.典型用户可以使用getRawFormat()(不改变空白)、getPrettyFormat()(空白美化)、getCompactFormat()(空白归一化)得到的标准格式配置."

"Class to encapsulate XMLOutputter format options. Typical users can use the standard format configurations obtained by getRawFormat() (no whitespace changes), getPrettyFormat() (whitespace beautification), and getCompactFormat() (whitespace normalization). "

这篇关于如何从 JDOM 获取节点内容的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持跟版网!

The End

相关推荐

如何使用 JAVA 向 COM PORT 发送数据?
How to send data to COM PORT using JAVA?(如何使用 JAVA 向 COM PORT 发送数据?)...
2024-08-25 Java开发问题
21

如何使报表页面方向更改为“rtl"?
How to make a report page direction to change to quot;rtlquot;?(如何使报表页面方向更改为“rtl?)...
2024-08-25 Java开发问题
19

在 Eclipse 项目中使用西里尔文 .properties 文件
Use cyrillic .properties file in eclipse project(在 Eclipse 项目中使用西里尔文 .properties 文件)...
2024-08-25 Java开发问题
18

有没有办法在 Java 中检测 RTL 语言?
Is there any way to detect an RTL language in Java?(有没有办法在 Java 中检测 RTL 语言?)...
2024-08-25 Java开发问题
11

如何在 Java 中从 DB 加载资源包消息?
How to load resource bundle messages from DB in Java?(如何在 Java 中从 DB 加载资源包消息?)...
2024-08-25 Java开发问题
13

如何更改 Java 中的默认语言环境设置以使其保持一致?
How do I change the default locale settings in Java to make them consistent?(如何更改 Java 中的默认语言环境设置以使其保持一致?)...
2024-08-25 Java开发问题
13