Can't convert XML string to W3C DOM document

I want to convert a Java string containing XML to a W3C DOM Document object.

I searched all over the place first and found some good examples here on Stack Overflow. Sadly, I can't get them working!

Apparently my code is not working 100%.

It seems to parse the string, but there are no values in the nodes. This is what I have so far:

// Needs java.io.StringReader, javax.xml.parsers.*, org.w3c.dom.Document, org.xml.sax.InputSource
InputSource is = new InputSource(new StringReader(TestFiles.RSS_FEED_FILE_2));

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
DocumentBuilder builder = factory.newDocumentBuilder();
Document newDoc = builder.parse(is);

When I do a sysout afterwards like this:

System.out.println(newDoc.getDocumentElement().getElementsByTagName("channel").item(0)
.getNodeValue());

I get null as output. With this sysout instead:

System.out.println(newDoc.getDocumentElement().getElementsByTagName("channel").item(0));

I get as output: [channel: null]

So I do have an object (otherwise it would throw a NullPointerException), but it doesn't seem to contain any values?!

The content of the constant is this:

public final static String RSS_FEED_FILE_2 =    "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" + 
                                            "<rss version=\"2.0\">\n" + 
                                            "<channel>\n" + 
                                            "<title>sunday</title>\n" + 
                                            "<link>http://www.google.nl</link>\n" + 
                                            "<pubDate>2012-02-05 20:58</pubDate>\n" + 
                                            "<lastBuildDate>2012-02-08 09:48</lastBuildDate>\n" + 
                                            "<description>blabla </description>\n" + 
                                            "<item>\n" + 
                                            "<title><![CDATA[title]]></title>\n" + 
                                            "<link><![CDATA[http://www.google.nl]]></link>\n" + 
                                            "<guid><![CDATA[2266610]]></guid>\n" + 
                                            "<source><![CDATA[sunday]]></source>\n" + 
                                            "<author><![CDATA[me]]></author>\n" + 
                                            "<description><![CDATA[blalbalavblabllllll!]]></description>\n" + 
                                            "</item>\n" + 
                                            "</channel>\n" + 
                                            "</rss>";

Does anybody have a solution or a hint?

This is quite a common gotcha. The behaviour of getNodeValue() depends on the subclass of Node. In the case of an Element, getNodeValue() will always return null (see the table in the Node javadoc for the behaviour of other subclasses).
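To make the node-type dependence concrete, here is a minimal sketch using only the JDK's built-in DOM parser. The class name NodeValueDemo, the helper parse, and the shortened XML string are made up for illustration; they are not from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class NodeValueDemo {
    // Hypothetical helper: parses an XML string into a W3C DOM document.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Shortened stand-in for the RSS string in the question.
        Document doc = parse("<rss version=\"2.0\"><channel><title>sunday</title></channel></rss>");
        Element rss = doc.getDocumentElement();
        Element title = (Element) rss.getElementsByTagName("title").item(0);

        // Element node: getNodeValue() is defined to be null.
        System.out.println(title.getNodeValue());                            // prints "null"
        // Text node (the element's first child): getNodeValue() is the text itself.
        System.out.println(title.getFirstChild().getNodeValue());            // prints "sunday"
        // Attr node: getNodeValue() is the attribute value.
        System.out.println(rss.getAttributeNode("version").getNodeValue());  // prints "2.0"
    }
}
```

So the parse in the question most likely succeeded; the null comes from asking an Element for its node value rather than asking its text child.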

Consider using getTextContent() if you want to debug the XML document.
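As a rough illustration of getTextContent(), the sketch below reads text from elements of a cut-down feed. The class TextContentDemo and the helpers parse and firstText are hypothetical names introduced here, and the XML is a shortened version of the constant in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class TextContentDemo {
    // Hypothetical helper: parses an XML string into a W3C DOM document.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Hypothetical helper: text of the first element with this tag name
    // (in document order), or null if there is no such element.
    static String firstText(Document doc, String tag) {
        Node node = doc.getElementsByTagName(tag).item(0);
        return node == null ? null : node.getTextContent();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<rss version=\"2.0\"><channel><title>sunday</title>"
                + "<item><title><![CDATA[title]]></title></item></channel></rss>");

        // The first <title> in document order is the channel's own title.
        System.out.println(firstText(doc, "title"));    // prints "sunday"
        // On a container element, getTextContent() concatenates the text of all
        // descendants, CDATA sections included.
        System.out.println(firstText(doc, "channel"));  // prints "sundaytitle"
    }
}
```

With the full constant from the question, the channel's text content would also include the newlines between elements, so for anything beyond debugging it is usually clearer to call getTextContent() on the specific leaf element you want.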

As you are trying to load an RSS XML string, I suggest using the RSS XSD from http://www.thearchitect.co.uk/schemas/rss-2_0.xsd . This will help you load the RSS string and give you a better way to either edit the data or write it out to a destination such as a file. It does need JAXB to work, though. Hope this helps.

Using jdom takes a lot of the pain out of processing XML, and it is usually my first port of call.

If using jdom is an option, then building the document is trivial.

// SAXBuilder is org.jdom.input.SAXBuilder; Document here is org.jdom.Document
SAXBuilder builder = new SAXBuilder();
Document doc = builder.build(new StringReader(YOUR_XML_STRING));

The thing to be careful of is that this creates an org.jdom.Document object, which you then need to convert into a W3C document. Again, this is quite easily achieved with the org.jdom.output.DOMOutputter class.
