How to read child node content from an XML file that contains HTML tags, using JavaScript

Question

To read a child node's content I use:

MYDATA = xhr.responseXML.getElementsByTagName("MenuItem")[INDEX].getElementsByTagName("PageContent")[0].childNodes[0].nodeValue;

Sometimes, when the child node's data contains an HTML tag (such as the <b> in the example below), I have problems because the tags are counted as XML tags, i.e. as child nodes of their own.

My question is how to get the entire data from a child node even if it contains other HTML tags.

Example:

<MenuItem>
    <MenuText>menu <b> text </b></MenuText>
</MenuItem>

would return "menu", but I want it to return: menu text

Answer 1

Yes and no, depending on your parser. In XML, a literal < or > inside text is supposed to be escaped as an entity (&lt; and &gt;, which is what PHP's htmlspecialchars() produces), and other special characters such as & need the same treatment. When the tags are left unescaped, I'm fairly certain the parser creates a new node with the HTML tag as its name.

There are only two solutions. The first is to store the XML data in a string, use a regex to find the HTML tags (or rather, all the < and > characters inside the content), and replace them with the escaped values noted above before you pass the string to a parser (parser.parseFromString() in JavaScript, given that 'parser' is a DOMParser). The other is to take the node, walk its entire set of child nodes with a recursive loop, and concatenate their names and contents. The second method takes more programming work and more processing, so I suggest the simple remedy of regex and character replacement.
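The second approach above (walking the child nodes recursively) can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: serializeChildren and escapeText are hypothetical helper names, attributes are ignored, and the function only assumes that each node exposes the standard nodeType, nodeName, nodeValue and childNodes properties.

```javascript
// Node type constants as defined by the DOM specification.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;
const CDATA_SECTION_NODE = 4;

// Re-escape the characters the XML parser has already decoded,
// so the result is safe to treat as markup again.
function escapeText(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: rebuild the markup of everything inside `node`
// by concatenating element names and text contents recursively.
function serializeChildren(node) {
  let out = "";
  for (let i = 0; i < node.childNodes.length; i++) {
    const child = node.childNodes[i];
    if (child.nodeType === TEXT_NODE || child.nodeType === CDATA_SECTION_NODE) {
      out += escapeText(child.nodeValue);
    } else if (child.nodeType === ELEMENT_NODE) {
      // Assumption: attributes are not needed, so they are dropped here.
      out += "<" + child.nodeName + ">" +
        serializeChildren(child) +
        "</" + child.nodeName + ">";
    }
    // Comments and other node types are skipped in this sketch.
  }
  return out;
}
```

Called on a MenuText element whose content is menu <b> text </b>, this should give back that content as a single string, tags included.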

Or you can read up on CDATA and escape the tags by placing all of the content within a <![CDATA[ ... ]]> section instead, but only if you are the one creating the XML file. If you are not, you should notify the webmaster of the site you got the XML from that it is incorrectly created, and that the tags need to be wrapped in a <![CDATA[ ... ]]> section or have < and > replaced with their entity counterparts. I suppose you could also use regex to place the HTML code within a <![CDATA[ ... ]]> section, but that is probably slower and less efficient than replacing the < and > characters.
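If the XML cannot be fixed at the source, both regex remedies mentioned above can be applied to the raw response text before parsing. The following is only a sketch under stated assumptions: escapeElementBody and wrapElementBodyInCdata are hypothetical helper names, the tag name is a plain name with no regex metacharacters, and the target element has no attributes and is never nested inside itself.

```javascript
// Escape the characters that are special inside XML text.
function escapeXmlText(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build a pattern matching <tagName>...</tagName> (non-greedy, multi-line).
function elementPattern(tagName) {
  return new RegExp("<" + tagName + ">([\\s\\S]*?)</" + tagName + ">", "g");
}

// Hypothetical helper: entity-escape everything inside each <tagName> element.
function escapeElementBody(xml, tagName) {
  return xml.replace(elementPattern(tagName), function (match, body) {
    return "<" + tagName + ">" + escapeXmlText(body) + "</" + tagName + ">";
  });
}

// Hypothetical helper: wrap the body of each <tagName> element in CDATA.
function wrapElementBodyInCdata(xml, tagName) {
  return xml.replace(elementPattern(tagName), function (match, body) {
    // A literal "]]>" would end the section early, so split it in two.
    const safe = body.replace(/\]\]>/g, "]]]]><![CDATA[>");
    return "<" + tagName + "><![CDATA[" + safe + "]]></" + tagName + ">";
  });
}
```

In a browser, the result could then be handed to new DOMParser().parseFromString(fixedXml, "application/xml"), after which the original childNodes[0].nodeValue lookup should see the whole content as one text (or CDATA) node.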

Answer 2

The official W3C property that returns all the text of an element and all its descendants is part of DOM Level 3 and is called textContent, but it's not supported in every browser yet (I'm looking at you, IE; I think it's called innerText there), if that is even relevant for you.

So your line of code would look something like this for your XML snippet:

MYDATA = xhr.responseXML.getElementsByTagName("MenuItem")[INDEX].getElementsByTagName("MenuText")[0].textContent;

That will not retain the HTML tags, though. So in the end it depends on what you're trying to do with that XML. Do you want to add it to another DOM tree? If so, you can just take that element with all its descendants and append it elsewhere.

MYDATA = xhr.responseXML.getElementsByTagName("MenuItem")[INDEX].getElementsByTagName("MenuText")[0].cloneNode(true);
someOtherElement.appendChild(MYDATA);

Otherwise you'd have to write a loop that copies each node (text content is a node too, and so is whitespace) from source to destination and appends it there.
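Such a copy loop might look roughly like the sketch below. copyChildren is a hypothetical helper name, and the sketch assumes that only elements and text matter (attributes, comments and other node types are skipped) and that doc is the destination document, providing the standard createElement and createTextNode methods.

```javascript
// Hypothetical helper: recreate the children of `source` inside `destination`,
// using `doc` (the destination's document) to create the new nodes.
function copyChildren(source, destination, doc) {
  for (let i = 0; i < source.childNodes.length; i++) {
    const child = source.childNodes[i];
    if (child.nodeType === 3 || child.nodeType === 4) {
      // Text node (3) or CDATA section (4): copy the character data.
      destination.appendChild(doc.createTextNode(child.nodeValue));
    } else if (child.nodeType === 1) {
      // Element node (1): create a same-named element, then recurse.
      // Assumption: attributes are not needed, so they are not copied.
      const copy = doc.createElement(child.nodeName);
      copyChildren(child, copy, doc);
      destination.appendChild(copy);
    }
  }
}
```

In a browser this might be called as copyChildren(menuTextElement, someOtherElement, document). Recreating the elements in the target document, rather than moving the XML nodes themselves, may also help when the target is an HTML page, since elements parsed from plain XML are generally not treated as HTML elements.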

Answer 1: score 1, accepted, 2010-12-31 16:38:59
Answer 2: score 0, 2010-12-31 17:35:16
