
How do I convert HTML to XHTML using JavaScript?

I need to add slashes to the end of all the image tags in a string. I'm using JavaScript regular expressions. Here is what I have so far:

strInput = strInput.replace(/<img.*">/gm, "");

But I'm not sure what to replace it with. I'm taking the value of a text area and parsing it as XML, but the image tags generate errors because they're HTML. Thanks.

You should let the browser do the heavy lifting. The browser can obviously parse HTML; after all, how else would it show us web pages? You can use JavaScript to make the browser parse HTML for you by setting .innerHTML of some DOM node to your HTML string, or by using .insertAdjacentHTML. That turns your HTML string into a tree of DOM nodes, i.e., it has been parsed.

There are also browser built-in ways to turn your DOM tree into an XHTML string. You simply create an XHTML document programmatically, then add any DOM tree to it with .appendChild (the tree can come from an HTML (non-XHTML) document; that is perfectly fine). The .outerHTML and .innerHTML properties of your DOM tree, which now has an XHTML document as its owner document, will then give XHTML.

If you're starting with a DOM node, you can use the following two functions:

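var nsx = "http://www.w3.org/1999/xhtml";
// Returns the XHTML serialization of the node itself.
function outerXHTML(node){
    // Third argument is the doctype; null means "none".
    var xdoc = document.implementation.createDocument(nsx, 'html', null);
    // Moving the node into the XHTML document makes it serialize as XML.
    xdoc.documentElement.appendChild(node);
    return node.outerHTML;
}
// Returns the XHTML serialization of the node's children.
function innerXHTML(node){
    var xdoc = document.implementation.createDocument(nsx, 'html', null);
    xdoc.documentElement.appendChild(node);
    return node.innerHTML;
}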

(Note that the node will be owned by the newly created XHTML document, so it will vanish from your original document. If it should remain there, clone it before calling one of the above functions.)
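The cloning advice could be sketched as follows. This is only a minimal sketch, assuming a browser environment; the names outerXHTMLOfCopy and XHTML_NS and the optional doc parameter are hypothetical and not part of the functions above.

```javascript
var XHTML_NS = "http://www.w3.org/1999/xhtml";

// Hypothetical helper: serializes a deep copy of `node` as XHTML,
// so `node` itself stays in its original document.
// `doc` defaults to the page's document; it is a parameter only so the
// function can fail with a clear message outside a browser.
function outerXHTMLOfCopy(node, doc) {
    doc = doc || (typeof document !== "undefined" ? document : null);
    if (!doc) {
        throw new Error("outerXHTMLOfCopy needs a DOM document (run it in a browser)");
    }
    var copy = node.cloneNode(true);            // deep clone; the original is untouched
    var xdoc = doc.implementation.createDocument(XHTML_NS, "html", null);
    xdoc.documentElement.appendChild(copy);     // only the copy changes owner document
    return copy.outerHTML;
}
```

Called as outerXHTMLOfCopy(someElement), it should leave someElement where it was in the page, at the cost of one extra deep copy.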

And if you're starting with a string, you just have to set innerHTML of a newly created node before calling the above. For your convenience, here is a snippet with three examples: two for HTML to XHTML, and one for XHTML to HTML.

function html2xhtml(html){
    var nsx = "http://www.w3.org/1999/xhtml";
    var body = document.createElement('body');
    body.innerHTML = html;
    var xdoc = document.implementation.createDocument(nsx, 'html');
    xdoc.documentElement.appendChild(body);
    return body.innerHTML;
}
function xhtml2html(xhtml){
    var body = document.createElement('body');
    body.innerHTML = xhtml;
    var doc = document.implementation.createHTMLDocument();
    doc.documentElement.appendChild(body);
    return body.innerHTML;
}
var html1 = '<div>lorem<img>ipsum<img>dolor sit amet<br></div>';
var html2 = '<ul><li><svg><rect width="100" height="100"></rect></svg></li></ul>';
var html3x = '<img />';
var node1 = document.getElementById('node1');
var node1x = document.getElementById('node1x');
var node2 = document.getElementById('node2');
var node2x = document.getElementById('node2x');
var node3 = document.getElementById('node3');
var node3x = document.getElementById('node3x');
node1.textContent = html1;
node2.textContent = html2;
node3x.textContent = html3x;
node1x.textContent = html2xhtml(html1);
node2x.textContent = html2xhtml(html2);
node3.textContent = xhtml2html(html3x);

html<br><pre id='node1'></pre>
xhtml<br><pre id='node1x'></pre>
<hr>
html<br><pre id='node2'></pre>
xhtml<br><pre id='node2x'></pre>
<hr><hr>
xhtml<br><pre id='node3x'></pre>
html<br><pre id='node3'></pre>

code older version

You can also do it with XMLSerializer (for the toString part, not the fromString part); credit to @Kaiido.
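The XMLSerializer route might look something like the sketch below. It assumes a browser that provides DOMParser and XMLSerializer; the function name html2xhtmlWithSerializer is hypothetical, and using DOMParser for the parsing step is an assumption here, since the answer only names XMLSerializer for the serializing step.

```javascript
// Hypothetical helper: parse with the HTML parser, then serialize each
// resulting node as XML with XMLSerializer.
function html2xhtmlWithSerializer(html) {
    if (typeof DOMParser === "undefined" || typeof XMLSerializer === "undefined") {
        throw new Error("DOMParser and XMLSerializer are required (run this in a browser)");
    }
    // "text/html" selects the forgiving HTML parser, so <img> and <br> are accepted.
    var doc = new DOMParser().parseFromString(html, "text/html");
    var serializer = new XMLSerializer();
    var out = "";
    // Only nodes the parser placed in <body> are serialized; anything it
    // moved into <head> (a leading <title>, for example) is skipped.
    for (var i = 0; i < doc.body.childNodes.length; i++) {
        out += serializer.serializeToString(doc.body.childNodes[i]);
    }
    return out;
}
```

Expect the serializer to add an xmlns="http://www.w3.org/1999/xhtml" attribute to each top-level element, which an XML parser should accept.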

You'll have to use a capture group:

strInput = strInput.replace(/(<img[^>]+)>/gm, "$1 />");

Here's the fiddle: http://jsfiddle.net/ChNnU/
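One caveat with the capture-group replacement: a tag that is already self-closed, such as <img src="a.png" />, would come out as <img src="a.png" / />. A slightly more defensive variant is sketched below; the function name closeImgTags is hypothetical, and the pattern still assumes that no attribute value contains a ">" character.

```javascript
// Hypothetical helper: rewrites every <img ...> or <img ... /> as <img ... />.
function closeImgTags(input) {
    // <img\b      the tag name, not a longer name that merely starts with "img"
    // ([^>]*?)    the attributes, captured lazily
    // \s*\/?>     optional whitespace and an optional existing slash before ">"
    return input.replace(/<img\b([^>]*?)\s*\/?>/gi, "<img$1 />");
}

var sample = '<p><img src="a.png"><img src="b.png" /></p>';
var closed = closeImgTags(sample);
// closed is '<p><img src="a.png" /><img src="b.png" /></p>'
```

Because of the i flag, an uppercase <IMG> is matched too and is written back in lowercase, which is what XHTML requires for element names.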
