How to remove the entire HTML, HEAD and BODY tags from a string containing HTML using JavaScript?

Question

I have a template file called myWebsite.html. It contains everything an HTML template needs, so it has HTML, HEAD and BODY tags. I want to load it with JavaScript and put it into one of the divs on the site, so I don't want the HTML, HEAD and BODY tags. How can I do this?

This is a prototype of what I need:

$val = getData('myWebsite.html');
$val = removeHTMLHEADBODYTAGS($val); // remove these tags with everything inside; also remove the body tag but leave the contents of the body tag. Also remove the end tags of body and html - HOW TO DO THIS?
div.innerHTML = $val;

I want to do this in pure JavaScript = NO jQuery

Answer 1

Why not fetch the information out of the tag and then work with that? There is no need to fetch everything and then remove html, head and body:

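// Assumes $val is already a parsed document (a plain string has no getElementsByTagName)
content = $val.getElementsByTagName('body')[0].innerHTML; // innerHTML is a property, not a method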
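
The line above needs a parsed document rather than a raw string. As a minimal sketch of one way to get there, assuming a browser that provides DOMParser (the helper name bodyHtmlFromString is made up for illustration and is not part of the original answer):

```javascript
// Hypothetical helper: parse an HTML string and return the inner HTML of <body>.
// Assumes a browser environment; returns null where DOMParser does not exist
// (for example in plain Node.js).
function bodyHtmlFromString(htmlString) {
  if (typeof DOMParser === 'undefined') {
    return null;
  }
  // 'text/html' makes the parser build a full document with head and body.
  var doc = new DOMParser().parseFromString(htmlString, 'text/html');
  return doc.body.innerHTML;
}
```

With this in place, the prototype from the question would become something like div.innerHTML = bodyHtmlFromString($val);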

Answer 2

You could extract it with a regex. Something like: /\<body[^>]*\>([^]*)\<\/body/m - that should return all content within the <BODY> element. ([^] is used instead of . because . does not match newlines in JavaScript.)

$val = getData('myWebsite.html');
var reg = /\<body[^>]*\>([^]*)\<\/body/m;
div.innerHTML = $val.match( reg )[1];
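
The snippet above throws if the string has no body tag, because match returns null. A slightly more defensive sketch of the same idea might look like this (the function name extractBodyContent is hypothetical, and a regex is still not a full HTML parser):

```javascript
// Hypothetical helper: return the markup between <body ...> and </body>,
// or the unchanged input when no body element is found.
function extractBodyContent(html) {
  // [^] matches any character including newlines; the i flag also accepts <BODY>.
  var match = /<body[^>]*>([^]*)<\/body\s*>/i.exec(html);
  return match ? match[1] : html;
}
```

Returning the input unchanged for fragments without a body tag is a design choice made here so that the result can always be assigned to div.innerHTML.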

Example jsFiddle code: http://jsfiddle.net/x4hPZ/1/

Answer 3

How about:

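var bodyContents = htmlstring.split('<body'); // no '>', because the body tag could have attributes
// [^>]* stops at the first '>', so only the rest of the opening body tag is removed
bodyContents = bodyContents[1].replace('</body>', '').replace('</html>', '').replace(/^[^>]*>/, '');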

The last regex replace removes the closing > of the opening body tag, along with any attributes the tag may have.
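
The same string-slicing idea can be sketched without regular expressions and with guards for a missing body tag. The name sliceBodyContent is hypothetical, and the sketch assumes lowercase tags and no > inside attribute values:

```javascript
// Hypothetical helper: slice out the body content using plain string searches.
// Assumes lowercase <body> and </body> tags, as the split-based code does.
function sliceBodyContent(html) {
  var open = html.indexOf('<body');
  if (open === -1) {
    return html; // no body tag, nothing to strip
  }
  var start = html.indexOf('>', open);
  if (start === -1) {
    return html; // malformed opening tag, leave the input alone
  }
  start += 1; // first character after the opening tag
  var end = html.lastIndexOf('</body');
  if (end === -1 || end < start) {
    end = html.length; // no closing tag: keep everything after <body ...>
  }
  return html.slice(start, end);
}
```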

This is, however, not the way I would do things... If at all possible, I'd create an (i)Frame node, load the HTML into that frame, and get the innerHTML from the body tag. Just a suggestion.

Right, the iFrame way:

var ifrm = document.createElement('iframe');
ifrm.style.visibility = 'hidden'; // keep the helper frame out of sight
document.body.appendChild(ifrm);
// older browsers expose the frame's document only through contentWindow
var idoc = ifrm.contentDocument ? ifrm.contentDocument : ifrm.contentWindow.document;
idoc.open();
idoc.writeln('<html><head><title>foobar</title></head><body><p>Content</p></body></html>');
idoc.close();
var bodyContents = idoc.body.innerHTML;

For code explanation: http://softwareas.com/injecting-html-into-an-iframe

or any other hit on google.com for that matter :)

Answer 4

With jQuery you could do it like this:

$(document).ready(function(){
    var your_content = $("html").clone().find("head,body").remove().end().html();
});

get the content with the "html" selector
make a copy with clone
find the tags you want to remove
remove them and
convert back to HTML

all in one line.

HTH,

--hennson

4 answers

Answer 1
3 2012-03-16 12:29:50

Answer 2
1 accepted 2012-03-16 12:31:00

Answer 3
0 2012-03-16 12:32:53

Answer 4
0 2012-03-16 12:37:08
