
String extraction with JavaScript

I'm using jQuery and have an entire HTML page stored in var page:

var page = '<html>...<div id="start">......</div><!-- start -->....</html>';

How can I extract only the section that starts with <div id="start"> and runs through the end tag and trailing comment </div><!-- start -->, so that my output is:

<div id="start">......</div><!-- start -->

$(page).find('#start').html();

If it's valid HTML, it would be easiest to just let the browser do it for you. Something like this would do the trick:

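var page = '<html><head><title>foo</title></head><body><div id="stuff"><div id="start">blah<span>fff</span></div></div></body></html>';

// find #start inside the parsed page, then step up to its parent
var start_div = $('#start', page).parent();
// the parent's inner HTML contains the full #start element
alert(start_div.html());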

You can see this example in action at jsFiddle.

[edit] As @Nick pointed out above, this would probably not include the HTML comment at the end of the div. It also might not work in all browsers; I haven't checked, so you should test it. Post back and let us know.
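Since the DOM-based approach drops the trailing comment, one way to keep it is to slice the raw string between the two markers. This is only a minimal sketch: `extractBetween` is a hypothetical helper (not part of jQuery or of any answer here), and the sample page string is made up for illustration. It assumes the opening tag appears in the string exactly as written.

```javascript
// Hypothetical helper: returns the substring from the first occurrence of
// `open` through the first `close` that follows it, or null if either is missing.
function extractBetween(text, open, close) {
  var from = text.indexOf(open);
  if (from === -1) return null;
  var to = text.indexOf(close, from + open.length);
  if (to === -1) return null;
  return text.slice(from, to + close.length);
}

// Sample input, assumed for illustration only.
var page = '<html><body><div id="stuff"><div id="start">blah<span>fff</span></div><!-- start --></div></body></html>';
var section = extractBetween(page, '<div id="start">', '<!-- start -->');
console.log(section); // <div id="start">blah<span>fff</span></div><!-- start -->
```

Plain string slicing does not depend on how a browser parses the markup, but it will miss the section if the tag is written differently (extra attributes, different quoting).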

// In JavaScript `.` does not match newlines (the m flag only affects ^ and $), so use [\s\S]
var start = page.match(/(<div id="start">[\s\S]*?<!-- start -->)/)[1];

This should do it:

var result = $(page).find('#start')[0].outerHTML;

Use a regex. Or the lazy way (which I don't recommend, but it's quick) would be to create a hidden div, put the page HTML inside it, and run a selector against it:

$('#myNewDiv').find('#start').html();

An appropriate regular expression will get you what you are looking for. Try using a line like this:

var start = page.match(/(<div id="start">[\s\S]*?<\!-- start -->)/)[1];

This uses JavaScript's match method to return an array of matches from your page string, and puts the first parenthesized sub-match (in this case, your #start tag and the following comment) into start.

Here's a demo that shows this method working: http://jsfiddle.net/Ender/mphUj/
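One caveat with the one-liner: match returns null when the pattern is not found, so indexing [1] directly throws a TypeError. A minimal sketch with a guard follows; `extractStartSection` and `samplePage` are hypothetical names introduced here for illustration, not part of the original answer.

```javascript
// Hypothetical wrapper around the regex above; returns null when nothing matches.
function extractStartSection(page) {
  // [\s\S]*? matches any characters, including newlines, as few as possible
  var m = page.match(/(<div id="start">[\s\S]*?<!-- start -->)/);
  return m ? m[1] : null;
}

// Multi-line sample input, assumed for illustration only.
var samplePage = '<html>\n<div id="start">\n  text\n</div><!-- start -->\n</html>';
console.log(extractStartSection(samplePage));
console.log(extractStartSection('<html></html>')); // null
```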
