
How can I match a bunch of p tags at the end of an HTML document with a JavaScript regex?

Here is some sample content:

<p> so so so </p>
<div> whatever</div>
<p> another paragraph </p>
<div> forever </div>
<p> first of last </p>
<p> second of last </p>

How can I match the last two paragraphs (or any number of consecutive paragraphs) at the end of the above document?

The match output I want is:

<p> first of last </p>
<p> second of last </p>

I tried /(<p>[\s\S]*?<\/p>[\s]*)$/g, but the lazy matching does not work as I expected: it swallows all the p tags in between and matches from the first opening p tag it encounters up to the end of the document.
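The behaviour described above can be reproduced with a minimal sketch (the names `sample` and `attempt` are made up for illustration). A lazy quantifier only limits how far a match extends to the right. The engine still reports the leftmost starting position at which the whole pattern can succeed, so from the first `<p>` the `[\s\S]*?` part keeps expanding past the intermediate `</p>` tags until `$` is satisfied.

```javascript
// Sample content from the question (variable names are illustrative only).
const sample = [
  "<p> so so so </p>",
  "<div> whatever</div>",
  "<p> another paragraph </p>",
  "<div> forever </div>",
  "<p> first of last </p>",
  "<p> second of last </p>",
].join("\n");

// The pattern tried in the question.
const attempt = /(<p>[\s\S]*?<\/p>[\s]*)$/g;

const matches = sample.match(attempt);

// There is a single match, and it begins at the very first <p>,
// not at "<p> first of last </p>".
console.log(matches.length);
console.log(matches[0].startsWith("<p> so so so </p>"));
```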

Note: there might not be any paragraphs at the end at all; the regex should not match if there are no paragraphs at the end.

Here we use a regex to match all paragraphs and then take the last two elements of the result array.

let str = `<p> so so so </p>
<div> whatever</div>
<p> another paragraph </p>
<div> forever </div>
<p> first of last </p>
<p> second of last </p>`;
// Match every <p>...</p> whose body holds only word characters and whitespace.
let reg = /<p>[\w\s]*<\/p>/g;
let res = str.match(reg);
// Print the last two matches.
console.log(res[res.length - 2]);
console.log(res[res.length - 1]);
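Taking the last two matches assumes that exactly two paragraphs end the document. As a rough sketch of how the same idea could return only the consecutive run at the end (and nothing when the document ends with something else), the hypothetical helper `trailingParagraphs` below walks the matches backwards and stops at the first gap that is not pure whitespace. It reuses the `[\w\s]*` body pattern from above, so paragraphs containing punctuation or nested tags would still be missed.

```javascript
// Hypothetical helper (not part of the original answer): returns the run of
// consecutive <p> blocks that ends the document, or [] if there is none.
function trailingParagraphs(html) {
  const found = [...html.matchAll(/<p>[\w\s]*<\/p>/g)];
  const run = [];
  // Position the next (earlier) paragraph must reach, ignoring trailing whitespace.
  let end = html.trimEnd().length;
  for (let i = found.length - 1; i >= 0; i--) {
    const m = found[i];
    const stop = m.index + m[0].length;
    // Only whitespace may sit between this paragraph and the run collected so far.
    if (html.slice(stop, end).trim() !== "") break;
    run.unshift(m[0]);
    end = m.index;
  }
  return run;
}

const doc = [
  "<p> so so so </p>",
  "<div> forever </div>",
  "<p> first of last </p>",
  "<p> second of last </p>",
].join("\n");

console.log(trailingParagraphs(doc));
console.log(trailingParagraphs("<p> so so so </p>\n<div> forever </div>"));
```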

Adding a negative lookahead to make sure nested paragraphs are not matched seems to do the trick:

/(<p>((?!<p>)[\s\S])*?<\/p>[\s]*)+$/g
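A minimal sketch of how a pattern of this shape behaves, written with the standard negative lookahead syntax `(?!...)`. The variable names are illustrative, and the non-capturing groups and the dropped `g` flag are small changes of mine (a pattern anchored with `$` yields at most one match here). Because no paragraph body may contain another `<p>`, a match cannot swallow earlier paragraphs, and the outer `+` collects the consecutive ones that reach the end.

```javascript
// Each paragraph body is consumed one character at a time, and only while
// the next characters are not "<p>".
const trailing = /(?:<p>(?:(?!<p>)[\s\S])*?<\/p>\s*)+$/;

const withParagraphs = [
  "<p> so so so </p>",
  "<div> whatever</div>",
  "<p> another paragraph </p>",
  "<div> forever </div>",
  "<p> first of last </p>",
  "<p> second of last </p>",
].join("\n");

// A document that does not end with a paragraph.
const withoutParagraphs = "<p> so so so </p>\n<div> forever </div>";

const hit = withParagraphs.match(trailing);
console.log(hit ? hit[0] : null);
// <p> first of last </p>
// <p> second of last </p>

console.log(withoutParagraphs.match(trailing)); // null
```

Note that the literal `<p>` does not match opening tags that carry attributes, such as `<p class="x">`; handling those would need a different pattern or an HTML parser.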

Would appreciate better suggestions though!

