
Regex for extracting HTML tag child elements?

I have the following code in an HTML string.

<h3 class="large lheight20 margintop10">
<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">
<span>get the content</span>
</a>

</h3><h3 class="large lheight20 margintop10">
<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">
<span>get the content</span>
</a>

</h3>

I want to extract the following tags:

<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">
<span>get the content</span>
</a>
<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">
<span>get the content</span>
</a>

I wrote the following regex:

<h3[^>]+?>(.*)<\/h3>

But it returns the wrong result:

<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">
<span>get the content</span>
</a>

</h3><h3 class="large lheight20 margintop10">
<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">
<span>get the content</span>
</a>

Please help me extract the tags.

Use this regex:

<h3[^>]+?>([^$]+?)<\/h3>

Example here:

https://regex101.com/r/pQ5nE0/2
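
The difference between the two patterns comes down to greedy versus lazy matching: `(.*)` grabs as much as it can and only stops at the last `</h3>`, while `+?` stops at the first one. Note that `[^$]` simply means "any character except a literal `$`", which is why it also matches newlines; the sketch below swaps in the more common `[\s\S]` for that purpose. It also assumes the question's pattern was run with a dot-matches-newline flag (`s` in JavaScript), since the reported output spans several lines. The variable names are illustrative only.

```javascript
// Sample input from the question: two adjacent <h3> blocks.
const html =
  '<h3 class="large lheight20 margintop10">\n' +
  '<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">\n' +
  '<span>get the content</span>\n' +
  '</a>\n' +
  '\n' +
  '</h3><h3 class="large lheight20 margintop10">\n' +
  '<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">\n' +
  '<span>get the content</span>\n' +
  '</a>\n' +
  '\n' +
  '</h3>';

// Greedy pattern from the question (the "s" flag is an assumption, see above).
const greedy = /<h3[^>]+?>(.*)<\/h3>/s;
// Lazy variant: [\s\S]+? stops at the first closing </h3> it can reach.
const lazy = /<h3[^>]+?>([\s\S]+?)<\/h3>/g;

// One oversized capture that swallows the boundary between the two blocks.
const greedyInner = html.match(greedy)[1];
// One capture per <h3>, trimmed of the surrounding newlines.
const lazyInner = Array.from(html.matchAll(lazy), (m) => m[1].trim());

console.log(greedyInner.includes('</h3><h3')); // true
console.log(lazyInner.length);                 // 2
```

Each entry of `lazyInner` is then a single `<a ...>...</a>` element, which is the result the question asks for.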

You can try:

function getA(str) {
  // Lazily match each <a ...>...</a> element; [\s\S] also matches newlines.
  var regex = /<a\s+[\s\S]+?<\/a>/g;
  var found;
  // exec() returns the next match on each call, or null when none are left.
  while ((found = regex.exec(str)) !== null) {
    document.write(found[0] + '<br>');
  }
}

var str = '<h3 class="large lheight20 margintop10">\n' +
  '<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">\n' +
  '<span>get the content</span>\n' +
  '</a>\n' +
  '\n' +
  '</h3><h3 class="large lheight20 margintop10">\n' +
  '<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">\n' +
  '<span>get the content</span>\n' +
  '</a>\n' +
  '\n' +
  '</h3>';

getA(str);
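
If you need the matches as data rather than written into the page, the same pattern can be wrapped in a function that returns an array. This is only a sketch: `extractAnchors` and `sample` are hypothetical names, not part of the original answer, and the input is a shortened version of the question's HTML.

```javascript
// Hypothetical helper: returns every <a ...>...</a> element found in str.
function extractAnchors(str) {
  // Same lazy pattern as above; the g flag makes match() return all matches.
  const regex = /<a\s+[\s\S]+?<\/a>/g;
  // match() returns null when nothing matches, so fall back to an empty array.
  return str.match(regex) || [];
}

// Shortened stand-in for the question's HTML: two <h3> blocks, one link each.
const sample =
  '<h3 class="large">\n' +
  '<a href="https://google.com" class="link">\n' +
  '<span>get the content</span>\n' +
  '</a>\n' +
  '</h3><h3 class="large">\n' +
  '<a href="https://google.com" class="link">\n' +
  '<span>get the content</span>\n' +
  '</a>\n' +
  '</h3>';

const anchors = extractAnchors(sample);
console.log(anchors.length); // 2
console.log(anchors[0]);
```

This works for simple, well-formed snippets like the one in the question. A pattern of this kind can misfire if `</a>` appears inside an attribute value or a comment, so for arbitrary HTML a real parser is usually the safer choice.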

