Regex for extracting HTML tag child elements?

I have the following markup in an HTML string.

<h3 class="large lheight20 margintop10">
<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">
<span>get the content</span>
</a>

</h3><h3 class="large lheight20 margintop10">
<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">
<span>get the content</span>
</a>

</h3>

I want to extract the following tags:

<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">
<span>get the content</span>
</a>
<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">
<span>get the content</span>
</a>

I wrote the following regex:

<h3[^>]+?>(.*)<\/h3>

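But it returns the wrong result. The capture group runs past the first closing </h3> and into the second block: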

<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">
<span>get the content</span>
</a>

</h3><h3 class="large lheight20 margintop10">
<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">
<span>get the content</span>
</a>

Please help me extract the tags.

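Use this regex. The lazy quantifier +? makes the group stop at the first </h3> instead of the last, and [^$] matches any character except a literal $, newlines included: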

<h3[^>]+?>([^$]+?)<\/h3>

Example here:

https://regex101.com/r/pQ5nE0/2
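
To make the difference concrete, here is a minimal JavaScript sketch comparing the greedy pattern from the question with a lazy one. Two assumptions: the question's engine let the dot match newlines, which is reproduced here with the s (dotAll) flag, and [\s\S] is swapped in for [^$] so that content containing a $ is not cut short. The variable names are illustrative only.

```javascript
// Sample input from the question: two adjacent <h3> blocks.
const html =
  '<h3 class="large lheight20 margintop10">\n' +
  '<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">\n' +
  '<span>get the content</span>\n' +
  '</a>\n' +
  '\n' +
  '</h3><h3 class="large lheight20 margintop10">\n' +
  '<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">\n' +
  '<span>get the content</span>\n' +
  '</a>\n' +
  '\n' +
  '</h3>';

// Greedy: with the s flag, (.*) consumes everything and backtracks
// only as far as the LAST </h3>, so both blocks end up in one match.
const greedy = [...html.matchAll(/<h3[^>]+?>(.*)<\/h3>/gs)];

// Lazy: [\s\S]+? expands one character at a time and stops at the
// FIRST </h3> after each opening tag, giving one match per block.
const lazy = [...html.matchAll(/<h3[^>]+?>([\s\S]+?)<\/h3>/g)];

// Group 1 holds the inner markup; trim() drops the surrounding newlines.
const lazyInner = lazy.map((m) => m[1].trim());

console.log('greedy matches:', greedy.length);
console.log('lazy matches:', lazy.length);
console.log(lazyInner[0]);
```

The greedy version yields a single match whose group contains the stray </h3><h3 ...> seen in the question, while the lazy version yields one match per h3 block.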

You can try matching the <a> elements directly:

function getA(str) {
  // \s+ after "<a" requires whitespace, so tags such as <abbr> are skipped;
  // [\s\S]+? lazily matches across newlines up to the nearest </a>.
  var regex = /<a\s+[\s\S]+?<\/a>/g;
  var found;
  while ((found = regex.exec(str)) !== null) {
    document.write(found[0] + '<br>');
  }
}

var str = '<h3 class="large lheight20 margintop10">\n' +
  '<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">\n' +
  '<span>get the content</span>\n' +
  '</a>\n' +
  '\n' +
  '</h3><h3 class="large lheight20 margintop10">\n' +
  '<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">\n' +
  '<span>get the content</span>\n' +
  '</a>\n' +
  '\n' +
  '</h3>';

getA(str);
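
For use outside a browser, where document.write is not available, the same pattern can return the matches as an array. This is only a sketch: extractAnchors is a hypothetical name, not part of the answer above, and it assumes <a> elements are never nested.

```javascript
// Hypothetical helper: returns every <a ...>...</a> span found in `str`.
function extractAnchors(str) {
  // Same pattern as getA: whitespace must follow "<a", then a lazy
  // match across newlines up to the nearest closing </a>.
  const regex = /<a\s+[\s\S]+?<\/a>/g;
  // String.prototype.match returns null when nothing matches,
  // so fall back to an empty array.
  return str.match(regex) || [];
}

const sample =
  '<h3 class="large lheight20 margintop10">\n' +
  '<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">\n' +
  '<span>get the content</span>\n' +
  '</a>\n' +
  '\n' +
  '</h3><h3 class="large lheight20 margintop10">\n' +
  '<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">\n' +
  '<span>get the content</span>\n' +
  '</a>\n' +
  '\n' +
  '</h3>';

const anchors = extractAnchors(sample);
console.log(anchors.length); // 2
console.log(anchors[0]);
```

Returning an array keeps the extraction separate from the output, so the caller decides whether to print, render, or process the matched tags further.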
