
Regex for extracting the child elements of an HTML tag?

I have the following markup in an HTML string:

<h3 class="large lheight20 margintop10">
<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">
<span>get the content</span>
</a>

</h3><h3 class="large lheight20 margintop10">
<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">
<span>get the content</span>
</a>

</h3>

and I want to extract the following tags:

<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">
<span>get the content</span>
</a>
<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">
<span>get the content</span>
</a>

I have written the following regex:

<h3[^>]+?>(.*)<\/h3>

But it returns the wrong result:

<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">
<span>get the content</span>
</a>

</h3><h3 class="large lheight20 margintop10">
<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">
<span>get the content</span>
</a>

Please help me extract the tags.

Use this regex:

<h3[^>]+?>([^$]+?)<\/h3>

Example here:

https://regex101.com/r/pQ5nE0/2
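The pattern in the question fails because `(.*)` is greedy: it consumes as much as it can, so the match runs from the first opening `<h3>` to the last closing `</h3>`. A lazy quantifier (`+?`) stops at the first closing tag instead. Below is a minimal sketch of that idea in JavaScript, assuming the input looks like the sample in the question; `extractH3Contents` and `sampleHtml` are hypothetical names introduced only for illustration. Note that `[^$]` means "any character except a literal `$`", so the pattern would stop short if the content contained a dollar sign; `[\s\S]` is a common alternative.

```javascript
// Sketch only: collects the inner content of each <h3> using the regex
// suggested above. extractH3Contents is a hypothetical helper name.
function extractH3Contents(html) {
  // [^$]+? lazily matches any character except "$", including newlines,
  // so each match ends at the nearest </h3>.
  const pattern = /<h3[^>]+?>([^$]+?)<\/h3>/g;
  const results = [];
  for (const match of html.matchAll(pattern)) {
    // match[1] is the captured group; trim the surrounding newlines.
    results.push(match[1].trim());
  }
  return results;
}

// Sample input copied from the question.
const sampleHtml = [
  '<h3 class="large lheight20 margintop10">',
  '<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">',
  '<span>get the content</span>',
  '</a>',
  '',
  '</h3><h3 class="large lheight20 margintop10">',
  '<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">',
  '<span>get the content</span>',
  '</a>',
  '',
  '</h3>',
].join('\n');

const contents = extractH3Contents(sampleHtml);
console.log(contents.length); // 2
console.log(contents[0]);
```

Regular expressions like this tend to be fragile on real-world HTML (nested tags, attributes containing `>`), so this is best treated as a workaround for small, predictable fragments.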

You could try:

    // Print every <a>...</a> element found in the HTML string.
    function getA(str) {
        // [\s\S]+? lazily matches any character, including newlines.
        var regex = /<a\s+[\s\S]+?<\/a>/g;
        var found;
        while ((found = regex.exec(str)) !== null) {
            document.write(found[0] + '<br>');
        }
    }

    var str = '<h3 class="large lheight20 margintop10">\n' +
        '<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">\n' +
        '<span>get the content</span>\n' +
        '</a>\n' +
        '\n' +
        '</h3><h3 class="large lheight20 margintop10">\n' +
        '<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">\n' +
        '<span>get the content</span>\n' +
        '</a>\n' +
        '\n' +
        '</h3>';

    getA(str);
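The snippet above writes its results to the page with `document.write`, which only works in a browser. If the matches are needed as data instead, a variant that returns them in an array may be more convenient. This is a sketch under the same assumptions about the input; `collectAnchors` and `anchorSample` are hypothetical names, not part of the original answer.

```javascript
// Sketch only: returns every <a>...</a> element as an array of strings.
// collectAnchors is a hypothetical helper name.
function collectAnchors(html) {
  // With the g flag, String.prototype.match returns all full matches,
  // or null when nothing matches.
  const matches = html.match(/<a\s+[\s\S]+?<\/a>/g);
  return matches === null ? [] : matches;
}

// Shortened version of the sample input from the question.
const anchorSample =
  '<h3 class="large lheight20 margintop10">\n' +
  '<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">\n' +
  '<span>get the content</span>\n' +
  '</a>\n' +
  '\n' +
  '</h3><h3 class="large lheight20 margintop10">\n' +
  '<a href="https://google.com" class="marginright5 link linkWithHash detailsLink">\n' +
  '<span>get the content</span>\n' +
  '</a>\n' +
  '\n' +
  '</h3>';

const anchors = collectAnchors(anchorSample);
console.log(anchors.length); // 2
```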

