
Get all P Tags in the UL tags, Javascript Regex

I am racking my brain trying to figure out a regex for this. I have the following invalid HTML:

...some html tags above...

<p>Bullet points:</p>
<ul>
    <li/>
<p>point 1</p>
    <li/>
<p>point 2</p>
</ul>

<p>Other Bullet points:</p>
<ul>
    <li/>
<p>point 3</p>
    <li/>
<p>point 4</p>
</ul>

...some html tags below...

I'm trying to get the content of every <p></p> pair that sits inside <ul></ul> tags and wrap it in valid li tags instead. That is, I want to turn the markup above into this:

...some html tags above...

<p>Bullet points:</p>
<ul>
    <li>point 1</li>
    <li>point 2</li>
</ul>

<p>Other Bullet points:</p>
<ul>
    <li>point 3</li>
    <li>point 4</li>
</ul>

...some html tags below...

You need two regular expression passes for this: the first captures the inner HTML of each UL tag, and the second turns the P tags inside it into LI tags.

First get all UL tags:

var UL_tags = /<ul>([\s\S]*?)<\/ul>/g;
// [\s\S] matches any character, including newlines.
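As a quick check of what the lazy `[\s\S]*?` group captures, here is a minimal sketch that runs the pattern over a shortened copy of the question's markup. The `sample` string and the `innerBlocks` name are illustrative only and do not come from the original answer.

```javascript
// Illustrative input: two short lists, modelled on the question's markup.
const sample = '<p>A:</p>\n<ul>\n<li/>\n<p>point 1</p>\n</ul>\n<p>B:</p>\n<ul>\n<li/>\n<p>point 3</p>\n</ul>';

const UL_tags = /<ul>([\s\S]*?)<\/ul>/g;

// matchAll yields one match per <ul>...</ul> block; m[1] is the captured inner HTML.
const innerBlocks = Array.from(sample.matchAll(UL_tags), (m) => m[1]);

// Each entry is the text between one <ul> and the nearest following </ul>.
console.log(innerBlocks);
```

Because the quantifier is lazy, each match stops at the nearest `</ul>` rather than running on to the last one in the document.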

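Then rewrite the inner HTML of each match and put the UL wrapper back:

var new_html = myHtml.replace(UL_tags, function (match, innerHTML) {
    // Drop the empty <li/> markers, then turn each <p>...</p> into <li>...</li>.
    var items = innerHTML.replace(/<li\s*\/>\s*/g, '').replace(/<p>/g, '<li>').replace(/<\/p>/g, '</li>');
    return '<ul>' + items + '</ul>';
});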

Be aware that this does not work for nested UL tags (a UL inside a UL), because the lazy match ends at the first closing </ul> it finds.
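To illustrate the nesting limitation, the sketch below applies the same pattern to a small nested list. The `nested` string is a made-up example, not markup from the question.

```javascript
const UL_tags = /<ul>([\s\S]*?)<\/ul>/g;

// Illustrative nested input: an inner list inside an outer list item.
const nested = '<ul><li>a<ul><li>b</li></ul></li></ul>';

const firstMatch = UL_tags.exec(nested);

// The lazy group stops at the first </ul>, which belongs to the inner list,
// so the match starts in the outer list and ends inside it.
console.log(firstMatch[0]);
console.log(firstMatch[1]);
```

The captured group therefore contains an unbalanced fragment, which is why a real HTML parser is the safer choice once lists can nest.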

UPDATE: To support attributes on the UL tag, for example <ul class...>, the pattern has to skip whatever sits between "ul" and the closing ">", so the regex needs to be a little more complicated (sorry):

var UL_tags = /<ul[^>]*?>([\s\S]*?)<\/ul>/g;
// [^>] matches any character except the ">" that closes the tag.
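Putting the pieces together, one possible end-to-end sketch looks like this. It assumes the opening tag should be kept verbatim (so attributes survive), which means adding a second capture group around it. The `fixListItems` name and the `class="x"` attribute are hypothetical and only there for illustration.

```javascript
// Hypothetical helper name; not from the original answer.
function fixListItems(html) {
    // Group 1: the whole opening tag (attributes included); group 2: inner HTML.
    const UL_tags = /(<ul[^>]*?>)([\s\S]*?)<\/ul>/g;
    return html.replace(UL_tags, function (match, openTag, innerHTML) {
        const items = innerHTML
            .replace(/<li\s*\/>\s*/g, '')   // drop the empty <li/> markers
            .replace(/<p>/g, '<li>')        // open each item
            .replace(/<\/p>/g, '</li>');    // close each item
        return openTag + items + '</ul>';
    });
}

// Illustrative input based on the question's first list, with an attribute added.
const input = '<p>Bullet points:</p>\n<ul class="x">\n    <li/>\n<p>point 1</p>\n    <li/>\n<p>point 2</p>\n</ul>';
console.log(fixListItems(input));
```

P tags outside any UL, such as the "Bullet points:" heading, are left alone because only the captured inner HTML is rewritten. The nested-list caveat still applies.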

Alternatively, try this in jQuery, selecting only the P tags that sit inside a UL:

$('ul p').each(function (index) {
    var p_str = $(this).text();
    ....
});
