
JavaScript regex to replace <ul> or <ol> with only the text within <li> tags

So let's say I have an ordered or unordered list that is in string form. I need only to return the text that is contained within each list item of the respective parent element:

<ul>
    <li>Example One</li>
    <li>Example Two</li>
</ul>

I have made this work, but it is obviously not very efficient:

var first = string.replace(/.*<li>(.*)<\/li>.*/g, '$1');
var second = first.replace(/(<ul>|<ol>|<\/ol>|<\/ul>)/g, '');

The output is what I expect, but I know there is a regex that can accomplish this in one pass. I am still pretty green with regex, so I am not sure what I am doing wrong. This is what I thought would work:

var newString = string.replace(/(<ul>|<ol>).*<li>(.*)<\/li>.*(<\/ul>|\/<ol>)/g, '$2');

However, this returns the entire HTML structure as a string:

<ul>
    <li>Example One</li>
    <li>Example Two</li>
</ul>
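
A likely reason for the unchanged output, sketched under the assumption that the input string contains line breaks as in the example: without the s (dotAll) flag, . does not match a line break, so a pattern that must run from <ul> to </ul> never matches and replace returns the input as is. Adding the s flag is not a fix either, because the greedy .* then skips ahead to the last <li>. The variable names below are illustrative only.

```javascript
const string = `<ul>
    <li>Example One</li>
    <li>Example Two</li>
</ul>`;

// The pattern from the question, unchanged (including the \/<ol> alternative).
const attempt = /(<ul>|<ol>).*<li>(.*)<\/li>.*(<\/ul>|\/<ol>)/g;

// Without the s flag, . stops at line breaks, so nothing matches
// and replace() returns the original string.
const unchanged = string.replace(attempt, '$2');
console.log(unchanged === string);

// Same pattern with the s (dotAll) flag: a single match spans the whole
// list, and the greedy .* leaves only the last item in group 2.
const dotAll = new RegExp(attempt.source, 'gs');
const onlyLast = string.replace(dotAll, '$2');
console.log(onlyLast);
```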

As always my friends, thank you in advance.

You were pretty close. You could split the regex into alternatives, as in the pattern /((<ul>|<ol>)|<li>(.*)<\/li>|(<\/ul>|<\/ol>))/g . Example below:

const template = `<ul>
    <li>Example One</li>
    <li>Example Two</li>
</ul>`;
// Group 3 is the <li> text; it is empty when a <ul>/<ol> tag matched.
const str = template.replace(/((<ul>|<ol>)|<li>(.*)<\/li>|(<\/ul>|<\/ol>))/g, '$3');
console.log(str);

EDIT explanation:

The whole pattern ((<ul>|<ol>)|<li>(.*)<\/li>|(<\/ul>|<\/ol>)), leaving the flags aside, is the 1st capturing group (it holds whatever any of its subgroups matched).

The 1st capturing group offers 3 alternatives: (<ul>|<ol>) , <li>(.*)<\/li> , and (<\/ul>|<\/ol>) .

Alternative 1 must match exactly <ul> or <ol> ; it is also capturing group 2.

Alternative 2 must match <li>(.*)<\/li> , i.e. <li> ANYTHING </li> ; the .* becomes capturing group 3 because it is enclosed in parentheses () .

Alternative 3 must match exactly </ul> or </ol> ; it is also capturing group 4.
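
As a variation on the same idea (not the pattern from the answer above), the three alternatives can be collapsed into two by matching opening and closing list tags with one non-capturing group, which leaves the <li> text as the only capture. This is a minimal sketch; the helper name listItemText is hypothetical, and the lazy .*? is a change made here so that several <li> elements on one line are handled separately.

```javascript
// Hypothetical helper; the name and the simplified pattern are assumptions.
function listItemText(html) {
  // Alternative 1: an opening or closing <ul>/<ol> tag (non-capturing).
  // Alternative 2: an <li> element whose text is capture group 1.
  const pattern = /<\/?(?:ul|ol)>|<li>(.*?)<\/li>/g;
  // Function replacer: text is undefined when a list tag matched.
  return html.replace(pattern, (match, text) => (text === undefined ? '' : text));
}

const template = `<ul>
    <li>Example One</li>
    <li>Example Two</li>
</ul>`;

// Split into lines and drop the leftover indentation and blank lines.
const lines = listItemText(template)
  .split('\n')
  .map((line) => line.trim())
  .filter((line) => line !== '');
console.log(lines);
```

Like the original pattern, this only handles tags without attributes and items that sit on a single line; for arbitrary HTML a DOM parser is usually the safer choice.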

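So, the run-down:

<ul> matches groups 0 (the whole match), 1 (the parent group), and 2 ( <ul>|<ol> ), aka [ <ul> , <ul> , <ul> ]. Group 3 did not take part in the match, so '$3' inserts an empty string and the tag disappears.

<li>Example One</li> matches groups 0 (the whole match), 1 (the parent group), and 3 (the .* inside <li>(.*)<\/li> ), while group 2 does not match, aka [ <li>Example One</li> , <li>Example One</li> , undefined, Example One ]. Here '$3' inserts Example One.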

The last part (the closing tag) is left as homework :p
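
One way to check the run-down is to list the capture groups for every match with String.prototype.matchAll. This is a small sketch that assumes the multi-line list from the question; groups that did not take part in a match show up as undefined.

```javascript
const template = `<ul>
    <li>Example One</li>
    <li>Example Two</li>
</ul>`;
const pattern = /((<ul>|<ol>)|<li>(.*)<\/li>|(<\/ul>|<\/ol>))/g;

// One row per match: [group 0, group 1, group 2, group 3, group 4].
const rows = [...template.matchAll(pattern)].map((m) => Array.from(m));
for (const row of rows) {
  console.log(row);
}
```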


 