
JavaScript Regexp between two strings without capturing first string

So I've gone through several examples and this seems like it should be very simple, however nothing seems to be working.

I'm pulling an html file from an email and trying to parse it out using REGEXP. The line that I'm working on currently is this:

&lt;br&gt;&lt;br&gt;&lt;b&gt;STATUS:&lt;/b&gt; Cancel&lt;br&gt;&lt;br&gt;&lt;b&gt;

And throughout the whole document there are a bunch of those tags.

I'm using regexr.com to do my testing.

The closest expression I've been able to come up with is:

(?:STATUS:&lt;\/b&gt; )(.*?)(?=&lt)

When I use that though, it returns:

STATUS:&lt;/b&gt; Cancel

I'm just trying to get the "Cancel". I've seen other questions answered using lookbehind, but that's not supported in JavaScript. Is there a workaround for this, or am I approaching this the wrong way?
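For readers on a newer engine: lookbehind assertions were added to the language in ES2018, so an engine that implements them accepts `(?<=...)`, while an older one rejects the same pattern with a SyntaxError. The following is only a sketch, assuming the message body contains literal tags rather than `&lt;` entities; the variable names are made up for illustration.

```javascript
// Sample line in decoded form (assumption: the body holds real "<" characters).
const html = "<br><br><b>STATUS:</b> Cancel<br><br><b>";

// (?<=...) must match just before the result, but is not included in it.
// (?=...) must match just after the result, and is not included either.
const lookbehind = /(?<=<b>STATUS:<\/b> ).*?(?=<br>)/;

const found = html.match(lookbehind);
const statusViaLookbehind = found ? found[0] : null;

console.log(statusViaLookbehind); // "Cancel"
```

With this form the whole match is already the wanted text, so no capture group index is needed. Where lookbehind is unavailable, a capture group does the same job.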

Edit

I'm trying to pull the information via a Google Web App.

What I've learned so far is that whether you see "&lt;" or the actual "<" depends on how your browser displays the info, so to make it easier to read, I changed the characters in the REGEXP to:

(?:<b>STATUS:<\/b>)(.*?)(?=<br>)

The line I'm trying to interpret would be:

<b>STATUS:</b> Cancel<br>

Here's the code I'm using to run the REGEXP:

var re = new RegExp('(?:<b>STATUS:<\/b> )(.*?)(?=<br>)');
var status = messages[i].getBody().match(re)[1];
var child = XmlService.createElement('Status').setText(status);
root.addContent(child);

When I try to run it, I get the same thing:

match[0] = "<b>STATUS:</b> Cancel"
match[1] = "<b>STATUS:</b> Cancel"

Your regex seems to work; just extract match[1]:

let str = "&lt;br&gt;&lt;br&gt;&lt;b&gt;STATUS:&lt;/b&gt; Cancel&lt;br&gt;&lt;br&gt;&lt;b&gt;";
console.log(str.match(/(?:STATUS:&lt;\/b&gt; )(.*?)(?=&lt)/)[1]); // "Cancel"
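To spell out why index 1 is the one to read: `match[0]` is always the entire matched text, and a non-capturing group `(?:...)` still consumes its characters, so the prefix shows up there. The group only avoids getting a number of its own. Here is a small sketch on the decoded line from the question's edit; `extractStatus` is a hypothetical helper, not something from the thread.

```javascript
const line = "<b>STATUS:</b> Cancel<br>";
const m = line.match(/(?:<b>STATUS:<\/b> )(.*?)(?=<br>)/);

console.log(m[0]); // "<b>STATUS:</b> Cancel"  (whole match, prefix included)
console.log(m[1]); // "Cancel"                 (first capturing group only)

// Hypothetical helper: returns the status text, or null when the label is absent.
// The prefix is matched without being captured, so group 1 is the value.
function extractStatus(body) {
  const result = body.match(/<b>STATUS:<\/b> (.*?)<br>/);
  return result ? result[1] : null;
}

console.log(extractStatus(line));             // "Cancel"
console.log(extractStatus("no status here")); // null
```

The null check matters because `match` returns `null` when nothing matches, and indexing into that throws a TypeError.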

Alright, I think I got it. And apologies for any confusion.

I'm not sure if this is a bug somewhere, but the REGEXP works if you wrap the prefix in an extra pair of parentheses inside the non-capturing group. The actual word is then returned in the third spot, match[2]:

/(?:(<b>STATUS:<\/b> ))(.*?)(?=<br>)/

This is working for me:

var re = new RegExp('(?:(<b>STATUS:<\/b> ))(.*?)(?=<br>)');
var status = messages[i].getBody().match(re)[2];
var child = XmlService.createElement('Status').setText(status);
root.addContent(child);

It's an answer, however it doesn't make much sense to me. If someone can explain what's being pulled it would be much appreciated.

match[0] = "<b>STATUS:</b> Cancel"
match[1] = "<b>STATUS:</b> "
match[2] = "Cancel"
match[3] = null
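As for what is being pulled: capturing groups are numbered by the position of their opening parenthesis, and `(?:` does not get a number. In the pattern above, the extra parentheses around the prefix therefore become group 1 and `(.*?)` moves to group 2. The sketch below shows this in a standard JavaScript engine (an assumption; the Apps Script logger may display the result differently).

```javascript
const sample = "<b>STATUS:</b> Cancel<br>";

//              (?: no number  ( group 1        )) ( group 2 ) lookahead, not consumed
const nested = /(?:(<b>STATUS:<\/b> ))(.*?)(?=<br>)/;
const groups = sample.match(nested);

console.log(groups[0]);     // "<b>STATUS:</b> Cancel"  (entire match)
console.log(groups[1]);     // "<b>STATUS:</b> "        (the added inner parentheses)
console.log(groups[2]);     // "Cancel"                 (the lazy capture)
console.log(groups.length); // 3, so there is no group 3 to read
```

Since the pattern has only two capturing groups, index 3 is past the end of the result array.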


 