Find text not within HTML tags with Javascript (regex)

I have a string from a DOM element, which contains something similar to the following:

<span class='greenhornet'>Can you catch the green?</span>

I need to know the position of the word green.

In this case, if I set up a pattern /green/, JS exec() will of course return the first occurrence of green (position 13), which is inside the class attribute.

Is there a way to tell a JS regexp to ignore the word green if it's between < and >, or is there an easier way to do this?

Oh, and I can't just strip the HTML either!

thanks.

As the commenters (and user1883592) have suggested, stripping the HTML or parsing the text out of the HTML is the correct answer here. Using regular expressions with HTML is a loser's game; you've been warned.

But, that being said, if you really want to play that game, I'd start by ensuring there are no opening brackets in between your term and the last closing bracket; in other words:

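var html = "<span class='greenhornet'>Can you catch the green?</span>";
// Lazy [^<]*? so the first "green" after a ">" is captured, even directly after it
var greenRegex = />[^<]*?(green)/;
var match = greenRegex.exec(html);
// match.index is 25 (the ">"), so step forward to where the captured word starts
var position = match.index + match[0].length - match[1].length;
// position = 44, not 13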
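If the regex gets too fiddly, the same idea (skip anything between < and >) can be written as a plain scan over the string. This is only a minimal sketch: findOutsideTags is a made-up helper name, and it assumes that < and > appear only as tag delimiters (no raw brackets in attribute values or text) and that the search word contains neither character.

```javascript
// Hypothetical helper: returns the indexes (in the original string) of every
// occurrence of `word` that lies outside <...> tags.
// Assumption: "<" and ">" only ever open and close tags.
function findOutsideTags(html, word) {
  const positions = [];
  let insideTag = false;
  for (let i = 0; i < html.length; i++) {
    const ch = html[i];
    if (ch === "<") {
      insideTag = true;   // entering a tag: ignore everything until ">"
    } else if (ch === ">") {
      insideTag = false;  // tag closed: text content resumes
    } else if (!insideTag && html.startsWith(word, i)) {
      positions.push(i);  // match starts here, in text content
    }
  }
  return positions;
}

const html = "<span class='greenhornet'>Can you catch the green?</span>";
console.log(findOutsideTags(html, "green")); // logs [ 44 ]
```

The positions refer to the original string, markup included, so nothing has to be stripped first.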

You can get the innerHTML of the span element and search that instead. No regex needed.
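A rough sketch of that DOM route follows. It swaps in textContent for innerHTML, since innerHTML would still contain the markup of any nested elements; that swap is my assumption, not part of the answer above. positionInText is a made-up helper name, and the position it returns is relative to the element's text, not to the original markup string.

```javascript
// Hypothetical helper: `element` is any object exposing a `textContent`
// string, as DOM elements do in a browser.
function positionInText(element, word) {
  // indexOf returns -1 when the word is absent.
  return element.textContent.indexOf(word);
}

// In a browser, `element` could come from document.querySelector("span.greenhornet").
// A plain object stands in here so the sketch also runs outside a browser.
const fakeSpan = { textContent: "Can you catch the green?" };
console.log(positionInText(fakeSpan, "green")); // logs 18
```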
