A little help required please...
I have a regular expression (shown below) that matches characters at the start of a word.
If I have a set of strings like so:
Ray Fox Foster Joe Finding Forrester
REGEX
/\bfo[^\b]*?\b/gi
This will match 'FO' in Fox, Foster, and Forrester, as expected.
However, I am faced with an issue when the strings are wrapped in HTML tags like so:
<span class="fontColor1">Ray Fox</span>
<span class="fontColor2">Foster Joe</span>
<span class="fontColor3">Finding Forrester</span>
This will match 'FO' in fontColor* as well.
I'm fairly green with regular expressions. I need a little help updating the pattern so that it only searches the text between HTML tags where tags exist, but still works correctly when there are none.
What about:
<.*?span.*?>(.*?)<\s?\/.*?span.*?>
And where do you have text where html tags don't exist? That makes no sense.
EDIT:
This solution will not match nested tags, but as the question is written, that doesn't seem to be an issue.
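For what it's worth, here is a rough sketch of how the capture group from that pattern could be fed into the original word search. The helper name matchInsideSpans and the fall-back to the whole string when no span is found are my own assumptions, not something from the question:

```javascript
// Hypothetical helper (not from the question): run wordPattern only on the
// text captured between <span ...> and </span>.
// Assumes wordPattern carries the g flag so match() returns every hit.
function matchInsideSpans(input, wordPattern) {
  var spanPattern = /<.*?span.*?>(.*?)<\s?\/.*?span.*?>/g;
  var results = [];
  var sawSpan = false;
  var m;
  while ((m = spanPattern.exec(input)) !== null) {
    sawSpan = true;
    // m[1] is the text between the opening and closing tag
    results = results.concat(m[1].match(wordPattern) || []);
  }
  // Assumption: with no span at all, treat the input as plain text.
  return sawSpan ? results : (input.match(wordPattern) || []);
}

var html = '<span class="fontColor1">Ray Fox</span>\n' +
           '<span class="fontColor2">Foster Joe</span>\n' +
           '<span class="fontColor3">Finding Forrester</span>';
var inSpans = matchInsideSpans(html, /\bfo[^\b]*?\b/gi);
var inPlain = matchInsideSpans('Ray Fox Foster Joe Finding Forrester', /\bfo[^\b]*?\b/gi);
// both are ["Fox", "Foster", "Forrester"]; fontColor* is never searched
```

The same limitation applies: nested tags, or a span whose text runs over several lines, will not be picked up by this pattern.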
You can use an HTML parser to extract the plain text, and match against that.
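var root;
try {
    // Parse inside a detached document where the browser supports it
    root = document.implementation.createHTMLDocument("").body;
}
catch (e) {
    // Fallback for browsers without createHTMLDocument
    root = document.createElement("body");
}
// Newlines between the spans keep words in adjacent elements from running together
root.innerHTML = '<span class="fontColor1">Ray Fox</span>\n' +
    '<span class="fontColor2">Foster Joe</span>\n' +
    '<span class="fontColor3">Finding Forrester</span>';
// If you are using jQuery
var text = $(root).text();
// Proceed as normal with the text variable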
If you are not using jQuery, you can replace $(root).text() with findText(root), where findText is:
function findText(root) {
    var ret = "",
        nodes = root.childNodes;
    for (var i = 0; i < nodes.length; ++i) {
        if (nodes[i].nodeType === 3) {
            // Text node: append its text
            ret += nodes[i].nodeValue;
        } else if (nodes[i].nodeType === 1) {
            // Element node: recurse into its children
            ret += findText(nodes[i]);
        }
    }
    return ret;
}
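One thing to watch with the extracted text: if there is no whitespace between the tags, words from adjacent elements end up glued together (FoxFoster), and \b no longer sees a word start there. Below is a rough sketch that matches each text node separately instead. findMatchesInText is a hypothetical name of mine, and the pattern is assumed to carry the g flag:

```javascript
// Hypothetical helper, not part of the answer above: collects matches per
// text node, so text from neighbouring elements is never concatenated.
// Assumes pattern has the g flag so match() returns every hit.
function findMatchesInText(root, pattern) {
  var matches = [];
  var nodes = root.childNodes;
  for (var i = 0; i < nodes.length; ++i) {
    var node = nodes[i];
    if (node.nodeType === 3) {
      // Text node: search its own text only
      matches = matches.concat(node.nodeValue.match(pattern) || []);
    } else if (node.nodeType === 1) {
      // Element node: recurse into its children
      matches = matches.concat(findMatchesInText(node, pattern));
    }
  }
  return matches;
}

// In a browser, with root built as above:
// var hits = findMatchesInText(root, /\bfo[^\b]*?\b/gi);
```

Attribute values such as class="fontColor1" are never text nodes, so they are skipped without any extra work in the regular expression.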