
Replacing HTML String & Avoiding Tags (regex)

I'm trying to use JS to replace a specific string inside a string that contains HTML tags (with attributes and styles), without matching anything inside the tags themselves, so that the original tags stay intact in the text.

For example, when the keyword is "pan", I want <span> this is span text </span> to become: <span> this is s<span class="found">pan</span> text </span>

I tried doing this with a regex. My regex so far:

$(this).html($(this).html().replace(new RegExp("([^<\"][a-zA-Z0-9\"'\=;:]*)(" + search + ")([a-zA-Z0-9\"'\=;:]*[^>\"])", 'ig'), "$1<span class='found'>$2</span>$3"));

This regex only fails in cases like <span class="myclass"> span text </span> when search="p". The result is:

<s<span class="found">p</span>an class="myclass"> s<span class="found">p</span>an text</s<span class="found">p</span>an>

*This topic should help anyone who wants to find and replace a matched string while leaving strings surrounded by specific characters untouched.

Do not use regexes with HTML; traverse and manipulate the DOM instead:

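// Wrap the markup in a container so the whole tree can be traversed.
var doc = $('<div><span class="myclass"> span text </span></div>');
// addBack() replaces andSelf(), which was removed in jQuery 3.0.
doc.find("*").addBack().contents().each(function() {
    // nodeType 3 is a text node; element nodes are left untouched.
    if (this.nodeType === 3)
        $(this).replaceWith($(this).text().replace(/p/g, "<b>p</b>"));
});
console.log(doc.html());
// <span class="myclass"> s<b>p</b>an text </span>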
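
One caveat with the loop above: .text() returns the decoded text of the node, so a text node such as "a &lt; b" would be re-inserted with a raw "<". A minimal sketch of a string helper that escapes each piece before wrapping the matches follows; escapeHtml and markText are hypothetical names, not jQuery functions, and the matching is case-sensitive by assumption.

```javascript
// Hypothetical helper: escape the characters that are significant in HTML.
function escapeHtml(text) {
    return text
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

// Hypothetical helper: build HTML for the content of ONE text node.
// The text is split on the keyword, every piece is escaped, and the
// pieces are joined back with the wrapped (and escaped) keyword.
function markText(text, keyword) {
    if (keyword === "") return escapeHtml(text); // nothing to highlight
    return text
        .split(keyword)
        .map(escapeHtml)
        .join('<span class="found">' + escapeHtml(keyword) + "</span>");
}

console.log(markText(" span text ", "p"));
console.log(markText("a < b", "b"));
```

Inside the each() callback, the result of markText(this.nodeValue, "p") could then be passed to replaceWith() instead of the raw replace() call.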

If you insist on using regexes, it goes like this:

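var text = '<span class="myclass"> <p>span</p> </span>';
var found = 'p';
// Match `found` only when the next angle bracket after it is "<" (or the
// string ends), i.e. when the match is not inside a tag.
var re = new RegExp(found + '(?=[^<>]*(<|$))', 'g');
text = text.replace(re, "<b>$&</b>");
console.log(text);
// <span class="myclass"> <p>s<b>p</b>an</p> </span>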
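
The snippet above concatenates the search term straight into the pattern, so a term such as "+" or "(" would change or break the regex. A sketch of the same lookahead idea with the term escaped first is shown below; escapeRegExp and highlightOutsideTags are hypothetical names, and the case-insensitive flag and the "found" class are taken from the question.

```javascript
// Hypothetical helper: escape characters that are special in a RegExp.
function escapeRegExp(s) {
    return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Wrap every occurrence of `search` that is not inside a tag.
// The lookahead requires that the next angle bracket after the match
// is "<" (or that the string ends), which is false inside a tag.
function highlightOutsideTags(html, search) {
    if (search === "") return html; // an empty pattern would match everywhere
    var re = new RegExp(escapeRegExp(search) + "(?=[^<>]*(?:<|$))", "gi");
    return html.replace(re, '<span class="found">$&</span>');
}

console.log(highlightOutsideTags("<span> this is span text </span>", "pan"));
console.log(highlightOutsideTags("<b>1+1</b>", "+"));
```

This is still a heuristic: it assumes that "<" and ">" only ever appear as tag delimiters, so a raw ">" inside an attribute value or in the text would confuse it.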

As thg435 says, the right way to deal with HTML content is to use the DOM.

But if you want to skip something in a replace, you can match what you want to avoid first and replace it with itself.

Example that skips HTML tags:

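var text = '<span class="myclass"> span text </span>';

// match is the whole match; group is the captured keyword, which is
// undefined when a tag was matched instead (the check also accepts '').
function callback(match, group) {
    return (group === undefined || group === '') ? match : '<span class="found">' + match + '</span>';
}

// Tags are consumed by the first alternative and returned unchanged.
var result = text.replace(/<[^>]+>|(p)/g, callback);

alert(result);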
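
The example hard-codes the keyword "p" in the pattern. A sketch of the same "match what you want to skip first" technique for an arbitrary keyword could look like this; escapeRegExp and highlightSkippingTags are hypothetical names, and case-insensitive matching is an assumption carried over from the question's 'ig' flags.

```javascript
// Hypothetical helper: escape characters that are special in a RegExp.
function escapeRegExp(s) {
    return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function highlightSkippingTags(html, search) {
    if (search === "") return html; // an empty group would match everywhere
    // Alternative 1 consumes a whole tag; alternative 2 captures the keyword.
    var re = new RegExp("<[^>]+>|(" + escapeRegExp(search) + ")", "gi");
    return html.replace(re, function (match, keyword) {
        // keyword is only set when the second alternative matched;
        // a matched tag is returned as-is, i.e. replaced by itself.
        return keyword ? '<span class="found">' + keyword + "</span>" : match;
    });
}

console.log(highlightSkippingTags('<span class="myclass"> span text </span>', "p"));
```

Because the tag alternative comes first and the regex scans left to right, any keyword occurrence inside a tag is swallowed together with that tag before the second alternative gets a chance to match it.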

