Regex replace text outside html tags
I have this HTML:
"This is simple html text <span class='simple'>simple simple text text</span> text"
I need to match only words that are outside any HTML tag. I mean, if I want to match “simple” and “text”, I should get results only from “This is simple html text” and the last part “text”: the result will be 1 match for “simple” and 2 matches for “text”. Could anyone help me with this? I'm using jQuery.
var pattern = new RegExp("(\\b" + value + "\\b)", 'gi');
if (pattern.test(text)) {
text = text.replace(pattern, "<span class='notranslate'>$1</span>");
}
value is the word I want to match (in this case “simple”).
text is "This is simple html text <span class='simple'>simple simple text text</span> text".
I need to wrap all selected words (in this example it is “simple”) with <span>. But I want to wrap only words that are outside any HTML tags. The result of this example should be
This is <span class='notranslate'>simple</span> html <span class='notranslate'>text</span> <span class='simple'>simple simple text text</span> <span class='notranslate'>text</span>
I do not want to replace any text inside
<span class='simple'>simple simple text text</span>
It should be the same as before replacement.
Okay, try using this regex:
(text|simple)(?![^<]*>|[^<>]*</)
Example worked on regex101.
Breakdown:
( # Open capture group
text # Match 'text'
| # Or
simple # Match 'simple'
) # End capture group
(?! # Negative lookahead start (will cause match to fail if contents match)
[^<]* # Any number of non-'<' characters
> # A > character
| # Or
[^<>]* # Any number of non-'<' and non-'>' characters
</ # The characters < and /
) # End negative lookahead.
The negative lookahead will prevent a match if text or simple is inside a tag or between html tags.
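As a rough sketch, the lookahead could be combined with the replacement code from the question like this. The function name wrapOutsideTags is hypothetical, the \b word boundaries are carried over from the question's pattern rather than from the answer's regex, and the words are assumed to contain no regex metacharacters.

```javascript
// Hypothetical helper: wraps each listed word in a notranslate span,
// but only where the lookahead says the word is not inside a tag and
// not directly followed by a closing tag.
function wrapOutsideTags(text, words) {
  // Assumption: words are plain strings without regex metacharacters.
  var pattern = new RegExp(
    '\\b(' + words.join('|') + ')\\b(?![^<]*>|[^<>]*</)',
    'gi'
  );
  return text.replace(pattern, "<span class='notranslate'>$1</span>");
}

var text = "This is simple html text <span class='simple'>simple simple text text</span> text";
var result = wrapOutsideTags(text, ['simple', 'text']);
console.log(result);
```

For the sample string this produces the output requested in the question. The pattern only looks at the next tag, so a word that sits inside an element but is followed by another opening tag (for example "text" in <div>text <b>x</b></div>) would still be wrapped.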
^([^<]*)<\w+.*/\w+>([^<]*)$
However, this is a very naive expression. It would be better to use a DOM parser.
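One way the DOM-based approach might look, as a minimal sketch and not a definitive implementation: parse the HTML into a container element, then rewrite only the text nodes that are direct children of that container. The function names (escapeRegExp, buildWordPattern, wrapTopLevelWords) are made up for illustration, and the words are assumed to be non-empty strings.

```javascript
// Escape regex metacharacters so each word is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build one case-insensitive, whole-word pattern for all words.
// Assumption: every entry in `words` is a non-empty string.
function buildWordPattern(words) {
  return new RegExp('\\b(' + words.map(escapeRegExp).join('|') + ')\\b', 'gi');
}

// Wrap matches found in text nodes that are direct children of `container`.
// Text inside child elements (such as the existing <span>) is left untouched.
function wrapTopLevelWords(container, words) {
  var doc = container.ownerDocument;
  var pattern = buildWordPattern(words);
  // Copy childNodes first, because the loop replaces nodes as it goes.
  Array.prototype.slice.call(container.childNodes).forEach(function (node) {
    if (node.nodeType !== 3) return; // 3 is the nodeType of a text node
    var value = node.nodeValue;
    var fragment = doc.createDocumentFragment();
    var last = 0;
    var match;
    pattern.lastIndex = 0;
    while ((match = pattern.exec(value)) !== null) {
      // Keep the text before the match, then add the wrapped word.
      fragment.appendChild(doc.createTextNode(value.slice(last, match.index)));
      var span = doc.createElement('span');
      span.className = 'notranslate';
      span.textContent = match[0];
      fragment.appendChild(span);
      last = pattern.lastIndex;
    }
    fragment.appendChild(doc.createTextNode(value.slice(last)));
    container.replaceChild(fragment, node);
  });
}

// Browser usage (needs a DOM, so it is shown as a comment only):
//   var div = document.createElement('div');
//   div.innerHTML = text;
//   wrapTopLevelWords(div, ['simple', 'text']);
//   text = div.innerHTML;
```

Because the browser does the parsing, attribute values and nested markup never reach the regex. Note that reading innerHTML back serializes the markup again, so attribute quoting may differ from the input string.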