Regular expression to replace matched keywords outside HTML tags and anchor (a) tag text
I am developing an ASP.NET application and want to add a keyword linking system: each occurrence of a keyword should become a hyperlink to another page. However, a keyword must not be linked if it is already linked (to any page). For example:
it is a <a href="http://www.somesite.com">linked keyword</a> and it should be a linked keyword.
should convert to:
it is a <a href="http://www.somesite.com">linked keyword</a> and it should be a linked <a href="http://newlycreatedLink.com">keyword</a>.
As you can see, the first keyword should be left intact.
Could you please help me solve this problem?
I found a related answer in the ASP.NET forums, but I need to tune it to exclude keywords that are already linked. I've searched everywhere but found nothing.
To check whether the keyword is "outside", use a lookahead:
(?= opens the lookahead; it succeeds only if the keyword is followed by an opening <tag or by the end of the input ($).
[^<>]* matches any number of characters that are neither > nor <.
(?:<\w|$) matches either a < followed by a word character, or the end of the input. \w is shorthand for word characters, roughly [a-zA-Z_0-9] (in .NET it also matches Unicode letters and digits by default).
So the pattern could look like:
String pattern = @"(?i)\bkeyword\b(?=[^<>]*(?:<\w|$))";
String replacement = @"<a href=""http://newlycreatedLink.com"">$0</a>"; // "" escapes a quote in a verbatim string; $0 inserts the whole match
The keyword is wrapped in word boundaries (\b), and the inline (?i) modifier makes the match case-insensitive.
So this replaces only a keyword that is followed by an opening tag or by the end of the input.
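As a minimal sketch of how the pattern above could be applied, the following wraps it in a helper. The class name KeywordLinker, the method Link, and its parameters are hypothetical and not part of the original answer. It uses a MatchEvaluator rather than a replacement string, so any $ in the URL is not interpreted as a substitution.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class KeywordLinker
{
    // Links every occurrence of `keyword` that is followed (after any
    // non-bracket characters) by an opening tag or by the end of the input.
    public static string Link(string html, string keyword, string url)
    {
        // Regex.Escape keeps regex metacharacters in the keyword literal.
        string pattern = @"(?i)\b" + Regex.Escape(keyword) + @"\b(?=[^<>]*(?:<\w|$))";

        // m.Value is the matched text, so the original casing is preserved.
        return Regex.Replace(
            html,
            pattern,
            m => "<a href=\"" + url + "\">" + m.Value + "</a>");
    }
}
```

Calling KeywordLinker.Link on the example sentence from the question with the keyword "keyword" and the URL http://newlycreatedLink.com should leave the first, already linked keyword untouched (it is followed by </a, which neither <\w nor $ matches) and wrap only the last one.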
UPDATE: To also replace a keyword "inside" tags that do not end with </a, add |<\/[^a] to the alternation:
String pattern = @"(?i)\bkeyword\b(?=[^<>]*(?:<\w|<\/[^a]|$))";
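To illustrate what the extra alternative changes, here is a small sketch that applies the updated pattern. The class name UpdatedKeywordLinker and the method Link are hypothetical; the pattern and the target URL are the ones from this thread.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class UpdatedKeywordLinker
{
    // Updated lookahead: an opening tag, a closing tag whose name does not
    // start with "a", or the end of the input may follow the keyword.
    private const string Pattern = @"(?i)\bkeyword\b(?=[^<>]*(?:<\w|<\/[^a]|$))";

    public static string Link(string html)
    {
        // "" is a literal quote in a verbatim string; $0 inserts the whole match.
        return Regex.Replace(html, Pattern, @"<a href=""http://newlycreatedLink.com"">$0</a>");
    }
}
```

With the input a <b>keyword</b> and <a href="#">keyword</a>, the keyword inside <b> should be linked while the one inside <a> stays as it is. Note that [^a] looks only at the first letter of the closing tag's name, so a keyword directly before a closing tag such as </abbr> would be skipped as well.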
Don't use regular expressions for sophisticated HTML parsing like this. Use a proper HTML parser instead.