Highlight words in HTML using regex in C#
I found this article on Stack Overflow:
highlight words in html using regex & javascript - almost there
Using the article above, I am trying to highlight HTML text on the server using C#. The code is shown below:
string replacePattern = "$1<span style=\"background-color:yellow\">$2</span>";
string searchPattern = String.Format("(?<=^|>)(.*?)({0})(?=.*?<|$)", searchString.Trim());
content = Regex.Replace(content, searchPattern, replacePattern, RegexOptions.IgnoreCase);
The code seems to work well, except when the word to highlight also appears in an image source:
Search Keyword:
ABC
Search Text:
<div><img src="/site/folder/ABC.PNG" /><br />ABC</div>
The result highlights both the text and the image name, but only the text should be highlighted.
Any help would be greatly appreciated.
I'll offer a solution, but I agree that relying solely on regex to parse HTML may eventually not be worth the effort. That said, you know more about your problem space than the rest of us, so if the HTML you're highlighting is under your control, you may be able to test enough of your domain to achieve what you want with regexes.
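My solution changes the regex you've supplied to take this approach: match the search term only when it sits between a > and the next <, with no other angle brackets in between, so that it lies in a text node rather than inside a tag or attribute.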
Caveats: text inside elements such as these is matched and highlighted as well:
<textarea>...</textarea>
<script>...</script>
Note you could expand the capture on the left-hand side to capture the tag name and conditionally not replace for a set of tags like textarea and script.
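using System;
using System.Text.RegularExpressions;

string searchString = "ABC";
string content = "<div><img src='/site/folder/ABC.PNG' /><br />ABC</div>";
// $1 = text before the term, $2 = the matched term, $3 = text up to the next tag
string replacePattern = "$1<span style=\"background-color:yellow\">$2</span>$3";
// Regex.Escape keeps metacharacters in the search term from being read as regex syntax
string searchPattern = String.Format("(>[^<>]*?)({0})([^<>]*?<)", Regex.Escape(searchString.Trim()));
content = Regex.Replace(content, searchPattern, replacePattern, RegexOptions.IgnoreCase);
Console.WriteLine(content);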
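The note above about capturing the tag name could be sketched roughly as follows. This is only an illustration, not the answer's original code: the class name Highlighter, the method HighlightTerm, and the SkippedTags list are hypothetical names, and Regex.Escape and the trailing lookahead are additions. It assumes well-formed markup with no stray < or > inside text or script bodies, and like the pattern above it highlights at most the first occurrence in each text node.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class Highlighter
{
    // Hypothetical skip list: text directly inside these elements is left untouched.
    private static readonly HashSet<string> SkippedTags =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "textarea", "script" };

    public static string HighlightTerm(string content, string searchString)
    {
        // pre  = the tag immediately before the text node, plus any text before the term
        // tag  = that tag's name ("/div" for a closing tag)
        // term = the escaped search term
        // The lookahead requires a following '<' without consuming it,
        // so the next tag is still available to start the next match.
        string pattern =
            @"(?<pre><(?<tag>/?\w+)[^<>]*>[^<>]*?)(?<term>" +
            Regex.Escape(searchString.Trim()) +
            @")(?=[^<>]*<)";

        return Regex.Replace(content, pattern, m =>
            SkippedTags.Contains(m.Groups["tag"].Value)
                ? m.Value // leave textarea/script content as it was
                : m.Groups["pre"].Value
                  + "<span style=\"background-color:yellow\">"
                  + m.Groups["term"].Value
                  + "</span>",
            RegexOptions.IgnoreCase);
    }
}
```

Calling Highlighter.HighlightTerm on the sample content from the question with the keyword ABC should wrap only the ABC after the br tag and leave the image source alone, while a string such as <textarea>ABC</textarea> should come back unchanged.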