How to make sure searched text in a C# WebBrowser control is actual text and not an element or attribute?

I am going to leave this here in case anyone can still answer it, but I am taking a different route for my search.

I know there are several similar questions on here, but none of them get me where I need to go.

I have the search part basically finished, and it works beautifully: it finds all occurrences of the searched word or phrase, ignoring case. The problem is that if you search for "div" or "table", or any other word that is an HTML element name or attribute value, the search tries to highlight that too and totally screws up the page.

So I really just need a simple way to make sure the search ignores those occurrences. Here is what I have. I assume I need a really good regex, but I can't write a regex to save my life, so help would be appreciated.

private void PerformSearch()
{
  string searchString = SearchTextBox.Text;
  HtmlDocument doc = ManualViewBrowser.Document;

  // Escape the input so it is matched literally rather than as a regex pattern.
  // Note: this still matches inside tag names and attribute values.
  string pattern = Regex.Escape(searchString);
  doc.Body.InnerHtml = Regex.Replace(doc.Body.InnerHtml, pattern, new MatchEvaluator(Highlight), RegexOptions.IgnoreCase);
}

private string Highlight(Match m)
{
  return "<em class=\"highlight\">" + m.Value + "</em>";
}

Just remove all HTML tags from that HTML string with this method:

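private string RemoveHtmlTags(string html) {
  // Non-greedy match of each tag; Singleline lets a tag span several lines.
  return Regex.Replace(html, "<.*?>", String.Empty, RegexOptions.Singleline);
}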

Edit:

You are right. Instead of searching inside the HTML, loop through all the nodes of the page and search for the word inside each of them.
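As a rough illustration of the "search only inside the text, not the markup" idea, here is a minimal sketch that works on the HTML string rather than walking DOM nodes. This is a swapped-in technique, not the node loop described above. The class name `HtmlHighlighter` and the method `HighlightText` are hypothetical names made up for this example. It assumes reasonably well-formed markup with no `>` inside attribute values, and it treats the contents of `<script>` and `<style>` elements as ordinary text, so it may need adjusting for real pages.

```csharp
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper: highlights a search term only in the text between tags.
public static class HtmlHighlighter
{
    // Matches a whole comment or tag; everything between matches is treated as text.
    private static readonly Regex TagPattern =
        new Regex(@"<!--.*?-->|<[^>]*>", RegexOptions.Singleline);

    public static string HighlightText(string html, string searchString)
    {
        if (string.IsNullOrEmpty(html) || string.IsNullOrEmpty(searchString))
            return html;

        // Escape the input so it is matched literally, ignoring case.
        var term = new Regex(Regex.Escape(searchString), RegexOptions.IgnoreCase);
        var result = new StringBuilder(html.Length + 64);
        int position = 0;

        foreach (Match tag in TagPattern.Matches(html))
        {
            // Highlight the text run that precedes this tag, then copy the tag unchanged.
            result.Append(HighlightRun(term, html.Substring(position, tag.Index - position)));
            result.Append(tag.Value);
            position = tag.Index + tag.Length;
        }

        // Highlight whatever text follows the last tag.
        result.Append(HighlightRun(term, html.Substring(position)));
        return result.ToString();
    }

    private static string HighlightRun(Regex term, string text)
    {
        return term.Replace(text, m => "<em class=\"highlight\">" + m.Value + "</em>");
    }
}
```

With the names from the question, the call in PerformSearch would then look something like `doc.Body.InnerHtml = HtmlHighlighter.HighlightText(doc.Body.InnerHtml, SearchTextBox.Text);`. Because tags are copied through untouched, a search for "div" or "table" should leave element names and attribute values alone.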
