(Web) Scraping content from certain divs using regex in C#

Question

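using System.Net;
using System.Text.RegularExpressions;

WebClient web = new WebClient();
String website = web.DownloadString("https://www.google.com");

String search = @"";
// C# identifiers are case-sensitive: pass the variables declared above.
MatchCollection matches = Regex.Matches(website, search);

foreach (Match m in matches){}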

This is what I use to scrape a website (I don't know if this is the best way; if not, I'm interested in learning other ways).

My problem is the regex search string. I can, for example, find every word that follows title=. But I only want to extract it when it's inside a certain div, and I don't know if I can do it this way.

Thanks

Answer 1

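Yes, as Wiktor mentioned, try HtmlAgilityPack for HTML and static pages, or use some browser automation (Selenium with Chrome, or headless PhantomJS) in case the site relies on a lot of JavaScript and much of its content is generated dynamically.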
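
For the static-page case, here is a minimal sketch of reading title attributes only from inside one particular div with HtmlAgilityPack. The markup, the div id "target", and the helper name ExtractTitles are made up for illustration. It also assumes the div can be identified by its id and that the id contains no quote characters.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class TitleScraper
{
    // Hypothetical helper: returns the title attribute of every element
    // nested inside the div whose id equals divId.
    public static List<string> ExtractTitles(string html, string divId)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // XPath: any descendant of the matching div that carries a title attribute.
        // Assumes divId contains no quote characters.
        var nodes = doc.DocumentNode.SelectNodes($"//div[@id='{divId}']//*[@title]");

        var titles = new List<string>();
        // SelectNodes returns null, not an empty collection, when nothing matches.
        if (nodes != null)
        {
            foreach (HtmlNode node in nodes)
            {
                titles.Add(node.GetAttributeValue("title", ""));
            }
        }
        return titles;
    }

    public static void Main()
    {
        // Made-up markup standing in for the string returned by DownloadString.
        string html =
            "<div id=\"target\"><a title=\"first\">a</a><a title=\"second\">b</a></div>" +
            "<div id=\"other\"><a title=\"ignored\">c</a></div>";

        foreach (string title in ExtractTitles(html, "target"))
        {
            Console.WriteLine(title); // prints "first", then "second"
        }
    }
}
```

The string passed to ExtractTitles could just as well be the result of web.DownloadString from the question. If the div is identified by a class rather than an id, the XPath predicate would need to change accordingly.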

Solution 1
1 2017-06-30 21:13:42
