
How to get links from Google HTML search results in C#?

I have this code, which gives me Google's search results as an HTML string:

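 // Requires: using System; using System.Net;
 WebClient webClient = new WebClient();
 // Escape the query so spaces and characters such as & or # do not break the URL.
 string htmlString = webClient.DownloadString("http://www.google.com/search?q=" + Uri.EscapeDataString(searchQuery));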

Any idea how to extract only the links from it? I suppose I could do a string search, but that doesn't seem very elegant...

I found this code:

HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
htmlDoc.LoadHtml(htmlString);
var selectNodes = htmlDoc.DocumentNode.SelectNodes("//li[@class='g']");
foreach (var node in selectNodes)
{
     //node.InnerText will give you the text content of the li tags ...
}

But I get an exception, because var selectNodes = htmlDoc.DocumentNode.SelectNodes("//li[@class='g']"); is null...
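The null result can be handled directly: HtmlAgilityPack's SelectNodes returns null, not an empty collection, when the XPath matches nothing. A likely cause here (an assumption, since the downloaded HTML is not shown) is that the page Google returned contains no li elements with class g. The sketch below sidesteps the class name entirely and collects the href of every anchor instead; the LinkExtractor and ExtractLinks names are hypothetical and not part of the question.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // the NuGet package already used in the question

public static class LinkExtractor
{
    // Hypothetical helper: returns the href value of every <a> element that has one.
    public static List<string> ExtractLinks(string html)
    {
        var links = new List<string>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when nothing matches, so check before iterating.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
            return links;

        foreach (HtmlNode anchor in anchors)
        {
            // Decode HTML entities such as &amp; that may appear in the attribute value.
            string href = HtmlEntity.DeEntitize(anchor.GetAttributeValue("href", ""));
            if (href.Length > 0)
                links.Add(href);
        }
        return links;
    }
}
```

Calling LinkExtractor.ExtractLinks(htmlString) with the string from the question would return every link on the page, including Google's own navigation links, so some filtering is still needed afterwards.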

HtmlDocument doc = new HtmlDocument();
doc.Load("file.htm");
// SelectNodes returns null when nothing matches, so guard before iterating.
HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//*[@background or @lowsrc or @src or @href]");
if (links != null)
{
    foreach (HtmlNode link in links)
    {
        // Prefix every URL-bearing attribute with _newPath (defined elsewhere).
        if (link.Attributes["background"] != null)
            link.Attributes["background"].Value = _newPath + link.Attributes["background"].Value;
        if (link.Attributes["href"] != null)
            link.Attributes["href"].Value = _newPath + link.Attributes["href"].Value;
        if (link.Attributes["lowsrc"] != null)
            link.Attributes["lowsrc"].Value = _newPath + link.Attributes["lowsrc"].Value;
        if (link.Attributes["src"] != null)
            link.Attributes["src"].Value = _newPath + link.Attributes["src"].Value;
    }
}
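Prefixing a path by string concatenation works only when every attribute value is relative. If the aim is to turn the extracted values into absolute URLs (an assumption about what _newPath is for), System.Uri can do the resolution and leaves already-absolute links alone. This is a minimal sketch; LinkResolver and ToAbsolute are hypothetical names.

```csharp
using System;

public static class LinkResolver
{
    // Hypothetical helper: resolves href against baseUrl.
    // Returns null if baseUrl is not an absolute URI or href cannot be resolved.
    public static string ToAbsolute(string baseUrl, string href)
    {
        if (!Uri.TryCreate(baseUrl, UriKind.Absolute, out Uri baseUri))
            return null;

        // An absolute href is returned as-is; a relative one is combined with baseUri.
        return Uri.TryCreate(baseUri, href, out Uri result) ? result.AbsoluteUri : null;
    }
}
```

For example, resolving "/url?q=x" against "http://www.google.com/search?q=test" gives "http://www.google.com/url?q=x".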

