
How to retrieve the best guess for an image from Google's HTML with jsoup

We are trying to retrieve the best guess for an image from the HTML of the search results page returned by Google. We know that the best guess is an 'a' element with the class qb-b, so we tried selecting it with jsoup's .select method. However, when we printed the document retrieved with jsoup's get method, it did not contain the string "best guess" anywhere.

The code we wrote is below. How can we fix it?

String newUrl = connect1.getHeaderField("Location");

Document doc = Jsoup.connect(newUrl.toString()).get();            
Elements bestguess = doc.select("a.qb-b");

System.out.println(bestguess.toString());

You have to set the User-Agent header. Without it, Google redirects you to its main page instead of returning the results page. Try:

String newUrl = connect1.getHeaderField("Location");

// Send a browser-like User-Agent so Google returns the results page.
Document doc = Jsoup.connect(newUrl)
        .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.76 Safari/537.36")
        .get();
Elements bestguess = doc.select("a.qb-b");

System.out.println(bestguess);
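
The question does not show how connect1 is created. If it is a plain java.net.HttpURLConnection (an assumption on my part), the same header may also be needed on that first request. Below is a minimal sketch using only the standard library; the class name UserAgentRequest, the helper openWithUserAgent, and the example.com URL are hypothetical placeholders, not part of the original code. Nothing is sent over the network until the connection is actually used.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

class UserAgentRequest {
    // Browser-like User-Agent string, the same one used in the jsoup call above.
    static final String BROWSER_UA =
            "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
            + "(KHTML, like Gecko) Chrome/32.0.1700.76 Safari/537.36";

    // Hypothetical helper: prepares (but does not send) a request that carries
    // the User-Agent header and leaves redirects unfollowed, so the caller can
    // read the "Location" response header itself.
    static HttpURLConnection openWithUserAgent(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestProperty("User-Agent", BROWSER_UA);
            conn.setInstanceFollowRedirects(false);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder URL; substitute the real search URL.
        HttpURLConnection conn = openWithUserAgent("https://www.example.com/search");
        // Show the configured settings without connecting.
        System.out.println(conn.getRequestProperty("User-Agent"));
        System.out.println(conn.getInstanceFollowRedirects());
    }
}
```

With a connection prepared this way, connect1.getHeaderField("Location") would trigger the request and return the redirect target, which can then be passed to Jsoup.connect as shown above.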

