
How to get a snippet for a URL?

I have a URL. How can I get a description of the website (like the snippets returned by Google) in Java? Is this possible with the Google API or the Bing API?

HttpClient gives me metadata, but I can't get the description of the website from it.

Usually that information is stored in a special meta tag in the <head>.

<meta name="description" content="...Here goes the description you're after...">

So what you want to do is download the content at your URL and parse it, looking for that meta tag. (So there is no need to use an API.)

An example of how to download and parse the page can be found in "Parse Web Site HTML with JAVA".
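As a rough illustration of the meta-tag approach, here is a minimal sketch using only the JDK (Java 11+ for `java.net.http`). The class name `MetaDescription` and the helpers `extractDescription` and `fetch` are made up for this example. It uses regular expressions for brevity, so it assumes the tag contains no `>` inside an attribute value and it does not decode HTML entities; a real HTML parser would be more robust.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for this sketch; not part of any library.
class MetaDescription {

    // Matches any <meta ...> tag. Attributes are inspected separately,
    // so the order of name= and content= inside the tag does not matter.
    private static final Pattern META_TAG =
            Pattern.compile("<meta\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    private static final Pattern NAME_ATTR =
            Pattern.compile("\\bname\\s*=\\s*([\"'])description\\1",
                    Pattern.CASE_INSENSITIVE);
    private static final Pattern CONTENT_ATTR =
            Pattern.compile("\\bcontent\\s*=\\s*([\"'])(.*?)\\1",
                    Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the content of the first <meta name="description"> tag, if any.
    static Optional<String> extractDescription(String html) {
        Matcher tags = META_TAG.matcher(html);
        while (tags.find()) {
            String tag = tags.group();
            if (NAME_ATTR.matcher(tag).find()) {
                Matcher content = CONTENT_ATTR.matcher(tag);
                if (content.find()) {
                    return Optional.of(content.group(2).trim());
                }
            }
        }
        return Optional.empty();
    }

    // Downloads the page body as a string, following ordinary redirects.
    static String fetch(String url) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        // With a URL argument, fetch that page; otherwise use an inline sample.
        String html = args.length > 0
                ? fetch(args[0])
                : "<html><head><meta name=\"description\" content=\"A sample page\"></head></html>";
        System.out.println(extractDescription(html).orElse("(no description found)"));
    }
}
```

Many pages omit the description tag, which is why the sketch returns an `Optional` instead of assuming a value is present.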

But if you prefer to use the Bing API, it also returns the description in the XML or JSON payload, according to http://www.bing.com/developers/s/apibasics.html.

Or use the Google API, via the Custom Search API with the c2coff property set to 0. For more information on the API, see https://developers.google.com/custom-search/docs/xml_results.
