
Replacing Google Site Search with AWS CloudSearch

I'm working on a site with fairly specific global site search functionality built on Google Site Search (GSS), which, as many of you already know, is going away in April. I need to crawl the site and send XML to CloudSearch, but I'm not sure how to go about it, and after a couple of days of searching I haven't found much material on building a global site search with AWS CloudSearch. So far I'm planning to crawl the site with Apache Nutch, but I would really appreciate some input.

Did you come across our blog posts "Index the web with AWS CloudSearch" and "Index the web with StormCrawler (revisited)"? In them I describe how to use Nutch and StormCrawler to index into AWS CloudSearch.
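If you end up pushing documents yourself rather than through the Nutch or StormCrawler indexer, the "send XML to CloudSearch" step comes down to building a document batch. Here is a minimal sketch in Python, assuming each crawled page is a dict with `url`, `title` and `content` keys and that your domain has index fields with the same names (both are assumptions for illustration, not something from the question). Hashing the URL for the document ID is just one way to get a stable ID made only of letters and digits.

```python
import hashlib
import xml.etree.ElementTree as ET


def doc_id(url: str) -> str:
    """Derive a stable document ID from the URL (hex digest: letters and digits only)."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


def build_batch(pages) -> str:
    """Build a CloudSearch XML document batch from an iterable of page dicts."""
    batch = ET.Element("batch")
    for page in pages:
        # One <add> element per document, identified by its id attribute.
        add = ET.SubElement(batch, "add", id=doc_id(page["url"]))
        # Hypothetical index fields; they must match the fields configured on the domain.
        for name in ("url", "title", "content"):
            field = ET.SubElement(add, "field", name=name)
            field.text = page[name]  # ElementTree escapes &, < and > on serialization
    return ET.tostring(batch, encoding="unicode")


# Example input standing in for crawler output.
pages = [
    {"url": "https://example.com/", "title": "Home", "content": "Welcome & hello"},
]
xml_batch = build_batch(pages)
```

The resulting string can then be posted to the domain's document endpoint, for example with boto3's `cloudsearchdomain` client and `upload_documents(documents=..., contentType="application/xml")`. Keep each batch under CloudSearch's 5 MB limit, and strip characters that are not valid in XML from the crawled text before building the batch.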

If you need the search to be hosted, I'd recommend Elasticsearch on Elastic Cloud instead. I found CloudSearch slow, cumbersome and expensive, and there are more resources available for using Elasticsearch with StormCrawler and Apache Nutch.
