
How can I configure MongoDB for Nutch?

I am building a web spider with Nutch 1.10, and I want to load the data fetched by the Nutch crawl into MongoDB. I don't know how to configure MongoDB for Nutch, and I can't find any relevant material. From some blogs I gather that Nutch 2.x is required and that 1.x cannot do this, but the configuration details are still unclear to me. Can someone clarify? Thank you!

Nutch 2.x's MongoDB support is not for storing the extracted, structured results. It is for storing Nutch's internal database in MongoDB.

Currently, Nutch supports pushing data to Apache Solr, Elasticsearch, and Amazon CloudSearch. If you want to push the data to MongoDB, you need to create a new indexer plugin. Look at the indexer-elastic or indexer-solr plugin to see how an indexer plugin is written.
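To give a rough idea of what the core of such a plugin does, here is a minimal, self-contained sketch of the write/commit/close cycle. It deliberately does not use the Nutch or MongoDB APIs: `DocumentSink`, `BatchingWriter`, and the batch size are hypothetical names made up for illustration. In a real plugin the class would implement Nutch's `org.apache.nutch.indexer.IndexWriter` extension point, and the sink would call the MongoDB Java driver.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MongoWriterSketch {

    /** Hypothetical stand-in for the storage side, e.g. a wrapper around a MongoDB collection. */
    interface DocumentSink {
        void insertMany(List<Map<String, Object>> batch);
    }

    /** Buffers documents and flushes them in batches (write, commit, close). */
    static class BatchingWriter {
        private final DocumentSink sink;
        private final int batchSize;
        private final List<Map<String, Object>> buffer = new ArrayList<>();

        BatchingWriter(DocumentSink sink, int batchSize) {
            if (batchSize < 1) {
                throw new IllegalArgumentException("batchSize must be >= 1");
            }
            this.sink = sink;
            this.batchSize = batchSize;
        }

        /** Flattens multi-valued fields: a single value becomes a scalar, several stay a list. */
        void write(Map<String, List<Object>> fields) {
            Map<String, Object> record = new LinkedHashMap<>();
            for (Map.Entry<String, List<Object>> e : fields.entrySet()) {
                List<Object> values = e.getValue();
                if (values == null || values.isEmpty()) {
                    continue; // fields without values are not stored
                }
                record.put(e.getKey(), values.size() == 1 ? values.get(0) : new ArrayList<>(values));
            }
            buffer.add(record);
            if (buffer.size() >= batchSize) {
                commit();
            }
        }

        /** Sends the buffered records to the sink and empties the buffer. */
        void commit() {
            if (buffer.isEmpty()) {
                return;
            }
            sink.insertMany(new ArrayList<>(buffer));
            buffer.clear();
        }

        /** Flushes whatever is still buffered. */
        void close() {
            commit();
        }
    }

    public static void main(String[] args) {
        // The sink here only prints; a real one would insert into MongoDB.
        BatchingWriter writer = new BatchingWriter(batch -> System.out.println("flush " + batch), 2);
        Map<String, List<Object>> doc = new LinkedHashMap<>();
        doc.put("url", Arrays.asList("http://example.com/"));
        doc.put("anchor", Arrays.asList("home", "start"));
        writer.write(doc);
        writer.close();
    }
}
```

Keeping the storage call behind a small interface means the batching logic can be tried out without a running MongoDB instance. The actual plugin wiring (plugin.xml, build files, enabling the plugin in plugin.includes) is best copied from indexer-solr or indexer-elastic.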

