How can I configure MongoDB for Nutch?

Recently I have been trying to build a web spider with nutch-1.10. I want to load the data fetched by nutch/crawl into MongoDB, but I don't know how to configure MongoDB for Nutch, and I can't find any relevant material. From some blogs I gather that Nutch 2.x is required and that 1.x cannot do what I want, but the configuration details are still unclear to me. Can someone clarify this? Thank you!

Nutch 2.x support for MongoDB is not meant for storing extracted, structured results. It is for storing Nutch's internal database in MongoDB.

Currently, Nutch supports pushing data to Apache Solr, Elasticsearch and Amazon Cloud service. If you want to push the data to MongoDB, you need to create a new indexer plugin. Look at indexer-elastic or indexer-solr to understand how to write one.
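
To give a rough idea of what such a plugin has to do, here is a minimal sketch of the buffer-then-flush pattern an indexer writer usually follows. Everything in it is an assumption made for illustration: `DocumentSink`, `InMemorySink` and `BufferingWriter` are hypothetical names, not Nutch or MongoDB classes, and the in-memory sink only stands in for a MongoDB collection so the example runs without a database. A real plugin would implement Nutch's own indexer interface (see indexer-solr for the exact methods in your version) and call the MongoDB Java driver where the sink is used.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types are Nutch or MongoDB classes.
class MongoIndexerSketch {

    /** Stand-in for a MongoDB collection; a real plugin would call the MongoDB driver here. */
    interface DocumentSink {
        void insertMany(List<Map<String, Object>> docs);
    }

    /** In-memory sink, used only so the sketch runs without a database. */
    static class InMemorySink implements DocumentSink {
        final List<Map<String, Object>> stored = new ArrayList<>();

        @Override
        public void insertMany(List<Map<String, Object>> docs) {
            stored.addAll(docs);
        }
    }

    /** Buffers documents and flushes them in batches, loosely following a write/commit/close lifecycle. */
    static class BufferingWriter {
        private final DocumentSink sink;
        private final int batchSize;
        private final List<Map<String, Object>> buffer = new ArrayList<>();

        BufferingWriter(DocumentSink sink, int batchSize) {
            if (batchSize < 1) {
                throw new IllegalArgumentException("batchSize must be at least 1");
            }
            this.sink = sink;
            this.batchSize = batchSize;
        }

        /** Queues one document (field name -> value) and flushes when the batch is full. */
        void write(Map<String, Object> doc) {
            buffer.add(new LinkedHashMap<>(doc)); // copy so later changes by the caller are not stored
            if (buffer.size() >= batchSize) {
                commit();
            }
        }

        /** Sends all buffered documents to the sink and empties the buffer. */
        void commit() {
            if (buffer.isEmpty()) {
                return;
            }
            sink.insertMany(new ArrayList<>(buffer));
            buffer.clear();
        }

        /** Flushes whatever is left in the buffer. */
        void close() {
            commit();
        }
    }

    public static void main(String[] args) {
        InMemorySink sink = new InMemorySink();
        BufferingWriter writer = new BufferingWriter(sink, 2);

        Map<String, Object> page = new LinkedHashMap<>();
        page.put("url", "http://example.com/");
        page.put("title", "Example");

        writer.write(page);
        writer.close(); // flushes the partially filled batch
        System.out.println(sink.stored.size() + " document(s) stored");
    }
}
```

Batching is the main design choice here: the writer collects documents and sends them to the store in groups instead of one at a time, and `close()` flushes whatever is left over.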
