Easy way to Sync Data between MongoDB and Apache Solr

I recently started working with MongoDB and Apache Solr. I am using MongoDB as a data store, and I want Apache Solr to index my data for the search feature in my application.

After some research, I found that there are basically two ways to sync data between MongoDB and Solr.

1) Using the Solr DataImportHandler

For this I used SolrMongoImporter, created by james, and followed his tutorial on GitHub.

I was able to run the import handler, and Solr recognized it, but it did not import any documents into Solr. Every time, it reported updated documents=0.

2) Then I looked at the MongoDB side to see whether anything exists there, and I found MongoDBConnector, provided by 10gen.

When I followed the instructions and ran the connector, it seemed to be trying to post a lot of documents to Solr, and it gave the following output:

2012-11-24 15:15:20,665 - INFO - Finished 'http://localhost:8983/solr/update/?commit=true' (POST) with body '<commit />' in 0.010 seconds.
2012-11-24 15:15:21,674 - INFO - Finished 'http://localhost:8983/solr/update/?commit=true' (POST) with body '<commit />' in 0.009 seconds.
2012-11-24 15:15:22,683 - INFO - Finished 'http://localhost:8983/solr/update/?commit=true' (POST) with body '<commit />' in 0.008 seconds.
2012-11-24 15:15:23,694 - INFO - Finished 'http://localhost:8983/solr/update/?commit=true' (POST) with body '<commit />' in 0.010 seconds.
2012-11-24 15:15:24,702 - INFO - Finished 'http://localhost:8983/solr/update/?commit=true' (POST) with body '<commit />' in 0.008 seconds.
2012-11-24 15:15:25,711 - INFO - Finished 'http://localhost:8983/solr/update/?commit=true' (POST) with body '<commit />' in 0.008 seconds.
2012-11-24 15:15:26,722 - INFO - Finished 'http://localhost:8983/solr/update/?commit=true' (POST) with body '<commit />' in 0.010 seconds.

But there is no data in Solr.
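Looking at the output again, every request shown has the body '<commit />'; none of the lines carries an add payload. Below is a minimal sketch for tallying the request bodies in such a log, assuming the log format matches the lines above. The function name `tally_request_bodies` and the sample text are only illustrative.

```python
import re
from collections import Counter

# Matches the log format shown above:
#   ... (POST) with body '<commit />' in 0.010 seconds.
BODY_RE = re.compile(r"with body '(.*)' in [\d.]+ seconds\.")


def tally_request_bodies(log_text):
    """Count logged Solr update requests by the root element of their body."""
    counts = Counter()
    for line in log_text.splitlines():
        match = BODY_RE.search(line)
        if not match:
            continue  # not a request line in the expected format
        root = re.match(r"<\s*(\w+)", match.group(1).strip())
        counts[root.group(1) if root else "unknown"] += 1
    return counts


# Two lines copied from the output above, used as sample input.
sample = (
    "2012-11-24 15:15:20,665 - INFO - Finished "
    "'http://localhost:8983/solr/update/?commit=true' (POST) "
    "with body '<commit />' in 0.010 seconds.\n"
    "2012-11-24 15:15:21,674 - INFO - Finished "
    "'http://localhost:8983/solr/update/?commit=true' (POST) "
    "with body '<commit />' in 0.009 seconds.\n"
)

print(tally_request_bodies(sample))
```

If the tally contains only commit entries, the connector is committing once a second without sending any documents, which would be consistent with an empty index.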

I would like to know which approach worked for you, and whether there is a good tutorial on MongoDB and Solr integration.

Also, I am looking for real-time sync between MongoDB and Solr, i.e. as soon as a product is added to MongoDB, I want the Solr index updated and the change reflected in search results.

I am using MongoDB 2.0.4 and Solr 3.6.1.

Hadoop is an option for creating Solr indexes. I haven't done this first hand, but I have heard from people, such as those at Etsy, who have.

In this course at Lucene Revolution, they talked about using Hadoop to update the indexes in some Solr cores. Unfortunately, I don't think the course material is publicly available.

And in this talk, the speaker discussed the mongo/hadoop support.

Other related links:

Did you set up replica set mode? http://docs.mongodb.org/manual/reference/replica-configuration/

In the beginning, I was getting the same output you described, and there was no data in Solr. After I set up replication, the oplog seems to have been created, and mongodbconnector synchronized correctly with Solr. It works quite nicely for me.
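To check whether a server is actually running as a replica set member before starting the connector, something like the following may help. It is only a sketch: it assumes the third-party PyMongo driver is installed and compatible with your server version, that the server answers the `isMaster` command with a `setName` field when it belongs to a replica set, and that the URI shown is your local default. The function names are made up for illustration.

```python
def is_replica_set_member(is_master_reply):
    """Return True if an isMaster reply names a replica set."""
    return bool(is_master_reply.get("setName"))


def check_live_server(uri="mongodb://localhost:27017"):
    """Query a running server; returns (in_replica_set, has_oplog_entries).

    Assumes PyMongo is installed; the URI is a placeholder.
    """
    from pymongo import MongoClient  # third-party, imported only when needed

    client = MongoClient(uri)
    reply = client.admin.command("isMaster")
    # Replica set members keep their oplog in the "local" database.
    has_oplog = client.local["oplog.rs"].find_one() is not None
    return is_replica_set_member(reply), has_oplog


# Replies shaped like those of a standalone server and a replica set member.
print(is_replica_set_member({"ismaster": True}))
print(is_replica_set_member({"ismaster": True, "setName": "rs0"}))
```

If `check_live_server()` reports that the server is not in a replica set or has no oplog entries, that would match the behaviour I saw before enabling replication.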
