
How to build a Lucene index in a MapReduce way?

Original post: 2014-04-19 20:41:18. Tags: hadoop, solr, lucene, indexing, mapreduce

I am building a small image similarity search application with Hadoop. I decided to use LIRE, whose demo code uses a Lucene IndexWriter to write the index to local disk. So far, my reducers generate the LIRE records. But how can I make the reducers write these records to a Lucene index in HDFS? I searched and found tools such as SolrCloud and Blur, but no good documentation or code example showing how to do it.

Does anyone know of a good reference?

PS: I noticed there is a question with a similar title, but it is from 3 years ago and the answers are not clear.

1 Solution

If you are using Solr 4.7, there is an option to index on HDFS using the Kite Morphlines code. This is now part of the Solr distribution (>4.7). See this JIRA issue for more information: https://issues.apache.org/jira/browse/SOLR-5729

Also look at the earlier Git repository: https://github.com/markrmiller/solr-map-reduce-example
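
If you would rather keep plain Lucene in the reducers instead of going through Solr, one possible approach (an assumption on my part, not what the links above do) is to let each reducer build its index in a local staging directory and copy the finished directory into its own shard directory under the job output. Below is a minimal sketch of just that staging and publishing step. To keep it runnable without any dependencies, plain java.nio files stand in for both Lucene and HDFS: in a real job the write would be IndexWriter.addDocument and the copy would be Hadoop's FileSystem.copyFromLocalFile. ShardPublisher, shardName, publish, and the records.txt file are hypothetical names made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: not part of Lucene, LIRE, Solr, or Hadoop.
class ShardPublisher {

    // One output directory per reducer, so tasks never write to the same index.
    static String shardName(int reducerId) {
        return String.format(Locale.ROOT, "shard-%05d", reducerId);
    }

    // Stages the records in a local temp directory, then copies the finished
    // files to outputRoot/shard-NNNNN and returns that directory.
    static Path publish(int reducerId, List<String> records, Path outputRoot)
            throws IOException {
        Path staging = Files.createTempDirectory("index-staging-");
        // Stand-in for IndexWriter.addDocument(...) followed by IndexWriter.close().
        Files.write(staging.resolve("records.txt"), records, StandardCharsets.UTF_8);

        Path target = outputRoot.resolve(shardName(reducerId));
        Files.createDirectories(target);
        // Stand-in for FileSystem.copyFromLocalFile(localDir, hdfsDir).
        try (DirectoryStream<Path> files = Files.newDirectoryStream(staging)) {
            for (Path file : files) {
                Files.copy(file, target.resolve(file.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
            }
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path outputRoot = Files.createTempDirectory("index-output-");
        List<String> records = Arrays.asList("image-001 <features>", "image-002 <features>");
        Path shard = publish(3, records, outputRoot);
        System.out.println("published " + shard);
    }
}
```

In a reducer, the staging directory would be opened in setup(), documents added in reduce(), and the copy done in cleanup(). You end up with one index shard per reducer, which you can search separately or merge afterwards.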