
Lucene indexing on Hadoop file system (HDFS)

I need to merge Lucene indexes that are stored on HDFS, so I wrote a customized version of the standard merge tool provided by Lucene. The code is given below:

// Target index on HDFS; OpenMode.CREATE overwrites any existing index there.
HdfsDirectory mergedIndex = new HdfsDirectory(new Path("/mergedindex"), new Configuration());
IndexWriter writer = new IndexWriter(mergedIndex, new IndexWriterConfig(new WhitespaceAnalyzer(Version.LUCENE_CURRENT))
    .setOpenMode(OpenMode.CREATE));

// args[1..] are the source index paths on HDFS; args[0] is not used here.
Directory[] indexes = new Directory[args.length - 1];
for (int i = 1; i < args.length; i++) {
  indexes[i - 1] = new HdfsDirectory(new Path(args[i]), new Configuration());
}

System.out.println("Merging...");
writer.addIndexes(indexes);

// Collapse the merged index into a single segment.
System.out.println("Full merge...");
writer.forceMerge(1);
writer.close();

But it fails because it cannot obtain the HDFS lock on the directory: the attempt times out. The timeout value is hardcoded in the Lucene library as 1000 milliseconds.

Exception trace:

Exception in thread "main" org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: org.apache.solr.store.hdfs.HdfsLockFactory$HdfsLock@21539796
    at org.apache.lucene.store.Lock.obtain(Lock.java:89)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:776)
    at com.test.hadoop.solr.indexer.IndexMergeTool.main(IndexMergeTool.java:30)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

Is there any mechanism to overcome this so that I can merge the index on HDFS itself?

Thanks in advance, Arun

Make sure you delete the lock file under the index folder, then try again.
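As a rough sketch of that suggestion, the snippet below removes a leftover lock file through the Hadoop FileSystem API before the merge is started. It rests on two assumptions that are not confirmed in the question: that the lock file carries Lucene's default name, write.lock, and that it sits directly inside the index directory. The class name StaleLockCleaner and the helper removeStaleLock are hypothetical names used only for illustration. Only do this when you are sure no other IndexWriter is currently using the index, otherwise the lock is not stale and deleting it can corrupt the index.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper, not part of Lucene, Solr, or Hadoop.
public class StaleLockCleaner {

    // Assumption: the lock uses Lucene's default write-lock file name.
    static final String LOCK_FILE_NAME = "write.lock";

    /**
     * Deletes the lock file inside indexDir if one exists.
     * Returns true only if a lock file was found and deleted.
     */
    static boolean removeStaleLock(FileSystem fs, Path indexDir) throws IOException {
        Path lock = new Path(indexDir, LOCK_FILE_NAME);
        if (!fs.exists(lock)) {
            return false; // nothing to clean up
        }
        // Second argument false: non-recursive delete, the lock is a plain file.
        return fs.delete(lock, false);
    }

    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // Defaults to the merged index path used in the question.
        Path indexDir = new Path(args.length > 0 ? args[0] : "/mergedindex");
        // Resolve the file system that owns this path (HDFS when fs.defaultFS points to it).
        FileSystem fs = indexDir.getFileSystem(conf);
        if (removeStaleLock(fs, indexDir)) {
            System.out.println("Removed stale lock in " + indexDir);
        } else {
            System.out.println("No lock file found in " + indexDir);
        }
    }
}
```

Run it once against the target directory (and against any source index directories that report the same error), then start the merge again. If the lock reappears and the timeout persists, another process is probably still holding the index open.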
