impl.ConcurrentUpdateSolrServer: Status for: {file-path} is 404
I want to index a corpus using Solr.
To create a sequence file, I used the following command:
./behemoth -i file://path/to/my/file/where/the corpus/is/located -o /user/user-name/file-to-which-the-output-is-stored
After this, I ran the following command for indexing:
./behemoth solr /user/user-name/path-to-which-output-is-stored-in-previous-command http://localhost:8983/solr
But it gives the following error:
15/06/04 11:51:07 INFO mapreduce.Job: Job job_local183059797_0001 running in uber mode : false
15/06/04 11:51:07 INFO mapreduce.Job: map 0% reduce 0%
15/06/04 11:51:08 INFO mapred.LocalJobRunner:
15/06/04 11:51:08 INFO impl.ConcurrentUpdateSolrServer: Status for: file:///usr/local/ASR/data/Corpus/en_TheTelegraph_2001-2010/telegraph_2007-2010/telegraph_1st_oct_2007_to_31st_dec_2007/foreign/1071015_foreign_story_8435523.utf8 is 404
15/06/04 11:51:08 ERROR impl.ConcurrentUpdateSolrServer: error
java.lang.Exception: Not Found
I am unable to figure out the issue, as the file mentioned above does exist at that path. Please help.
Just found your question, best to ask on the DigitalPebble mailing list or open an issue on GitHub.
I don't think the problem is related to the content of the input. It looks more like it can't connect to Solr.
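One way to test the connectivity theory: as far as I can tell, the "Status for: ... is 404" line reports the HTTP status Solr returned for the update request carrying that document ID, not a missing input file. Below is a minimal sketch for probing the endpoint by hand. It assumes curl is installed and that the indexer posts to <solr-url>/update; the helper names solr_update_url and http_status are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: build the update endpoint for a Solr base URL.
# ASSUMPTION: the indexer posts documents to <base-url>/update.
solr_update_url() {
  # Drop a single trailing slash, if present, then append /update.
  printf '%s/update\n' "${1%/}"
}

# Hypothetical helper: print only the HTTP status code returned by a URL.
# curl prints 000 when it cannot connect at all.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# When called with a Solr URL, report the status of its update endpoint.
if [ "$#" -gt 0 ]; then
  url=$(solr_update_url "$1")
  echo "$url -> HTTP $(http_status "$url")"
fi
```

Saved as, say, check_solr.sh, it can be run as `sh check_solr.sh http://localhost:8983/solr`. A status of 000 would mean nothing is listening on that address. A 404 would suggest Solr is up but the path is wrong, for example because the URL needs a core name appended (something like http://localhost:8983/solr/collection1, where collection1 is only a placeholder for whatever core you actually have). Any other status means the endpoint at least exists.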
Also, you've imported a corpus of documents, but no text or metadata has been extracted as part of the import. You should run the Tika module on your input first.
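The order of steps would then be roughly: import, Tika, Solr. Here is a dry-run sketch that only prints the last two commands rather than running them. Treat the `tika` subcommand and its -i/-o flags as an assumption modelled on the import step in the question, and check the usage output of the behemoth script for your version. The function name print_pipeline and the "-tika" output suffix are invented for this example.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
# ASSUMPTION: `./behemoth tika` accepts -i/-o like the import step does;
# verify against your Behemoth version before running anything.
print_pipeline() {
  corpus="$1"                  # Behemoth corpus written by the import step
  solr_url="$2"                # Solr URL to index into
  extracted="${corpus}-tika"   # hypothetical output path for the Tika step

  # Step 1: extract text and metadata from the imported documents.
  printf './behemoth tika -i %s -o %s\n' "$corpus" "$extracted"
  # Step 2: index the Tika output (not the raw corpus) into Solr,
  # using the same argument form as the command in the question.
  printf './behemoth solr %s %s\n' "$extracted" "$solr_url"
}
```

Calling `print_pipeline /user/user-name/corpus http://localhost:8983/solr` prints the two commands so they can be checked before being run by hand. The point is that the Solr step should read the Tika output, not the freshly imported corpus.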