
Getting error "Missing elastic.cluster and elastic.host…" while running the indexer job in Nutch in Eclipse

I have configured Apache Nutch 1.13 with Solr 5.5.0 and HBase 0.90.6 in Eclipse. I can run the jobs from the injector through invertlinks, but the indexing job throws the error "Missing elastic.cluster and elastic.host....". I have set indexer-solr under plugin.includes in nutch-site.xml, but I still get this error. Can anybody explain why this is happening?

The problem is with nutch-site.xml. There are two copies of this file: one under the conf folder and another under the src/test folder. We normally configure the copy under conf, but when the project is imported into Eclipse, the copy under src/test is the one that gets picked up. That copy enables every plugin, presumably including indexer-elastic, which would explain the complaint about the missing elastic.cluster and elastic.host settings. So the way to fix this error is to put your settings in the nutch-site.xml under src/test. That file normally contains only a very basic config, in which you need to replace

<property>
    <name>plugin.includes</name>
    <value>.*</value>
    <description>Enable all plugins during unit testing.</description>
</property>

with the following:

<property>
    <name>plugin.includes</name>
    <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
    <description>Regular expression naming plugin directory names to
    include.  Any plugin not matching this expression is excluded.
    In any case you need at least include the nutch-extensionpoints plugin. By
    default Nutch includes crawling just HTML and plain text via HTTP,
    and basic indexing and search plugins. In order to use HTTPS please enable 
    protocol-httpclient, but be aware of possible intermittent problems with the 
    underlying commons-httpclient library. Set parsefilter-naivebayes for classification based focused crawler.
    </description>
</property>
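
To see why the narrower value makes a difference, here is a minimal Java sketch that tests plugin directory names against both values of plugin.includes. It assumes that Nutch compares each plugin directory name against the expression as a full-match regular expression, as the property description suggests. The class name `PluginIncludesCheck` and the helper `isIncluded` are made up for this illustration and are not part of Nutch.

```java
import java.util.regex.Pattern;

public class PluginIncludesCheck {

    // Hypothetical helper: true when the whole plugin directory name
    // matches the plugin.includes expression (assumed full-match semantics).
    static boolean isIncluded(String pluginIncludes, String pluginId) {
        return Pattern.compile(pluginIncludes).matcher(pluginId).matches();
    }

    public static void main(String[] args) {
        // Value found in the src/test copy of nutch-site.xml.
        String testValue = ".*";
        // Value from the replacement property above.
        String fixedValue = "protocol-http|urlfilter-regex|parse-(html|tika)"
                + "|index-(basic|anchor)|indexer-solr|scoring-opic"
                + "|urlnormalizer-(pass|regex|basic)";

        String[] plugins = {"indexer-solr", "indexer-elastic"};
        for (String plugin : plugins) {
            System.out.println(plugin
                    + " included by '.*': " + isIncluded(testValue, plugin)
                    + ", included by fixed value: " + isIncluded(fixedValue, plugin));
        }
    }
}
```

Under that assumption, `.*` includes both indexer-solr and indexer-elastic, while the replacement value includes only indexer-solr.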

So if you want to index into Solr, include indexer-solr; for Elasticsearch, include indexer-elastic; and so on.
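
If you are unsure which nutch-site.xml Eclipse actually resolves, a quick check is to ask the class loader where it finds the file. This is only a sketch: it assumes the configuration is loaded as a classpath resource through the thread's context class loader, and the class name `WhichNutchSite` and method `locate` are hypothetical. Run it from the same Eclipse project (same build path) as the Nutch jobs.

```java
import java.net.URL;

public class WhichNutchSite {

    // Hypothetical helper: returns the location of the first matching
    // resource on the classpath, or null if no such resource exists.
    static URL locate(String resourceName) {
        return Thread.currentThread().getContextClassLoader().getResource(resourceName);
    }

    public static void main(String[] args) {
        URL url = locate("nutch-site.xml");
        if (url == null) {
            System.out.println("nutch-site.xml was not found on the classpath");
        } else {
            // The printed path shows whether the conf or the src/test copy wins.
            System.out.println("nutch-site.xml resolved to: " + url);
        }
    }
}
```

If the printed path points into src/test, that is the copy you need to edit (or you can reorder the source folders on the Eclipse build path so that conf comes first).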

Hope this helps others.

