nutch fetch is failing with java.lang.NumberFormatException

I am running Nutch 1.18 on Red Hat Enterprise Linux release 8.3 (Ootpa) with OpenJDK version "1.8.0_275".

I am following these directions: https://cwiki.apache.org/confluence/display/NUTCH/NutchTutorial#NutchTutorial-Step-by-Step:Concepts

When I get to the `bin/nutch fetch $s1` step, every fetch fails. A sample error from the Hadoop log is shown below. They all fail with java.lang.NumberFormatException. I have checked with curl that the URLs are accessible, and they are.

Any advice would be appreciated.

```
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:583)
    at java.lang.Integer.parseInt(Integer.java:615)
    at org.apache.hadoop.conf.Configuration.getInt(Configuration.java:1486)
    at org.apache.nutch.protocol.http.api.HttpBase.setConf(HttpBase.java:212)
    at org.apache.nutch.protocol.http.Http.setConf(Http.java:52)
    at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:169)
    at org.apache.nutch.protocol.ProtocolFactory.getProtocolInstanceByExtension(ProtocolFactory.java:177)
    at org.apache.nutch.protocol.ProtocolFactory.getProtocol(ProtocolFactory.java:155)
    at org.apache.nutch.fetcher.FetcherThread.run(FetcherThread.java:308)
```

The stack trace (keywords: protocol, http, configuration, parseInt) already shows that the integer value of some configuration property could not be parsed. Looking at the source code (HttpBase.java, line 212), it becomes clear that the property in question is "http.timeout":

<property>
  <name>http.timeout</name>
  <value>10000</value>
  <description>The default network timeout, in milliseconds.</description>
</property>

Please verify that it is configured correctly: the value must be an integer and should be a reasonable time span.
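
If it helps, here is a minimal sketch of such a check in plain Java, without any Nutch or Hadoop dependency. The class name `HttpTimeoutCheck` and both helper methods are made up for illustration. It assumes the configuration file uses the usual `<property><name>…</name><value>…</value></property>` layout shown above, and it treats "reasonable" simply as "a positive integer", which is my assumption. It trims whitespace before parsing; anything else that does not parse as an `int` (an empty value, `10s`, `10.5`) is reported as invalid.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Nutch: inspects one property in a config file.
class HttpTimeoutCheck {

    /** Returns the raw <value> text of the first property with the given name, or null if absent. */
    static String findProperty(InputStream xml, String name) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            NodeList names = p.getElementsByTagName("name");
            NodeList values = p.getElementsByTagName("value");
            if (names.getLength() > 0 && values.getLength() > 0
                    && name.equals(names.item(0).getTextContent().trim())) {
                return values.item(0).getTextContent();
            }
        }
        return null;
    }

    /** True if the value, after trimming whitespace, parses as a positive int (assumed meaning of "reasonable"). */
    static boolean isValidTimeout(String value) {
        if (value == null) {
            return false;
        }
        try {
            return Integer.parseInt(value.trim()) > 0;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java HttpTimeoutCheck <path-to-config.xml>");
            System.exit(2);
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            String value = findProperty(in, "http.timeout");
            if (value == null) {
                System.out.println("http.timeout is not set in " + args[0]);
            } else if (isValidTimeout(value)) {
                System.out.println("http.timeout looks fine: " + value.trim());
            } else {
                System.out.println("http.timeout is not a positive integer: \"" + value + "\"");
            }
        }
    }
}
```

Compile it with `javac HttpTimeoutCheck.java` and run it against each configuration file in turn, for example `java HttpTimeoutCheck conf/nutch-site.xml`. Settings in nutch-site.xml override those in nutch-default.xml, so a bad value in either file can matter.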


