
Exception when invoking the cTAKES parser from tika-app

When I invoke the cTAKES parser from tika-app with the following command, I get an exception:

java -classpath $HOME/src/ctakes-config:${TIKA_HOME}/tika-app/target/tika-app-X.Y-SNAPSHOT.jar:${CTAKES_HOME}/desc:${CTAKES_HOME}/resources:${CTAKES_HOME}/lib/* org.apache.tika.cli.TikaCLI --config=$HOME/src/ctakes-config/tika-config.xml -m Vose-2013-American_Journal_of_Hematology.pdf

Exception

Screenshot of the java.lang.NoSuchMethodError exception


Exception in thread "main" java.lang.NoSuchMethodError: opennlp.tools.sentdetect.SentenceModel.getMaxentModel()Lopennlp/model/AbstractModel;

I have followed the steps mentioned in this link. I cannot work out what causes this error, and so I do not know how to resolve it.

I am also getting the following warnings:

Feb 16, 2020 12:19:58 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem WARNING: J2KImageReader not loaded. JPEG2000 files will not be processed. See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io for optional dependencies.

Feb 16, 2020 12:19:59 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem WARNING: org.xerial's sqlite-jdbc is not loaded. Please provide the jar on your classpath to parse sqlite files. See tika-parsers/pom.xml for the correct version.

I tried to resolve these using the answers in this link, but they were not much help. I know these are only warnings, and I hope they are not causing the error. I am only using Tika as installed.

System Information

  • OS: Ubuntu 16.04
  • JDK: OpenJDK 8
  • Maven 3.3.9
  • Apache Tika 1.23
  • Apache cTAKES 3.2.2

I've addressed this. It had to do with incompatible versions of the Apache OpenNLP library. The Tika CTAKES parser was pinned to OpenNLP 1.5.3, and cTAKES 3.2.2 uses that version, but Tika Parsers has since moved to a newer version.

The fix was to reference the older OpenNLP 1.5.3 jar in the classpath, ahead of the tika-app jar. I have updated the wiki here: https://cwiki.apache.org/confluence/display/TIKA/CTAKESParser

java -classpath $HOME/src/ctakes-config:${CTAKES_HOME}/lib/opennlp-tools-1.5.3.jar:${TIKA_HOME}/tika-app/target/tika-app-X.Y-SNAPSHOT.jar:${CTAKES_HOME}/desc:${CTAKES_HOME}/resources:${CTAKES_HOME}/lib/\* org.apache.tika.cli.TikaCLI \
--config=$HOME/src/ctakes-config/tika-config.xml \
-m Vose-2013-American_Journal_of_Hematology.pdf 
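
To check which OpenNLP jar actually wins on a given classpath, a small diagnostic along the following lines may help. This is only a sketch: `WhichJar`, `locate`, and `returnTypes` are hypothetical names made up for illustration and are not part of Tika, cTAKES, or OpenNLP. It uses only standard JDK reflection to print where a class was loaded from and which return types its methods of a given name declare, which can then be compared with the `opennlp.model.AbstractModel` return type named in the error.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical diagnostic helper; not part of Tika, cTAKES, or OpenNLP. */
final class WhichJar {

    /** Returns the jar or directory a class is loaded from, or a placeholder string. */
    static String locate(String className) {
        try {
            // initialize=false: look the class up without running its static initializers
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK classes loaded by the bootstrap loader have no code source
            return src == null ? "(bootstrap or unknown code source)"
                               : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not found on classpath)";
        }
    }

    /** Returns the declared return types of all public methods with the given name. */
    static List<String> returnTypes(String className, String methodName) {
        List<String> out = new ArrayList<>();
        try {
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            for (Method m : c.getMethods()) {
                if (m.getName().equals(methodName)) {
                    out.add(m.getReturnType().getName());
                }
            }
        } catch (ClassNotFoundException | LinkageError e) {
            // class missing or not linkable: report whatever was collected so far
        }
        return out;
    }

    public static void main(String[] args) {
        // Defaults are the class and method named in the NoSuchMethodError above
        String cls = args.length > 0 ? args[0] : "opennlp.tools.sentdetect.SentenceModel";
        String method = args.length > 1 ? args[1] : "getMaxentModel";
        System.out.println(cls + " loaded from: " + locate(cls));
        System.out.println(method + " return types: " + returnTypes(cls, method));
    }
}
```

Compiled and run with the same `-classpath` value as the failing command (plus the directory holding `WhichJar.class`), this should show whether `SentenceModel` comes from `opennlp-tools-1.5.3.jar` or from the copy bundled in the tika-app jar.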

After installing Apache tika-1.10, I was able to invoke cTAKES from tika-app. The cTAKES and Tika versions I had been using were not compatible with each other.
