Apache cTAKES logic from Java application

I'm trying to embed Apache cTAKES NLP logic into my application.

First of all, I can't find any good documentation that explains how to do this.

From various pieces of code that I found on the internet, I put together the following test code:

public class CTAKESTest {

    public static void main(String[] args) throws UIMAException, MalformedURLException {

        final String note = "Serum Cholesterol 154 150 250 mgs/dl\n-\nSerum Triglycerides 67 90 200 mgs /dl\n-\nSerum HDL: Cholesterol 38 35 55 mgs /dl\n-\nSerum LDL: Cholesterol 49 85 150 mgs/d1\n-\nSerum VLDL: Cholesterol 13 10 40 mgs/dl\n-\nTotal Cholesterol / HDL Ratio: 3.90";

        final JCas jcas = JCasFactory.createJCas();
        jcas.setDocumentText(note);

        final AnalysisEngineDescription aed = getFastPipeline();
        SimplePipeline.runPipeline(jcas, aed);

        Collection<TOP> codes = JCasUtil.selectAll(jcas);
        List<TOP> list = new ArrayList<>(codes);

        TOP[] res = list.toArray(new TOP[list.size()]);
        // System.out.println(Arrays.toString(res));
        String json = new Gson().toJson(res);
        System.out.println(json);
    }

    public static AnalysisEngineDescription getFastPipeline()
            throws ResourceInitializationException, MalformedURLException {
        AggregateBuilder builder = new AggregateBuilder();
        builder.add(getTokenProcessingPipeline());
        builder.add(DefaultJCasTermAnnotator.createAnnotatorDescription());
        builder.add(ClearNLPDependencyParserAE.createAnnotatorDescription());
        builder.add(PolarityCleartkAnalysisEngine.createAnnotatorDescription());
        builder.add(UncertaintyCleartkAnalysisEngine.createAnnotatorDescription());
        builder.add(HistoryCleartkAnalysisEngine.createAnnotatorDescription());
        builder.add(ConditionalCleartkAnalysisEngine.createAnnotatorDescription());
        builder.add(GenericCleartkAnalysisEngine.createAnnotatorDescription());
        builder.add(SubjectCleartkAnalysisEngine.createAnnotatorDescription());
        return builder.createAggregateDescription();
    }

    public static AnalysisEngineDescription getTokenProcessingPipeline()
            throws ResourceInitializationException, MalformedURLException {
        AggregateBuilder builder = new AggregateBuilder();
        builder.add(SimpleSegmentAnnotator.createAnnotatorDescription());
        builder.add(SentenceDetector.createAnnotatorDescription());
        builder.add(TokenizerAnnotatorPTB.createAnnotatorDescription());
        builder.add(LvgAnnotator.createAnnotatorDescription());
        builder.add(ContextDependentTokenizerAnnotator.createAnnotatorDescription());
        builder.add(POSTagger.createAnnotatorDescription());
        return builder.createAggregateDescription();
    }

}

However, it fails during startup with the following error:

08:37:01.978 [main] INFO  o.apache.ctakes.lvg.ae.LvgAnnotator - URL for lvg.properties =file:/C:/Users/Alex/.m2/repository/net/sourceforge/ctakesresources/ctakes-resources-lvg2008/4.0.0/ctakes-resources-lvg2008-4.0.0.jar!/org/apache/ctakes/lvg/data/config/lvg.properties
08:37:03.454 [main] INFO  o.a.ctakes.core.ae.SentenceDetector - Sentence detector model file: org/apache/ctakes/core/sentdetect/sd-med-model.zip
08:37:03.566 [main] INFO  o.a.c.core.ae.TokenizerAnnotatorPTB - Initializing org.apache.ctakes.core.ae.TokenizerAnnotatorPTB
Exception in thread "main" java.lang.IllegalArgumentException: URI is not hierarchical
    at java.io.File.<init>(Unknown Source)
    at org.apache.ctakes.lvg.resource.LvgCmdApiResourceImpl.load(LvgCmdApiResourceImpl.java:65)
    at org.apache.uima.resource.impl.ResourceManager_impl.registerResource(ResourceManager_impl.java:628)
    at org.apache.uima.resource.impl.ResourceManager_impl.initializeExternalResources(ResourceManager_impl.java:464)
    at org.apache.uima.resource.Resource_ImplBase.initialize(Resource_ImplBase.java:193)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.initialize(AnalysisEngineImplBase.java:157)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:131)
    at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
    at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
    at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:279)
    at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:407)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:256)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:429)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:373)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:186)
    at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
    at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
    at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:279)
    at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:407)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:256)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:429)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:373)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:186)
    at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
    at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
    at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:279)
    at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:407)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:256)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:429)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:373)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:186)
    at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
    at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
    at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:279)
    at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:331)
    at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:448)
    at org.apache.uima.fit.factory.AnalysisEngineFactory.createEngine(AnalysisEngineFactory.java:205)
    at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:227)
    at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:260)

What am I doing wrong, and how can I fix it? Also, how do I properly configure cTAKES to use AggregatePlaintextFastUMLSProcessor.xml together with a custom dictionary that I am going to create?

Have a look at the cTAKES-REST module, which covers this requirement. It is invoked through a web service call and can also be configured to use your custom dictionary.
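
As for the exception itself: the stack trace shows `java.io.File.<init>` being called from `LvgCmdApiResourceImpl.load`, and the log line just before it shows that `lvg.properties` was resolved to a location inside a JAR. A `jar:` URI is opaque, and `new File(URI)` rejects opaque URIs with exactly this message. The following is only a minimal sketch of that behaviour using the standard library; the class name, helper names, and the example JAR path are hypothetical, and it does not touch cTAKES at all. Copying the resource to a real file on disk is one common workaround for code that insists on a `File`, but whether LVG can be pointed at such a copy depends on how your cTAKES version wires its resources, so treat that part as an assumption.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class JarUriDemo {

    /** True if the URI is a hierarchical file: URI, the only kind new File(URI) accepts. */
    static boolean isFileBacked(URI uri) {
        return "file".equalsIgnoreCase(uri.getScheme()) && !uri.isOpaque();
    }

    /** Tries new File(uri); returns the file path, or the exception message on failure. */
    static String tryOpenAsFile(URI uri) {
        try {
            return new File(uri).getPath();
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    /** Copies a stream (e.g. from getResourceAsStream) to a temp file that File-based code can read. */
    static Path copyToTempFile(InputStream in, String suffix) throws IOException {
        Path tmp = Files.createTempFile("resource-", suffix);
        tmp.toFile().deleteOnExit();
        // createTempFile already created the file, so overwrite it.
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    public static void main(String[] args) {
        // Hypothetical location, shaped like the one in the log output above.
        URI inJar = URI.create("jar:file:/C:/repo/example.jar!/config/lvg.properties");
        System.out.println(isFileBacked(inJar));   // false: scheme is "jar" and the URI is opaque
        System.out.println(tryOpenAsFile(inJar));  // URI is not hierarchical
    }
}
```

If a resource has to be handed to `File`-based code, reading it with `getResourceAsStream` and passing the stream to a helper like `copyToTempFile` gives that code a real path to work with.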
