
Setting path in heidelTime property file to use Stanford POS Tagger for German?

I am trying to detect temporal information in German text. I first tried the Stanford CoreNLP pipeline, because its dependency parse information would be very helpful in later stages (after temporal tagging), but as far as I understand there is no way to set CoreNLP's integrated temporal tagger to German. Am I right about this, or is there in fact a way to do it?

Now I'm trying to use HeidelTime to retrieve temporal tags separately, and I want to use the Stanford POS tagger with it. In the HeidelTime config.props file, I am setting the path to the Stanford POS tagger like this (on Windows):

model_path = C:\\Users\\milu\\Documents\\stanford-postagger-full-2017-06-09\\stanford-postagger-full-2017-06-09\\models
# leave this unset if you do not need one (e.g., /home/jannik/stanford-postagger-full-2014-01-04/tagger.config)
config_path =   
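
For reference, here is a minimal sketch of how Java's own java.util.Properties parser reads entries like the ones above, which is a quick way to confirm that the doubled backslashes end up as a normal Windows path. The class name ConfigPathCheck and the shortened sample path are made up for illustration, and this only assumes the file is parsed as a standard .properties file. Whether model_path should name the models directory or a single .tagger model file is something to check against the HeidelTime wiki; the sketch just reports what it finds.

```java
import java.io.File;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of HeidelTime.
public class ConfigPathCheck {

    /** Parses .properties text and returns the trimmed value for key, or "" if the key is absent. */
    static String readValue(String propsText, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propsText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key, "").trim();
    }

    public static void main(String[] args) {
        // Same shape as the config.props entries, with a shortened sample path.
        // In Java source each backslash is escaped once more, so the text seen
        // by the parser contains doubled backslashes, as in the file.
        String sample = "model_path = C:\\\\Users\\\\milu\\\\models\n"
                      + "config_path =   \n";

        String modelPath = readValue(sample, "model_path");
        String configPath = readValue(sample, "config_path");

        System.out.println("model_path parsed as: " + modelPath);
        System.out.println("config_path is empty: " + configPath.isEmpty());

        // Report what the parsed path points at on this machine.
        File f = new File(modelPath);
        System.out.println("exists: " + f.exists()
                + ", directory: " + f.isDirectory()
                + ", file: " + f.isFile());
    }
}
```

If the parsed value prints with single backslashes and the path exists, the properties syntax itself is unlikely to be the problem.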

This is the code I'm running on NetBeans, followed by the error I get. Is there something wrong with the way I am specifying the path to the POS tagger?

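import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

import de.unihd.dbs.heideltime.standalone.DocumentType;
import de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone;
import de.unihd.dbs.heideltime.standalone.OutputType;
import de.unihd.dbs.heideltime.standalone.POSTagger;
import de.unihd.dbs.heideltime.standalone.exceptions.DocumentCreationTimeMissingException;
import de.unihd.dbs.uima.annotator.heideltime.resources.Language;

public class RunHeideltimeInJava {

    public static void main(String[] args) throws
            DocumentCreationTimeMissingException, ParseException {

        // Output format, POS tagger, and location of the HeidelTime config file
        OutputType outtype = OutputType.XMI;
        POSTagger postagger = POSTagger.STANFORDPOSTAGGER;
        String conffile = "C:\\Users\\milu\\Documents\\NetBeansProjects\\TimeTagging\\src\\config.props";

        HeidelTimeStandalone hsNarratives = new HeidelTimeStandalone(Language.GERMAN,
                DocumentType.NARRATIVES, outtype, conffile, postagger);

        String narrativeText = "Ich habe letztes Wochenende neue Schuhe gekauft.";

        // The exception below is thrown from this call
        String xmiNarrativeOutput = hsNarratives.process(narrativeText);
        System.err.println("NARRATIVE*****" + xmiNarrativeOutput);

        // Document creation time; parsed here but not yet passed to process()
        String dctString = "2016-04-29";
        DateFormat df = new SimpleDateFormat("yyyy-MM-dd");
        Date dct = df.parse(dctString);
    }
}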

Output:

run:
Aug 25, 2017 9:54:31 AM de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone initialize
INFORMATION: HeidelTimeStandalone initialized with language german
Aug 25, 2017 9:54:31 AM de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone readConfigFile
INFORMATION: trying to read in file C:\Users\milue\Documents\NetBeansProjects\TimeTagging\src\config.props
Aug 25, 2017 9:54:33 AM de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone initialize
INFO: HeidelTime initialized
Aug 25, 2017 9:54:33 AM de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone initialize
INFO: JCas factory initialized
Aug 25, 2017 9:54:33 AM de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone process
INFO: Processing started
Exception in thread "main" java.lang.NoClassDefFoundError: edu/stanford/nlp/tagger/maxent/TaggerConfig
    at de.unihd.dbs.heideltime.standalone.components.impl.StanfordPOSTaggerWrapper.<init>(StanfordPOSTaggerWrapper.java:12)
    at de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone.establishPartOfSpeechInformation(HeidelTimeStandalone.java:391)
    at de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone.establishHeidelTimePreconditions(HeidelTimeStandalone.java:332)
    at de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone.process(HeidelTimeStandalone.java:516)
    at de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone.process(HeidelTimeStandalone.java:449)
    at RunHeideltimeInJava.main(RunHeideltimeInJava.java:29)
Caused by: java.lang.ClassNotFoundException: edu.stanford.nlp.tagger.maxent.TaggerConfig
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 6 more
C:\Users\milu\AppData\Local\NetBeans\Cache\8.2\executor-snippets\run.xml:53: Java returned: 1
BUILD FAILED (total time: 2 seconds)

According to the HeidelTime manual, you only have to set the language option to German: java -jar de.unihd.dbs.heideltime.standalone.jar -l GERMAN . HeidelTime will then apply this setting to the chosen POS tagger (TreeTagger or StanfordPOSTagger).

Regarding the TaggerConfig error, I get the same message when calling HeidelTime on the command line with the Stanford POS tagger, even for English text: java -jar de.unihd.dbs.heideltime.standalone.jar reference.txt -pos STANFORDPOSTAGGER .

I followed the instructions by editing the HeidelTime config file and adding the Stanford POS Tagger .jar file to the CLASSPATH: https://github.com/HeidelTime/heideltime/wiki/StanfordPOSTaggerWrapper
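
A NoClassDefFoundError for edu/stanford/nlp/tagger/maxent/TaggerConfig means the JVM could not find that class at runtime, so it may be worth checking whether the tagger jar is really visible to the running program. Below is a minimal sketch of such a check; the class name ClasspathCheck is made up for illustration and is not part of HeidelTime or the Stanford tagger. One thing to keep in mind: when a program is started with java -jar, the JVM takes its user classes from that jar and ignores the CLASSPATH environment variable and any -cp option, which might explain why setting CLASSPATH has no effect on the command line.

```java
// Hypothetical diagnostic class, not part of HeidelTime.
public class ClasspathCheck {

    /** Returns true if the named class can be located by this class's class loader. */
    static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only locate and load the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The class named in the stack trace above
        String taggerClass = "edu.stanford.nlp.tagger.maxent.TaggerConfig";

        System.out.println(taggerClass + " available: " + isClassAvailable(taggerClass));
        // Shows what the JVM actually uses as its class path
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

Run it the same way the failing program is run (for example, from the same NetBeans project). If it reports that the class is not available, the Stanford POS tagger jar still needs to be added to that project's libraries or to the class path used at launch.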

Everything works well when I use TreeTagger for Part Of Speech tagging.
