Using POS-tagged string with ANTLR

I have a string that I tag with the Stanford CoreNLP POS Tagger. I want each word in the string to act as a token's value (its text) and the word's tag to act as the token type, so that the tags are the terminals the grammar is written against.

For example, if the tagged string is:

Three/CD, critics/NNS, review/VBP, a/DT, book/NN, ./.

then I want these tags (CD, NNS, VBP, etc.) to be passed to the ANTLR 4 parser as a token stream, with the words of the sentence (Three, critics, review, etc.) as the values of those tokens, so that they can be retrieved later.
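To make the intended mapping concrete, here is a minimal sketch in plain Java (no ANTLR involved) that splits such a tagged string into word/tag pairs. It assumes the items are separated by a comma followed by whitespace and that each item has the form word/TAG, as in the example above. The class names TaggedWord and TaggedSentence are hypothetical and only serve the illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** One word of the sentence together with its part-of-speech tag (hypothetical helper). */
final class TaggedWord {
    final String word;
    final String tag;

    TaggedWord(String word, String tag) {
        this.word = word;
        this.tag = tag;
    }

    @Override
    public String toString() {
        return tag + "(" + word + ")";
    }
}

/** Splits a string such as "Three/CD, critics/NNS" into (word, tag) pairs. */
final class TaggedSentence {
    static List<TaggedWord> parse(String tagged) {
        List<TaggedWord> result = new ArrayList<>();
        // Assumption: items are separated by a comma followed by whitespace.
        for (String item : tagged.split(",\\s+")) {
            item = item.trim();
            if (item.isEmpty()) {
                continue;
            }
            // Use the last slash so that a word containing '/' keeps its text intact.
            int slash = item.lastIndexOf('/');
            if (slash <= 0 || slash == item.length() - 1) {
                throw new IllegalArgumentException("Not in word/TAG form: " + item);
            }
            result.add(new TaggedWord(item.substring(0, slash), item.substring(slash + 1)));
        }
        return result;
    }

    public static void main(String[] args) {
        String tagged = "Three/CD, critics/NNS, review/VBP, a/DT, book/NN, ./.";
        // Prints one TAG(word) pair per line.
        for (TaggedWord tw : parse(tagged)) {
            System.out.println(tw);
        }
    }
}
```

Each pair then carries what a token would need: the tag would select the token type and the word would become the token text.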

Is there an efficient way to parse the sentence with ANTLR while using my own methods to supply the tokens?

PS: I am using the ANTLR 4 IDE plugin for Eclipse. There is no tag for it yet!

Extend CommonToken to create your own custom token class with the fields and methods you need for the tagging. Extend CommonTokenFactory so that it creates these custom tokens whenever the lexer asks for a token, and call Lexer.setTokenFactory() to register the factory with the lexer.
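A minimal sketch of that approach might look like the following. It assumes the ANTLR 4 Java runtime is on the classpath. The names TaggedToken, TaggedTokenFactory, and the posTag field are hypothetical, and how the tag gets filled in (for example from the tagger's output after the token is created) is left open here.

```java
import org.antlr.v4.runtime.CharStream;
import org.antlr.v4.runtime.CommonToken;
import org.antlr.v4.runtime.CommonTokenFactory;
import org.antlr.v4.runtime.TokenSource;
import org.antlr.v4.runtime.misc.Pair;

/** Hypothetical token subclass that also remembers the POS tag of its word. */
class TaggedToken extends CommonToken {
    private String posTag;

    TaggedToken(int type, String text) {
        super(type, text);
    }

    TaggedToken(Pair<TokenSource, CharStream> source, int type, int channel, int start, int stop) {
        super(source, type, channel, start, stop);
    }

    String getPosTag() {
        return posTag;
    }

    void setPosTag(String posTag) {
        this.posTag = posTag;
    }
}

/** Factory that hands out TaggedToken instances instead of plain CommonToken. */
class TaggedTokenFactory extends CommonTokenFactory {
    @Override
    public CommonToken create(Pair<TokenSource, CharStream> source, int type, String text,
                              int channel, int start, int stop, int line, int charPositionInLine) {
        TaggedToken token = new TaggedToken(source, type, channel, start, stop);
        token.setLine(line);
        token.setCharPositionInLine(charPositionInLine);
        if (text != null) {
            // Explicit text overrides the text that would otherwise be read from the input.
            token.setText(text);
        }
        return token;
    }

    @Override
    public CommonToken create(int type, String text) {
        return new TaggedToken(type, text);
    }
}
```

With a generated lexer instance (say, a variable named lexer, which is an assumption here), the factory would be registered by calling lexer.setTokenFactory(new TaggedTokenFactory()) before the token stream is consumed. Tokens reaching the parser can then be cast to TaggedToken to read the tag.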
