Stanford CoreNLP cannot detect sentences with numbering

Question

I have a Word document with numbered items such as 1., 2., etc., and I want to extract sentences from it. I am using Stanford CoreNLP 4.0.0 with stanford-corenlp-models-current.jar. Normal sentence extraction returns each number as a separate sentence. Suppose the document contains:

1. Abcd efgh....
2. Ijkl mnop....

Sentence extraction returns 1 as one sentence and Abcd efgh as another.

Similarly, it returns 2 as one sentence and Ijkl mnop as another.

I tried the boundariesToDiscard property with different patterns but got the same result, and the entity mentions are also wrong in this case.

Please help me resolve this issue.

Thanks in advance.

Answer 1

I solved the problem by setting the following property:

props.setProperty("ssplit.eolonly", "true");
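
For context, ssplit.eolonly tells the sentence splitter to break only at line ends, so this fix assumes each numbered item sits on its own line. Below is a minimal sketch of how the full property set might be assembled. The class name SentencePerLineConfig and the choice of annotators are assumptions for illustration, not part of the original answer; the block uses only java.util.Properties so it runs without the CoreNLP jars.

```java
import java.util.Properties;

// Hypothetical helper class (name is an assumption, not from the answer).
class SentencePerLineConfig {

    // Builds properties that ask CoreNLP to treat every input line as one sentence.
    static Properties build() {
        Properties props = new Properties();
        // Assumed annotator list: only tokenization and sentence splitting.
        // Append others (for example pos, lemma, ner) if entity mentions are needed.
        props.setProperty("annotators", "tokenize,ssplit");
        // The setting from the answer: split sentences only at end of line.
        props.setProperty("ssplit.eolonly", "true");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build();
        // With CoreNLP on the classpath, the properties would be passed on like this:
        //   StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        //   CoreDocument doc = new CoreDocument(text);
        //   pipeline.annotate(doc);
        //   for (CoreSentence s : doc.sentences()) { System.out.println(s.text()); }
        props.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Because the split happens only at line ends, a line that holds several sentences would come back as a single sentence, so this setting suits list-style input better than running prose.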
