
Splitting string into sentences using regex

I'm trying to split a paragraph into sentences. At the moment I'm splitting on ".", which works fine, but I can't get it to split correctly when a sentence can end in ".", "?" or "!".

So far my code is:

String[] sentences = everything.split("(?<=[a-z])\\.\\s+");

Thanks

If you don't want to remove ".", "!" or "?" from the results, put the punctuation inside the lookbehind so that only the whitespace is consumed:

    String[] sentences = everything.split("(?<=[a-z][!?.])\\s+"); 
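To see what that pattern does, here is a minimal sketch. The class name, the method name, and the sample text are made up for illustration; only the regex comes from the answer above.

```java
import java.util.Arrays;

public class SentenceSplitKeep {
    // Splits on whitespace that directly follows a lowercase letter plus one of ! ? .
    // The lookbehind is zero-width, so the punctuation stays attached to each sentence.
    public static String[] splitKeepingPunctuation(String text) {
        return text.split("(?<=[a-z][!?.])\\s+");
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not from the original question.
        String everything = "Is it late? Yes it is! Then we go home.";
        System.out.println(Arrays.toString(splitKeepingPunctuation(everything)));
        // prints [Is it late?, Yes it is!, Then we go home.]
    }
}
```

Note that the [a-z] in the lookbehind means a sentence ending in a digit or an uppercase letter (for example "It costs 5.") will not be split at that point.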

Use a character class. You don't need the lookbehind either; use a word boundary instead:

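String[] sentences = everything.split("\\b[.!?]\\s+");

"[.!?]" means "either ., ! or ?". The word boundary \\b requires that a word character (such as a letter or digit) precede the end-of-sentence character. Note that this pattern consumes the punctuation along with the whitespace, so the punctuation is removed from the results.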
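A minimal sketch of the word-boundary version follows. The class name, the method name, and the sample text are made up for illustration; only the regex comes from the answer above.

```java
import java.util.Arrays;

public class SentenceSplitBoundary {
    // Splits on one of . ! ? that follows a word character and is followed by whitespace.
    // The punctuation is part of the match, so split() drops it from the pieces.
    public static String[] splitAtBoundary(String text) {
        return text.split("\\b[.!?]\\s+");
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not from the original question.
        String everything = "Is it late? Yes it is! Then we go home.";
        System.out.println(Arrays.toString(splitAtBoundary(everything)));
        // prints [Is it late, Yes it is, Then we go home.]
    }
}
```

The last sentence keeps its period because no whitespace follows it, so the pattern does not match there.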

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 