
How do I determine a word boundary in a Unicode stream in C#?

I'm reading a Unicode stream and would rather not pass the entire string through a regex. Is there a simple, reliable character I can use to split words across languages?

My byte array will most likely be encoded as UTF-16 or UTF-8.

If you are using Java, you can use BreakIterator.
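For illustration, here is a minimal sketch of how BreakIterator might be used for this in Java. It is not C#, so it only shows the approach the answer names. The class name `WordBoundaries` and the helper `words` are hypothetical, and the sketch assumes the bytes have already been decoded into a `String`. The filter that drops segments without letters or digits is also an assumption about what counts as a "word".

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, for illustration only.
final class WordBoundaries {

    // Splits text at the boundaries reported by BreakIterator's word instance
    // and keeps only the segments that contain at least one letter or digit,
    // so whitespace and punctuation segments are dropped.
    static List<String> words(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);

        List<String> result = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end);
            // Assumption: a "word" is any segment with a letter or digit in it.
            if (segment.codePoints().anyMatch(Character::isLetterOrDigit)) {
                result.add(segment);
            }
        }
        return result;
    }
}
```

Calling `WordBoundaries.words("Hello, world!", Locale.ROOT)` returns the list `[Hello, world]`. The boundaries come from the iterator's rules rather than from a single separator character, which is why this approach can work across languages where no such character exists.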
