How to replace a non-English character with an English character

Original 2020-06-10 05:40:22 5 1 java/ python/ php

I have a strange problem. Google Cloud Vision returns text containing non-English characters where the source actually has plain English characters. It is an OCR mistake by Google Cloud Vision.

I'm getting a word like this: Héllo

Notice that é is not an English character.

I want to convert it into plain "Hello" so I can process the word.

I'm not looking for a programming answer. I'm just looking for ways to do this.

Any hint would be useful.

Thanks!

1 answer

If Apache Commons is an option for you, you could use the StringUtils class from Apache Commons Lang. Its stripAccents method should suit your needs. From the source code you can see that it relies on java.text.Normalizer, so you could also look into that.
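For illustration, here is a minimal sketch of the java.text.Normalizer route using only the JDK. The class name AccentStripper and its stripAccents helper are hypothetical names chosen for this example; this is not the Apache Commons implementation, only the same general idea: decompose each character into a base letter plus combining marks, then drop the marks.

```java
import java.text.Normalizer;

// Hypothetical helper class for this example (not part of any library).
class AccentStripper {

    // Decompose to NFD so that "é" becomes "e" followed by a combining
    // acute accent (U+0301), then delete every combining mark (\p{M}).
    static String stripAccents(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // "H\u00e9llo" is "Héllo", written with an escape to avoid
        // source-file encoding problems.
        System.out.println(stripAccents("H\u00e9llo")); // prints "Hello"
    }
}
```

This only handles characters that have a canonical decomposition. A letter such as ø has none, so it passes through unchanged and would need an explicit replacement table if the OCR output contains it.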

Replace non english character in a string with utf-8 character in Android / Java

How to escape the non english character at server side

non-english character displaylike this?

how to read and store non-English character in java on windows os

xstream handles non-english character

Pass data of non english character to web server

Grabbing a non-english character in jexcelapi

Jsoup Whitelist: Parsing non-english character

UTF-8 string converts non english character to invalid character

Apache POI or java.io support non-English character or not?

The technical posts on this site follow the CC BY-SA 4.0 license. If you reprint, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.


solution1 0 ACCEPTED 2020-06-10 05:50:09
