I'm trying to use Google Cloud Translate. I think the problem is that Google Cloud Translate uses UTF-8 and the JVM uses UTF-16, so I get some garbled characters in the translations. For instance:
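import com.google.cloud.translate.Translate;
import com.google.cloud.translate.Translate.TranslateOption;
import com.google.cloud.translate.TranslateOptions;
import com.google.cloud.translate.Translation;
// commons-lang 2.x; in commons-lang3 the equivalent method is unescapeHtml4
import org.apache.commons.lang.StringEscapeUtils;

public static void main(String... args) throws Exception {
    // Instantiates a client
    Translate translate = TranslateOptions.getDefaultInstance().getService();
    // The text to translate
    String text = "Bonjour, à qui dois-je répondre? Non, C'est l'inverse...";
    // Translates the text from French into English
    Translation translation =
        translate.translate(
            text,
            TranslateOption.sourceLanguage("fr"),
            TranslateOption.targetLanguage("en"));
    System.out.printf("Text: %s%n", text);
    // The translated text comes back HTML-escaped, so unescape it before printing
    System.out.printf("Translation: %s%n", StringEscapeUtils.unescapeHtml(translation.getTranslatedText()));
}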
Without the StringEscapeUtils.unescapeHtml call, this returns:
Translation: Hello, who should I answer? No, it&#39;s the opposite ...
instead of:
Translation: Hello, who should I answer? No, it's the opposite ...
We can't change the encoding of a Java String, and the Google Cloud API will not accept anything (byte[]?) other than a String.
Does anyone know how to fix it?
Thank you for reading.
Edit: This code is now working. I added StringEscapeUtils.unescapeHtml from the Apache Commons dependencies. I do not know if there is another way to do it.
It's not a UTF-8 / UTF-16 problem.
Google's response is HTML-encoded.
https://en.wikipedia.org/wiki/Unicode_and_HTML
This is common when you want to transmit Unicode characters using only ASCII in an XML/HTML context.
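To illustrate what "HTML-encoded" means here, below is a minimal sketch that decodes numeric character references such as &#39; using only the JDK. The class name HtmlEntityDemo and the method decodeNumericEntities are hypothetical names chosen for this example, not part of any Google or Apache API. It handles only numeric references (decimal and hexadecimal), not named entities like &amp;amp;, so a library such as Apache Commons remains the more complete option.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HtmlEntityDemo {
    // Matches decimal (&#39;) and hexadecimal (&#x27;) numeric character references.
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    // Hypothetical helper: replaces each numeric reference with the character it encodes.
    static String decodeNumericEntities(String input) {
        Matcher m = NUMERIC_REF.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = m.group(1) != null
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            // Leave the reference untouched if it is not a valid Unicode code point.
            String replacement = Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: No, it's the opposite ...
        System.out.println(decodeNumericEntities("No, it&#39;s the opposite ..."));
    }
}
```

The string passed in is already a normal Java String; only the escaped sequence inside it needs decoding, which is why changing the character encoding has no effect.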
Even though you have already found a solution, there is another fix that does not require an additional library.
As previously mentioned, the translate method returns an HTML-encoded String by default. It can return a plain-text String instead if the matching TranslateOption is passed in the method call.
The method call then looks something like this:
Translation translation = translate.translate(
    text,
    Translate.TranslateOption.sourceLanguage(from),
    Translate.TranslateOption.targetLanguage(to),
    Translate.TranslateOption.format("text")
);