Character Encoding Conversion In Groovy From UTF-8 to EUC-JP

One of our services requires character encoding conversion: we receive strings encoded in UTF-8, must convert them to EUC-JP, and then compute a hash of the result. The service is written in Groovy on JDK 8.

In PHP, a similar solution works fine for us:

$encodedToEucJp = mb_convert_encoding($inputStringWithUtf8, "EUC-JP");
print_r(md5($encodedToEucJp));

We have tried many approaches, for example:

java.security.MessageDigest.getInstance('MD5')
    .digest(new String(inputStringWithUtf8.getBytes("UTF-8"), "EUC-JP")
        .getBytes("EUC-JP"))
    .encodeHex()
    .toString();

However, for some characters this produces a different digest than our PHP code does. A few examples are ―, ĭ, ? etc. As a result, we cannot produce the same digest from the same input in both the PHP and Java systems.

Thanks in advance.

The error is in this part of the code:

new String(inputStringWithUtf8.getBytes("UTF-8"), "EUC-JP")

Basically, you are interpreting a UTF-8 byte array as if it were encoded in EUC-JP, which makes no sense: it decodes the bytes with the wrong charset and garbles the string before it is ever re-encoded.
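To see the effect in isolation, here is a minimal sketch in plain Java (which Groovy can run as-is on JDK 8). The class and method names are made up for illustration, and it assumes the JDK ships the EUC-JP charset, which standard JDK 8 installations normally do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MisdecodeDemo {
    // Hypothetical helper: reproduces the mistake by decoding UTF-8 bytes
    // with the EUC-JP charset.
    static String misdecode(String input) {
        byte[] utf8 = input.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, Charset.forName("EUC-JP"));
    }

    public static void main(String[] args) {
        // "\u65e5\u672c\u8a9e" is the Japanese word for "Japanese language".
        String input = "\u65e5\u672c\u8a9e";
        // The round trip does not give the original text back.
        System.out.println(input.equals(misdecode(input))); // prints "false"
        // Pure ASCII survives, because ASCII bytes are the same in both encodings.
        System.out.println("abc".equals(misdecode("abc"))); // prints "true"
    }
}
```

This is probably why the original attempt looked correct for simple input and only went wrong for certain characters.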

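A Java (or Groovy) String is already Unicode internally, so encoding it directly to EUC-JP is enough. The following code should do the job: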

    java.security.MessageDigest.getInstance('MD5')
            .digest(inputStringWithUtf8.getBytes("EUC-JP"))
            .encodeHex()
            .toString();
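If digests still differ for a few characters, it may help to find out whether those characters can be encoded in the JDK's EUC-JP at all, since getBytes can silently substitute a replacement byte for anything it cannot map. The following is a sketch in plain Java, not a definitive implementation: the class and method names are hypothetical, and it does not attempt to confirm whether PHP's mb_convert_encoding uses the same mapping table as the JDK.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class EucJpMd5 {
    private static final Charset EUC_JP = Charset.forName("EUC-JP");

    // True if every character of the input has an EUC-JP mapping in this JDK.
    static boolean isEncodable(String input) {
        return EUC_JP.newEncoder().canEncode(input);
    }

    // Encodes to EUC-JP and fails loudly instead of substituting a replacement byte.
    static byte[] encodeStrict(String input) {
        CharsetEncoder encoder = EUC_JP.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            ByteBuffer buf = encoder.encode(CharBuffer.wrap(input));
            byte[] bytes = new byte[buf.remaining()];
            buf.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("Not encodable as EUC-JP: " + input, e);
        }
    }

    // Lower-case hex MD5 of the given bytes, the same format as PHP's md5().
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        // ASCII is encoded identically in EUC-JP, so this is the plain MD5 of "abc".
        System.out.println(md5Hex(encodeStrict("abc"))); // 900150983cd24fb0d6963f7d28e17f72
    }
}
```

Calling EucJpMd5.isEncodable on each problem character shows whether the JDK can map it. For characters that do encode, comparing the raw bytes from encodeStrict with the output of bin2hex on the PHP side should show where the two systems diverge.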
