We need character encoding conversion for one of our services. The input arrives as UTF-8 encoded text; we need to convert it to EUC-JP and then compute a hash of the result, in Groovy on JDK 8.
In PHP, a similar solution works fine for us:
$encodedToEucJp = mb_convert_encoding($inputStringWithUtf8, "EUC-JP");
print_r(md5($encodedToEucJp));
We have tried many approaches, e.g.:
java.security.MessageDigest.getInstance('MD5')
    .digest(new String(inputStringWithUtf8.getBytes("UTF-8"), "EUC-JP")
        .getBytes("EUC-JP"))
    .encodeHex()
    .toString();
However, this solution fails for some characters: it produces a digest different from the one our PHP code produces. A few such characters are ―, ĭ, ? etc. That is why we cannot produce the same digest from the same input in both the PHP and the Java system.
Thanks in advance.
The error is in this part of the code:
new String(inputStringWithUtf8.getBytes("UTF-8"), "EUC-JP")
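Basically, you are interpreting a UTF-8 byte array as if it were encoded in EUC-JP, which makes no sense: the bytes are decoded with the wrong charset, so the resulting string is no longer the original text.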
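To see the effect in isolation, here is a small sketch in plain Java (it should also work when called from Groovy). The class and method names are made up for illustration, and it assumes the EUC-JP charset is available in your JRE. A Java/Groovy String has no "UTF-8-ness" of its own; an encoding only applies when you turn it into bytes or back.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question.
class MisdecodeDemo {
    private static final Charset EUC_JP = Charset.forName("EUC-JP");

    // Reproduces the mistake: UTF-8 bytes decoded as if they were EUC-JP.
    static String misdecode(String input) {
        byte[] utf8 = input.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, EUC_JP);
    }

    // The direct conversion: encode the string straight to EUC-JP bytes.
    static byte[] encodeEucJp(String input) {
        return input.getBytes(EUC_JP);
    }

    public static void main(String[] args) {
        String hiraganaA = "\u3042"; // HIRAGANA LETTER A
        // ASCII survives the round trip because both encodings share it.
        System.out.println(misdecode("abc").equals("abc"));
        // Non-ASCII text does not survive the wrong decoding step.
        System.out.println(misdecode(hiraganaA).equals(hiraganaA));
        // Direct encoding yields the two EUC-JP bytes for this character.
        for (byte b : encodeEucJp(hiraganaA)) {
            System.out.printf("%02x ", b & 0xff);
        }
        System.out.println();
    }
}
```

This also explains why the original attempt can appear to work in tests that use ASCII-only input.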
The following code should do the job:
java.security.MessageDigest.getInstance('MD5')
    .digest(inputStringWithUtf8.getBytes("EUC-JP"))
    .encodeHex()
    .toString();
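If you prefer something that does not depend on Groovy's encodeHex(), a plain Java 8 version might look like the sketch below. The names EucJpMd5, md5HexEucJp and canEncode are invented for this example. The canEncode helper is only a diagnostic aid: characters that EUC-JP cannot represent are replaced with the charset's default replacement bytes by getBytes(Charset), and whether PHP substitutes the same bytes is something to verify on your side, not something this code guarantees.

```java
import java.nio.charset.Charset;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class for illustration.
class EucJpMd5 {
    private static final Charset EUC_JP = Charset.forName("EUC-JP");

    // Encodes the string directly to EUC-JP bytes and returns the MD5 as lowercase hex.
    static String md5HexEucJp(String input) throws NoSuchAlgorithmException {
        byte[] eucJpBytes = input.getBytes(EUC_JP);
        byte[] digest = MessageDigest.getInstance("MD5").digest(eucJpBytes);
        StringBuilder hex = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    // True if every character of the input has an EUC-JP representation in this JRE.
    static boolean canEncode(String input) {
        return EUC_JP.newEncoder().canEncode(input);
    }

    public static void main(String[] args) throws Exception {
        // ASCII bytes are identical in EUC-JP, so this is the well-known MD5 of "abc".
        System.out.println(md5HexEucJp("abc")); // prints 900150983cd24fb0d6963f7d28e17f72
        // Report whether a sample string can be represented at all.
        System.out.println(canEncode("\u3042"));
    }
}
```

Running canEncode on the inputs whose digests still differ from PHP may help tell apart "character not representable in EUC-JP" from other causes.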