MD5 hash with ISO-8859-1 strings in Java

Question

I'm implementing an interface for a digital payment service called Suomen Verkkomaksut. The information about the payment is sent to them via an HTML form. To ensure that no one tampers with the information during the transfer, an MD5 hash is calculated at both ends with a special key that is not sent to them.

My problem is that for some reason they seem to decide that the incoming data is encoded with ISO-8859-1 and not UTF-8. The hash that I send to them is calculated with UTF-8 strings, so it differs from the hash that they calculate.

I tried this with the following code:

String prehash = "6pKF4jkv97zmqBJ3ZL8gUw5DfT2NMQ|13466|123456||Testitilaus|EUR|http://www.esimerkki.fi/success|http://www.esimerkki.fi/cancel|http://www.esimerkki.fi/notify|5.1|fi_FI|0412345678|0412345678|esimerkki@esimerkki.fi|Matti|Meikäläinen||Testikatu 1|40500|Jyväskylä|FI|1|2|Tuote #101|101|1|10.00|22.00|0|1|Tuote #202|202|2|8.50|22.00|0|1";
String prehashIso = new String(prehash.getBytes("ISO-8859-1"), "ISO-8859-1");

String hash = Crypt.md5sum(prehash).toUpperCase(); 
String hashIso = Crypt.md5sum(prehashIso).toUpperCase();

Unfortunately both hashes are identical, with the value C83CF67455AF10913D54252737F30E21. The correct value for this example case is 975816A41B9EB79B18B3B4526569640E according to Suomen Verkkomaksut's documentation.

Is there a way to calculate an MD5 hash in Java with ISO-8859-1 strings?

UPDATE: While waiting for an answer from Suomen Verkkomaksut, I found an alternative way to make the hash. Michael Borgwardt corrected my understanding of String and encodings, and I looked for a way to make the hash from byte[].

Apache Commons is an excellent source of libraries, and I found their DigestUtils class, which has an md5Hex function that takes byte[] input and returns a 32-character hex string.

For some reason this still doesn't work. Both of these return the same value:

DigestUtils.md5Hex(prehash.getBytes());
DigestUtils.md5Hex(prehash.getBytes("ISO-8859-1"));

Answer 1

You seem to misunderstand how string encoding works, and your Crypt class's API is suspect.

Strings don't really "have an encoding" - an encoding is what you use to convert between Strings and bytes.

Java Strings are internally stored as UTF-16, but that does not really matter, as MD5 works on bytes, not Strings. Your Crypt.md5sum() method has to convert the Strings it's passed to bytes first - what encoding does it use to do that? That's probably the source of your problem.
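To make the point about bytes concrete, here is a minimal sketch (not the asker's Crypt class, whose internals are unknown). The class name EncodingDemo and the helper md5Hex are made up for illustration. It encodes the same String with two charsets and hashes the resulting bytes; the non-ASCII letters are written as \u escapes so the example does not depend on the source file's own encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class: the charset is chosen when the String becomes bytes.
class EncodingDemo {

    /** Encodes the text with the given charset and returns the MD5 of those bytes as lowercase hex. */
    static String md5Hex(String text, Charset charset) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(text.getBytes(charset));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b)); // always two hex digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5, so this should not happen.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        String name = "Meik\u00e4l\u00e4inen"; // "Meikäläinen", 11 characters

        // One byte per character in ISO-8859-1, two bytes per 'ä' in UTF-8.
        System.out.println(name.getBytes(StandardCharsets.ISO_8859_1).length);
        System.out.println(name.getBytes(StandardCharsets.UTF_8).length);

        // Different bytes in, different digests out.
        System.out.println(md5Hex(name, StandardCharsets.ISO_8859_1));
        System.out.println(md5Hex(name, StandardCharsets.UTF_8));
    }
}
```

The String itself never changes here; only the byte sequence handed to MessageDigest does, which is why the charset has to be chosen at the getBytes call.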

Your example code is pretty nonsensical, as the only effect of this line:

String prehashIso = new String(prehash.getBytes("ISO-8859-1"), "ISO-8859-1");

is to replace characters that cannot be represented in ISO-8859-1 with question marks.
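A small sketch of that round trip, in case it helps to see it. The class name RoundTripDemo and the sample strings are made up for illustration. Characters that exist in ISO-8859-1 survive unchanged, so for the prehash in the question the round trip is a no-op; only characters outside ISO-8859-1 (here the euro sign) come back as '?'.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for the String -> bytes -> String round trip.
class RoundTripDemo {

    /** Encodes to ISO-8859-1 bytes and decodes them again with the same charset. */
    static String roundTrip(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String city = "Jyv\u00e4skyl\u00e4"; // "Jyväskylä": every character exists in ISO-8859-1
        String price = "10 \u20ac";          // "10 €": the euro sign does not exist in ISO-8859-1

        System.out.println(roundTrip(city).equals(city)); // true, nothing changed
        System.out.println(roundTrip(price));             // 10 ?
    }
}
```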

Answer 2

Java has a standard java.security.MessageDigest class for calculating different hashes.

Here is the sample code:

import java.security.MessageDigest;

// Exception handling not shown

String prehash = ...

final byte[] prehashBytes= prehash.getBytes( "iso-8859-1" );

System.out.println( prehash.length( ) );
System.out.println( prehashBytes.length );

final MessageDigest digester = MessageDigest.getInstance( "MD5" );

digester.update( prehashBytes );

final byte[] digest = digester.digest( );

final StringBuilder hexString = new StringBuilder();

for ( final byte b : digest ) {
    final int intByte = 0xFF & b;

    // Pad values below 0x10 so that every byte yields exactly two hex digits
    if ( intByte < 0x10 )
    {
        hexString.append( "0" );
    }

    hexString.append(
        Integer.toHexString( intByte )
    );
}

System.out.println( hexString.toString( ).toUpperCase( ) );

Unfortunately for you it produces the same "C83CF67455AF10913D54252737F30E21" hash. So, I guess your Crypt class is exonerated. I specifically added the prehash and prehashBytes length printouts to verify that 'ISO-8859-1' is indeed used. In this case both are 328.

When I did prehash.getBytes( "utf-8" ) it produced "9CC2E0D1D41E67BE9C2AB4AABDB6FD3" (and the length of the byte array became 332). Note that this printed value is only 31 characters long, one short of a full MD5 hex digest, so a leading zero was evidently dropped in the hex conversion. Again, not the result you are looking for.

So, I guess Suomen Verkkomaksut does some massaging of the prehash string that they either did not document or that you have overlooked.

Answer 3

Not sure if you solved your problem, but I had a similar problem with ISO-8859-1 encoded strings containing the Nordic ä and ö characters when calculating a SHA-256 hash to compare with values in documentation. The following snippet worked for me:

import java.security.MessageDigest;
//imports omitted

@Test
public void test() throws ProcessingException {
    String test = "iamastringwithäöchars";
    System.out.println(this.digest(test));
}

public String digest(String data) throws ProcessingException {
    MessageDigest hash = null;

    try{
        hash = MessageDigest.getInstance("SHA-256");
    }
    catch(Throwable throwable){
        throw new ProcessingException(throwable);
    }
    byte[] digested = null;
    try {
        digested = hash.digest(data.getBytes("ISO-8859-1"));
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }

    String ret = BinaryUtils.BinToHexString(digested);
    return ret;
}

To transform bytes to a hex string there are many options, including the Apache Commons Codec Hex class mentioned in this thread.
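If you would rather not pull in a library for that step, a plain-JDK conversion is short. This is only a sketch, with a made-up class name HexDemo, and it is not the BinaryUtils helper used in the snippet above. The important detail is that every byte must produce exactly two digits, including values below 0x10.

```java
// Hypothetical helper: converts a byte array to lowercase hex, two digits per byte.
class HexDemo {

    private static final char[] HEX_DIGITS = "0123456789abcdef".toCharArray();

    static String toHex(byte[] bytes) {
        char[] out = new char[bytes.length * 2];
        for (int i = 0; i < bytes.length; i++) {
            int value = bytes[i] & 0xFF;             // treat the byte as unsigned
            out[2 * i] = HEX_DIGITS[value >>> 4];    // high nibble
            out[2 * i + 1] = HEX_DIGITS[value & 0x0F]; // low nibble
        }
        return new String(out);
    }

    public static void main(String[] args) {
        byte[] sample = {0x0A, (byte) 0xFF, 0x00};
        System.out.println(toHex(sample)); // 0aff00
    }
}
```

Call toUpperCase() on the result if the other side expects upper-case hex, as in the question.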

Answer 4

If you send UTF-8 encoded data that they treat as ISO-8859-1, then that could be the source of your problem. I suggest you either send the data in ISO-8859-1 or try to communicate to Suomen Verkkomaksut that you're sending UTF-8. In an HTTP-based protocol you do this by adding charset=utf-8 to Content-Type in the HTTP header.
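To see what actually differs on the wire, here is a rough sketch. It assumes the form is submitted URL-encoded, which the question does not state, and the class name FormEncodingDemo is made up. It uses the JDK's URLEncoder to percent-encode one field value under each charset.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo: the same form value percent-encoded under two charsets.
class FormEncodingDemo {

    static String encode(String value, String charsetName) {
        try {
            return URLEncoder.encode(value, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String city = "Jyv\u00e4skyl\u00e4"; // "Jyväskylä"

        System.out.println(encode(city, "ISO-8859-1")); // Jyv%E4skyl%E4
        System.out.println(encode(city, "UTF-8"));      // Jyv%C3%A4skyl%C3%A4
    }
}
```

A receiver that decodes the UTF-8 form as ISO-8859-1 sees two characters in place of each "ä", so whatever it hashes is no longer the string you hashed.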

A way to rule out some issues would be to try a prehash String that only contains characters that are encoded the same in UTF-8 and ISO-8859-1. From what I can see, you can achieve this by removing all "ä" characters in the string you've used.
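A quick way to check whether a test string qualifies is to compare its bytes under both charsets. This is a sketch with a made-up class name, EncodingCheck; the two encodings only agree for plain ASCII text.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: true when a String produces the same bytes in UTF-8 and ISO-8859-1.
class EncodingCheck {

    static boolean encodesIdentically(String s) {
        return Arrays.equals(
                s.getBytes(StandardCharsets.UTF_8),
                s.getBytes(StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) {
        String city = "Jyv\u00e4skyl\u00e4"; // "Jyväskylä"

        System.out.println(encodesIdentically("Testikatu 1"));               // true
        System.out.println(encodesIdentically(city));                        // false
        System.out.println(encodesIdentically(city.replace("\u00e4", "")));  // true
    }
}
```

If the hashes match for such a string but not for the original one, the mismatch is an encoding issue rather than a problem with the field order or the key.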

4 answers

Answer 1: score 9, 2009-12-03 10:43:55

Answer 2: score 2, accepted, 2009-12-03 12:06:08

Answer 3: score 2, 2011-07-12 09:08:56

Answer 4: score 1, 2009-12-03 10:33:59