
String from Base64 decoded value corrupt (Java)

I'm sending a base64 encoded string from a Classic ASP page to a JSP page. The string was RC4 encrypted prior to being encoded.

Now, I observed that, within the ASP page, encoding and decoding the string with base64 works fine. However, the base64 decoded string on the JSP page is incorrect. I also tried decoding the string in Eclipse and got the same results. It seems to be related to the character encoding, but I'm struggling to determine precisely what the issue is.

  • base64 encoded string: yOBIc4FY
  • base64 decoded string (from ASP page): ÈàHsX (correct)
  • base64 decoded string (from JSP page and Eclipse): ÈàHs?X (incorrect)

Java/JSP code:

import org.apache.commons.codec.binary.Base64;

String base64String = "yOBIc4FY";

byte[] decodedBase64Byte = Base64.decodeBase64(base64String);

// ÈàHs?X
String decodedBase64String = new String(decodedBase64Byte, "ISO-8859-1");

// ÈàHs?X
decodedBase64String = new String(decodedBase64Byte, "windows-1252");

// ??Hs?X
decodedBase64String = new String(decodedBase64Byte, "utf-8");

To reiterate, the correct value should be ÈàHsX. I don't understand what the problem is. Any help would be appreciated.

Thank you.

Update

Let me expound on this further.

The RC4 cryptographic algorithm for Classic ASP is widely available, so I won't waste real estate posting it here. However, I will show the base64 encoder/decoder I'm using for Classic ASP below.

For RC4, the plaintext value I'm using is foobar, and the key I'm using is test. Ostensibly, decoding the base64 string should return the ciphertext, and decrypting the ciphertext should return the plaintext value.

' Functions to provide encoding/decoding of strings with Base64.
' 
' Encoding: myEncodedString = base64_encode( inputString )
' Decoding: myDecodedString = base64_decode( encodedInputString )
'
' Programmed by Markus Hartsmar for ShameDesigns in 2002. 
' Email me at: mark@shamedesigns.com
' Visit our website at: http://www.shamedesigns.com/
'

    Dim Base64Chars
    Base64Chars =   "ABCDEFGHIJKLMNOPQRSTUVWXYZ" & _
            "abcdefghijklmnopqrstuvwxyz" & _
            "0123456789" & _
            "+/"


    ' Functions for encoding string to Base64
    Public Function base64_encode( byVal strIn )
        Dim c1, c2, c3, w1, w2, w3, w4, n, strOut
        For n = 1 To Len( strIn ) Step 3
            c1 = Asc( Mid( strIn, n, 1 ) )
            c2 = Asc( Mid( strIn, n + 1, 1 ) + Chr(0) )
            c3 = Asc( Mid( strIn, n + 2, 1 ) + Chr(0) )
            w1 = Int( c1 / 4 ) : w2 = ( c1 And 3 ) * 16 + Int( c2 / 16 )
            If Len( strIn ) >= n + 1 Then 
                w3 = ( c2 And 15 ) * 4 + Int( c3 / 64 ) 
            Else 
                w3 = -1
            End If
            If Len( strIn ) >= n + 2 Then 
                w4 = c3 And 63 
            Else 
                w4 = -1
            End If
            strOut = strOut + mimeencode( w1 ) + mimeencode( w2 ) + _
                      mimeencode( w3 ) + mimeencode( w4 )
        Next
        base64_encode = strOut
    End Function

    Private Function mimeencode( byVal intIn )
        If intIn >= 0 Then 
            mimeencode = Mid( Base64Chars, intIn + 1, 1 ) 
        Else 
            mimeencode = ""
        End If
    End Function    


    ' Function to decode string from Base64
    Public Function base64_decode( byVal strIn )
        Dim w1, w2, w3, w4, n, strOut
        For n = 1 To Len( strIn ) Step 4
            w1 = mimedecode( Mid( strIn, n, 1 ) )
            w2 = mimedecode( Mid( strIn, n + 1, 1 ) )
            w3 = mimedecode( Mid( strIn, n + 2, 1 ) )
            w4 = mimedecode( Mid( strIn, n + 3, 1 ) )
            If w2 >= 0 Then _
                strOut = strOut + _
                    Chr( ( ( w1 * 4 + Int( w2 / 16 ) ) And 255 ) )
            If w3 >= 0 Then _
                strOut = strOut + _
                    Chr( ( ( w2 * 16 + Int( w3 / 4 ) ) And 255 ) )
            If w4 >= 0 Then _
                strOut = strOut + _
                    Chr( ( ( w3 * 64 + w4 ) And 255 ) )
        Next
        base64_decode = strOut
    End Function

    Private Function mimedecode( byVal strIn )
        If Len( strIn ) = 0 Then 
            mimedecode = -1 : Exit Function
        Else
            mimedecode = InStr( Base64Chars, strIn ) - 1
        End If
    End Function

From within ASP, the plaintext value is correctly recovered from the ciphertext:

plainText: foobar

cipherText: ÈàHsX

base64 String: yOBIc4FY

Decoded base64 String: ÈàHsX

Decrypted text: foobar

However, when the ciphertext is passed as a base64 string to JSP/Java, the results look like this:

plainText: foobar (from ASP)

cipherText: ÈàHsX (from ASP)

base64 String: yOBIc4FY

Decoded base64 String: ÈàHs?X

Decrypted text: foobßr

So, something is not adding up here. In fact, in Java, making one change in how I decrypt the ciphertext returns the proper decrypted text of foobar. The RC4 decryption in the Java code takes the ciphertext as type int[].

public int[] decrypt(int[] ciphertext, byte[] key) throws Exception {
    return encrypt(ciphertext, key);
}

In other words, I have to convert the ciphertext from type String to type int[]. I use the function below to do just that:

public static int[] convertToIntArray(byte[] input)
{
    int[] ret = new int[input.length];
    for (int i = 0; i < input.length; i++)
    {
        ret[i] = input[i] & 0xff; // Range 0 to 255
    }
    return ret;
}

I have two choices. I can decode the base64 string as type byte[] and decrypt, which will return foobar.

String base64String = "yOBIc4FY";

byte[] decodedBase64Byte = Base64.decodeBase64(base64String);

int[] cipheredText =  convertToIntArray(decodedBase64Byte);

Or, I can decode the base64 string as type byte[], convert it to type String, and then convert it back to type byte[] to decrypt, which will return foobßr.

String base64String = "yOBIc4FY";

byte[] decodedBase64Byte = Base64.decodeBase64(base64String);

// ÈàHs?X
String decodedBase64String = new String(decodedBase64Byte, "ISO-8859-1");

int[] cipheredText =  convertToIntArray(decodedBase64String.getBytes());

My guess, then, is that the original byte sequence is correct, since the RC4 decryption function successfully returns foobar. However, when I convert the byte sequence to a String using some character encoding, the value changes, ultimately producing a decrypted value of foobßr.

It still doesn't make sense why ASP and JSP/Java report slightly different ciphertext values. ASP has no trouble decoding the base64 string or decrypting the ciphertext back into its plaintext value. I can't tell if the issue is with ASP, JSP, or both.
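A note on the second option above: the String is created with ISO-8859-1, but it is converted back with the no-argument getBytes(), which uses the platform default charset instead. ISO-8859-1 maps each of the 256 byte values to the code point with the same number, so using it in both directions should preserve the bytes. The following is a minimal sketch of that difference, not the asker's code: the class and method names are hypothetical, and US-ASCII is only a stand-in for a default charset that cannot represent some of the characters (it alters more bytes than the single one reported in the question).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class CharsetRoundTripDemo {

    // Hypothetical helper: bytes -> String -> bytes, with explicit charsets
    // for each direction so the effect of a mismatch is visible.
    public static byte[] roundTrip(byte[] in, Charset decodeWith, Charset encodeWith) {
        String asText = new String(in, decodeWith);
        return asText.getBytes(encodeWith);
    }

    public static void main(String[] args) {
        byte[] raw = Base64.getDecoder().decode("yOBIc4FY");

        // Same charset both ways: ISO-8859-1 covers all 256 byte values.
        byte[] same = roundTrip(raw, StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1);
        System.out.println("ISO-8859-1 both ways preserved bytes: " + Arrays.equals(raw, same));

        // Mismatched charsets (US-ASCII stands in for an unsuitable default):
        // characters the encoder cannot represent are replaced with '?' (0x3f).
        byte[] mixed = roundTrip(raw, StandardCharsets.ISO_8859_1, StandardCharsets.US_ASCII);
        System.out.println("Mismatched charsets preserved bytes: " + Arrays.equals(raw, mixed));
    }
}
```

Even when the round trip is lossless, keeping the ciphertext as byte[] or int[] avoids the question entirely.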

The correct decoding of yOBIc4FY is 6 bytes, specifically:

c8 e0 48 73 81 58

The ÈàHsX value shown by ASP probably still contains the character 0x81; it is just not displayed because it is unprintable.

Proof:

y      O      B      I      c      4      F      Y
110010 001110 000001 001000 011100 111000 000101 011000

11001000 11100000 01001000 01110011 10000001 01011000
c8       e0       48       73       81       58
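The same regrouping can be checked in code. This is a small sketch that repeats the table above by hand instead of calling a library decoder; the class and method names are made up for illustration, and it assumes unpadded input whose length is a multiple of 4, which holds for yOBIc4FY.

```java
public class Base64BitsDemo {

    private static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Packs every 4 Base64 characters (4 x 6 bits) into 24 bits,
    // then splits those 24 bits into 3 bytes (3 x 8 bits).
    public static int[] decodeGroups(String s) {
        int[] out = new int[s.length() / 4 * 3];
        int o = 0;
        for (int i = 0; i < s.length(); i += 4) {
            int bits = 0;
            for (int j = 0; j < 4; j++) {
                bits = (bits << 6) | ALPHABET.indexOf(s.charAt(i + j));
            }
            out[o++] = (bits >> 16) & 0xff;
            out[o++] = (bits >> 8) & 0xff;
            out[o++] = bits & 0xff;
        }
        return out;
    }

    // Formats values in the range 0..255 as space-separated two-digit hex.
    public static String toHex(int[] values) {
        StringBuilder sb = new StringBuilder();
        for (int v : values) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", v));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex(decodeGroups("yOBIc4FY"))); // c8 e0 48 73 81 58
    }
}
```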

To address your follow-up question: you should use the byte array you get from the base64 decoder. Convert it to int[] if you need to, but don't create a String out of it, because the character encoding will mess it up:

static void printByteArray(byte[] bytes) {
    for (byte b : bytes) {
        System.out.print(Integer.toHexString(b & 0xff) + ", ");
    }
    System.out.println();
}

public static void main(String[] args) {

    byte[] cipherBytes = Base64.getDecoder().decode("yOBIc4FY");
    printByteArray(cipherBytes); // c8, e0, 48, 73, 81, 58 - correct

    cipherBytes = new String(cipherBytes).getBytes();
    printByteArray(cipherBytes); // c8, e0, 48, 73, 3f, 58 - wrong
    // your results may vary depending on your default charset,
    // these are for windows-1250
}

Here you can see that the original correct byte 0x81 was changed into a question mark ? (byte 0x3f), because 0x81 doesn't represent a valid character in the charset used when creating the String from the byte array.
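That single changed byte also seems to account for the foobßr result. RC4 is a stream cipher: each ciphertext byte is the plaintext byte XORed with a keystream byte, so one wrong ciphertext byte yields exactly one wrong plaintext byte. The sketch below does not implement RC4; it assumes the ASP code is a standard XOR stream cipher and recovers the keystream from the plaintext and ciphertext given in the question. The class and method names are hypothetical.

```java
public class KeystreamXorDemo {

    // XORs two equal-length arrays element by element.
    public static int[] xor(int[] a, int[] b) {
        int[] out = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i] ^ b[i];
        }
        return out;
    }

    // Decrypts the question's ciphertext after byte 0x81 has been replaced by 0x3f.
    public static String decryptCorrupted() {
        int[] cipher = {0xc8, 0xe0, 0x48, 0x73, 0x81, 0x58};
        int[] plain = {'f', 'o', 'o', 'b', 'a', 'r'};

        // keystream = ciphertext XOR plaintext (assumes a plain XOR stream cipher)
        int[] keystream = xor(cipher, plain);

        int[] corrupted = cipher.clone();
        corrupted[4] = 0x3f; // what the String round trip did to 0x81

        StringBuilder sb = new StringBuilder();
        for (int v : xor(corrupted, keystream)) {
            sb.append((char) v);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Only index 4 differs from "foobar": 0x3f ^ (0x81 ^ 0x61) = 0xdf, which is U+00DF.
        String result = decryptCorrupted();
        System.out.println(Integer.toHexString(result.charAt(4))); // df
    }
}
```

U+00DF is the character ß, so the first four characters and the last one decrypt correctly and only the fifth is wrong.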
