C# BinaryReader ReadUTF from Java's DataOutputStream

Question

I have been working on converting two classes to C# (DataInputStream and DataOutputStream). The DataOutputStream class is finished; the remaining problems are all in the InputStream class.

Note: I'm not using the Encoding class in C# because Java's DataInput / DataOutputStream uses a custom ("modified") UTF-8 encoding.
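To make the note above concrete, here is a minimal sketch of where the two encodings differ. The class and method names (`ModifiedUtf8Demo`, `EncodeModified`, `Hex`) are made up for illustration; the byte layout follows the same three branches as the WriteUTF port in this question, without the 2-byte length prefix.

```csharp
using System;
using System.IO;
using System.Text;

public static class ModifiedUtf8Demo
{
    // Hypothetical helper: encodes a string with the modified UTF-8 rules
    // (payload only, no length prefix).
    public static byte[] EncodeModified(string s)
    {
        var ms = new MemoryStream();
        foreach (char ch in s)
        {
            int c = ch;
            if (c >= 0x0001 && c <= 0x007F)
            {
                // Plain ASCII except NUL: one byte
                ms.WriteByte((byte)c);
            }
            else if (c > 0x07FF)
            {
                // Three bytes; surrogate halves are each encoded this way
                ms.WriteByte((byte)(0xE0 | ((c >> 12) & 0x0F)));
                ms.WriteByte((byte)(0x80 | ((c >> 6) & 0x3F)));
                ms.WriteByte((byte)(0x80 | (c & 0x3F)));
            }
            else
            {
                // Two bytes; this branch also covers U+0000 (C0 80)
                ms.WriteByte((byte)(0xC0 | ((c >> 6) & 0x1F)));
                ms.WriteByte((byte)(0x80 | (c & 0x3F)));
            }
        }
        return ms.ToArray();
    }

    // Formats bytes as "C0-80" style hex for easy comparison.
    public static string Hex(byte[] bytes) => BitConverter.ToString(bytes);
}
```

For the string "\0", `Encoding.UTF8.GetBytes` yields the single byte 00, while `EncodeModified` yields C0-80. For a character outside the BMP such as "\uD83D\uDE00", standard UTF-8 yields F0-9F-98-80, while the modified form yields ED-A0-BD-ED-B8-80. For strings without such characters the two encodings produce the same bytes.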

Basically, I have the following code (C#), which uses the BinaryReader class:

public String ReadUTF()
    {
        int utflen = this.ReadUnsignedShort ();
        // Allocate the buffers up front. Leaving them null and then testing
        // bytearr.Length throws a NullReferenceException.
        byte[] bytearr = new byte[utflen];
        char[] chararr = new char[utflen];

        int c, char2, char3;
        int count = 0;
        int chararr_count=0;

        this.ReadFully(bytearr, 0, utflen);

        while (count < utflen) {
            c = (int) bytearr[count] & 0xff;
            if (c > 127) break;
            count++;
            chararr[chararr_count++]=(char)c;
        }

        while (count < utflen) {
            c = (int) bytearr[count] & 0xff;
            switch (c >> 4) {
            case 0: case 1: case 2: case 3: case 4: case 5: case 6: case 7:
                /* 0xxxxxxx*/
                count++;
                chararr[chararr_count++]=(char)c;
                break;
            case 12: case 13:
                /* 110x xxxx   10xx xxxx*/
                count += 2;
                if (count > utflen)
                    throw new Exception(
                        "malformed input: partial character at end");
                char2 = (int) bytearr[count-1];
                if ((char2 & 0xC0) != 0x80)
                    throw new Exception(
                        "malformed input around byte " + count);
                chararr[chararr_count++]=(char)(((c & 0x1F) << 6) |
                                                (char2 & 0x3F));
                break;
            case 14:
                /* 1110 xxxx  10xx xxxx  10xx xxxx */
                count += 3;
                if (count > utflen)
                    throw new Exception(
                        "malformed input: partial character at end");
                char2 = (int) bytearr[count-2];
                char3 = (int) bytearr[count-1];
                if (((char2 & 0xC0) != 0x80) || ((char3 & 0xC0) != 0x80))
                    throw new Exception(
                        "malformed input around byte " + (count-1));
                chararr[chararr_count++]=(char)(((c     & 0x0F) << 12) |
                                                ((char2 & 0x3F) << 6)  |
                                                ((char3 & 0x3F) << 0));
                break;
            default:
                /* 10xx xxxx,  1111 xxxx */
                throw new Exception(
                    "malformed input around byte " + count);
            }
        }
        // The number of chars produced may be less than utflen
        return new String(chararr, 0, chararr_count);
    }

This is my ReadUnsignedShort method:

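public int ReadUnsignedShort()
    {
        // ReadByte() returns one raw byte and throws EndOfStreamException at
        // end of stream. Read() would instead decode a character using the
        // reader's Encoding, which corrupts byte values above 127.
        int ch1 = BinaryReader.ReadByte();
        int ch2 = BinaryReader.ReadByte();
        return (ch1 << 8) + (ch2 << 0);
    }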
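One detail worth spelling out: Java's DataOutputStream writes the length prefix high byte first (big-endian), whereas `BinaryReader.ReadUInt16` reads little-endian, so the built-in method cannot simply be swapped in. A minimal sketch, with the hypothetical names `BigEndianDemo` and `ReadUnsignedShortBE`:

```csharp
using System.IO;

public static class BigEndianDemo
{
    // Reads a 2-byte unsigned value, high byte first, as Java writes it.
    public static int ReadUnsignedShortBE(BinaryReader reader)
    {
        int hi = reader.ReadByte(); // throws EndOfStreamException at end of stream
        int lo = reader.ReadByte();
        return (hi << 8) | lo;
    }
}
```

Given the two bytes 01 02, `ReadUnsignedShortBE` returns 258 (0x0102), while `ReadUInt16` on the same bytes returns 513 (0x0201).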

And this is the ReadFully method it uses:

public void ReadFully(byte[] b, int off, int len)
    {
        if(len < 0)
        {
            throw new IndexOutOfRangeException();
        }

        int n = 0;
        while(n < len) 
        {
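            int count = ClientInput.Read(b, off + n, len - n);
            // In .NET, Read returns 0 (not -1) at end of stream, so test for
            // <= 0 to avoid looping forever on a truncated stream.
            if(count <= 0)
            {
                throw new EndOfStreamException(); // Temp - to be changed
            }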
            n += count;
        }
    }

My problem with the OutputStream was that I was using Write(int) instead of the Write(byte) function, but I don't think that is the case here; either that or I'm blind.

If you're interested in how the UTF string is sent, here is the C# conversion:

public int WriteUTF(string str)
    {
        int strlen = str.Length;
        int utflen = 0;
        int c, count = 0;

        for(int i = 0; i < strlen; i++) 
        {
            c = str[i];
            if((c >= 0x0001) && (c <= 0x007F)) 
            {
                utflen++;
            } 
            else if(c > 0x07FF)
            {
                utflen += 3;
            }
            else
            {
                utflen += 2;
            }
        }

        if(utflen > 65535)
        {
            throw new Exception("Encoded string is too long: " + utflen + " bytes");
        }

        // utflen payload bytes plus the 2-byte length prefix
        byte[] bytearr = new byte[utflen + 2];

        bytearr[count++] = (byte) (((uint)utflen >> 8) & 0xFF);
        bytearr[count++] = (byte) (((uint)utflen >> 0) & 0xFF);

        int x = 0;
        for(x = 0; x < strlen; x++) 
        {
            c = str[x];
            if (!((c >= 0x0001) && (c <= 0x007F))) break;
            bytearr[count++] = (byte)c;
        }

        for(;x < strlen; x++)
        {
            c = str[x];
            if ((c >= 0x0001) && (c <= 0x007F)) 
            {
                bytearr[count++] = (byte)c;
            }
            else if (c > 0x07FF)
            {
                bytearr[count++] = (byte) (0xE0 | ((c >> 12) & 0x0F));
                bytearr[count++] = (byte) (0x80 | ((c >>  6) & 0x3F));
                bytearr[count++] = (byte) (0x80 | ((c >>  0) & 0x3F));
            }
            else
            {
                bytearr[count++] = (byte) (0xC0 | ((c >>  6) & 0x1F));
                bytearr[count++] = (byte) (0x80 | ((c >>  0) & 0x3F));
            }
        }
        ClientOutput.Write (bytearr, 0, utflen+2);
        return utflen + 2;
    }

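I hope I've provided enough information for you to help with reading the UTF value; this has really put a roadblock in my project's progress.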

Answer 1

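If I understand the "question" correctly (such as it is: you say there's a "roadblock", but you don't explain what the "roadblock" actually is), you are trying to implement code in C# that reads text from and writes text to a stream. If so, then (and I know that if you're new to .NET this isn't immediately obvious) explicitly handling the text encoding yourself is crazy.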

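BinaryReader and BinaryWriter have methods to handle this. When you create the object, you can pass an Encoding instance (e.g. System.Text.Encoding.UTF8, System.Text.Encoding.Unicode, etc.), which is used to interpret or create the binary data for text. You can read text with BinaryReader.ReadChars(int) and write text with BinaryWriter.Write(char[]).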

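If for some reason that doesn't work, you can at least use an Encoding instance directly to interpret or create the binary data for some text. Encoding.GetString(byte[]) converts binary to text, and Encoding.GetBytes(string) converts text to binary. Again, use the specific Encoding instance that matches the actual text encoding you are dealing with.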
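As a rough sketch of the Encoding-based approach described above, assuming the stream holds a 2-byte big-endian length followed by that many bytes of text (the layout Java's writeUTF produces). The names `SimpleUtfReader` and `ReadLengthPrefixedUtf8` are hypothetical. Note the caveat from the question: Java's format is a modified UTF-8, so this only matches Java's output when the string contains no U+0000 and no characters outside the BMP.

```csharp
using System.IO;
using System.Text;

public static class SimpleUtfReader
{
    // Reads a big-endian 2-byte length, then decodes that many bytes as
    // standard UTF-8. Assumption: the payload contains no modified-UTF-8
    // special cases (encoded NUL or surrogate pairs).
    public static string ReadLengthPrefixedUtf8(BinaryReader reader)
    {
        int length = (reader.ReadByte() << 8) | reader.ReadByte();
        byte[] payload = reader.ReadBytes(length);
        if (payload.Length != length)
        {
            // ReadBytes returns fewer bytes if the stream ends early
            throw new EndOfStreamException();
        }
        return Encoding.UTF8.GetString(payload);
    }
}
```

For the bytes 00 05 68 65 6C 6C 6F this returns "hello". Payloads that do contain the special cases will not decode as the original string with this approach, which is the situation a hand-written decoder is meant to cover.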

Answer 2

I have written C# conversions of Java's DataInputStream and DataOutputStream; you can get them here.

https://bitbucket.org/CTucker1327/c-datastreams/src

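To construct these classes, pass a BinaryWriter or BinaryReader to the constructor.

Constructing a DataOutputStream

// "out" is a reserved keyword in C#, so a different variable name is used
DataOutputStream output = new DataOutputStream(new BinaryWriter(Stream));

Constructing a DataInputStream

// "in" is likewise reserved in C#
DataInputStream input = new DataInputStream(new BinaryReader(Stream));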

Answer 1
1 2014-10-17 02:53:19

Answer 2
-2 Accepted 2014-10-17 10:05:41
