C# BinaryReader ReadUTF from Java's DataOutputStream

I've been converting Java's DataInputStream and DataOutputStream classes to C#. The DataOutputStream class is finished; the remaining problems are all in the DataInputStream class.

Note: I'm not using the Encoding class in C# because Java's DataInputStream and DataOutputStream use a custom (modified) UTF-8 encoding.
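To make the note above concrete, here is a minimal sketch of where the two encodings diverge. It assumes the usual description of Java's modified UTF-8: U+0000 takes two bytes, and each half of a surrogate pair is encoded separately as three bytes. The class and method names (ModifiedUtf8Demo, ModifiedUtf8Length) are made up for illustration.

```csharp
using System;
using System.Text;

static class ModifiedUtf8Demo
{
    // Number of payload bytes for s in Java's modified UTF-8
    // (the bytes that follow the 2-byte length prefix).
    public static int ModifiedUtf8Length(string s)
    {
        int n = 0;
        foreach (char c in s)
        {
            if (c >= 0x0001 && c <= 0x007F) n += 1; // plain ASCII except NUL
            else if (c > 0x07FF) n += 3;            // includes each surrogate half
            else n += 2;                            // includes U+0000
        }
        return n;
    }

    static void Main()
    {
        string nul = "\0";
        string emoji = "\uD83D\uDE00"; // U+1F600 as a surrogate pair

        Console.WriteLine(Encoding.UTF8.GetByteCount(nul));   // 1
        Console.WriteLine(ModifiedUtf8Length(nul));           // 2
        Console.WriteLine(Encoding.UTF8.GetByteCount(emoji)); // 4
        Console.WriteLine(ModifiedUtf8Length(emoji));         // 6
    }
}
```

For text that contains neither U+0000 nor characters outside the Basic Multilingual Plane, both counts agree.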

Basically, I have this C# code, which uses the BinaryReader class:

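public String ReadUTF()
    {
        int utflen = this.ReadUnsignedShort ();
        // Allocate the buffers up front. Reading bytearr.Length while the array
        // is still null throws a NullReferenceException.
        // Each encoded byte yields at most one char, so utflen entries suffice.
        byte[] bytearr = new byte[utflen];
        char[] chararr = new char[utflen];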

        int c, char2, char3;
        int count = 0;
        int chararr_count=0;

        this.ReadFully(bytearr, 0, utflen);

        while (count < utflen) {
            c = (int) bytearr[count] & 0xff;
            if (c > 127) break;
            count++;
            chararr[chararr_count++]=(char)c;
        }

        while (count < utflen) {
            c = (int) bytearr[count] & 0xff;
            switch (c >> 4) {
            case 0: case 1: case 2: case 3: case 4: case 5: case 6: case 7:
                /* 0xxxxxxx*/
                count++;
                chararr[chararr_count++]=(char)c;
                break;
            case 12: case 13:
                /* 110x xxxx   10xx xxxx*/
                count += 2;
                if (count > utflen)
                    throw new Exception(
                        "malformed input: partial character at end");
                char2 = (int) bytearr[count-1];
                if ((char2 & 0xC0) != 0x80)
                    throw new Exception(
                        "malformed input around byte " + count);
                chararr[chararr_count++]=(char)(((c & 0x1F) << 6) |
                                                (char2 & 0x3F));
                break;
            case 14:
                /* 1110 xxxx  10xx xxxx  10xx xxxx */
                count += 3;
                if (count > utflen)
                    throw new Exception(
                        "malformed input: partial character at end");
                char2 = (int) bytearr[count-2];
                char3 = (int) bytearr[count-1];
                if (((char2 & 0xC0) != 0x80) || ((char3 & 0xC0) != 0x80))
                    throw new Exception(
                        "malformed input around byte " + (count-1));
                chararr[chararr_count++]=(char)(((c     & 0x0F) << 12) |
                                                ((char2 & 0x3F) << 6)  |
                                                ((char3 & 0x3F) << 0));
                break;
            default:
                /* 10xx xxxx,  1111 xxxx */
                throw new Exception(
                    "malformed input around byte " + count);
            }
        }
        // The number of chars produced may be less than utflen
        return new String(chararr, 0, chararr_count);
    }

Here's my ReadUnsignedShort method:

public int ReadUnsignedShort()
    {
        // ReadByte returns one raw byte and throws EndOfStreamException at end of stream.
        // Read() would instead decode a character using the reader's Encoding.
        int ch1 = BinaryReader.ReadByte();
        int ch2 = BinaryReader.ReadByte();
        return (ch1 << 8) + (ch2 << 0);
    }

Here's the ReadFully method that it uses:

public void ReadFully(byte[] b, int off, int len)
    {
        if(len < 0)
        {
            throw new IndexOutOfRangeException();
        }

        int n = 0;
        while(n < len) 
        {
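            int count = ClientInput.Read(b, off + n, len - n);
            // Stream.Read returns 0 at end of stream (never -1),
            // so treat any non-positive count as end of stream.
            if(count <= 0)
            {
                throw new EndOfStreamException(); // Temp - to be changed
            }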
            n += count;
        }
    }

With the output stream, the problem was that I was calling Write(int) instead of Write(byte). I don't think that's the case here, unless I'm overlooking something.

If you're interested in how the UTF string is sent, here's the C# conversion of WriteUTF:

public int WriteUTF(string str)
    {
        int strlen = str.Length;
        int utflen = 0;
        int c, count = 0;

        for(int i = 0; i < strlen; i++) 
        {
            c = str[i];
            if((c >= 0x0001) && (c <= 0x007F)) 
            {
                utflen++;
            } 
            else if(c > 0x07FF)
            {
                utflen += 3;
            }
            else
            {
                utflen += 2;
            }
        }

        if(utflen > 65535)
        {
            throw new Exception("Encoded string is too long: " + utflen + " bytes");
        }

        // Two bytes for the length prefix plus utflen bytes of payload.
        byte[] bytearr = new byte[utflen + 2];

        bytearr[count++] = (byte) (((uint)utflen >> 8) & 0xFF);
        bytearr[count++] = (byte) (((uint)utflen >> 0) & 0xFF);

        int x = 0;
        for(x = 0; x < strlen; x++) 
        {
            c = str[x];
            if (!((c >= 0x0001) && (c <= 0x007F))) break;
            bytearr[count++] = (byte)c;
        }

        for(;x < strlen; x++)
        {
            c = str[x];
            if ((c >= 0x0001) && (c <= 0x007F)) 
            {
                bytearr[count++] = (byte)c;
            }
            else if (c > 0x07FF)
            {
                bytearr[count++] = (byte) (0xE0 | ((c >> 12) & 0x0F));
                bytearr[count++] = (byte) (0x80 | ((c >>  6) & 0x3F));
                bytearr[count++] = (byte) (0x80 | ((c >>  0) & 0x3F));
            }
            else
            {
                bytearr[count++] = (byte) (0xC0 | ((c >>  6) & 0x1F));
                bytearr[count++] = (byte) (0x80 | ((c >>  0) & 0x3F));
            }
        }
        ClientOutput.Write (bytearr, 0, utflen+2);
        return utflen + 2;
    }

Hopefully I've provided enough information to get some help with reading the UTF values. This is a real roadblock for my project.

If I understand the question correctly (you say there's a roadblock but don't explain what exactly it is), you are trying to implement in C# the code that reads and writes text on the stream. If so, then explicitly handling the text encoding yourself is insane, though I know this isn't immediately obvious if you're new to .NET.

BinaryReader and BinaryWriter have methods to handle this. When you create them, you can pass an Encoding instance (e.g. System.Text.Encoding.UTF8 or System.Text.Encoding.Unicode), which is used to interpret or create the binary data for text. You can use BinaryReader.ReadChars(int) to read text and BinaryWriter.Write(char[]) to write it.
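As a rough illustration of that approach, the sketch below round-trips a string through a MemoryStream with an explicit Encoding.UTF8. The class and method names (ReadCharsDemo, RoundTrip) are invented for the example. Note that Write(char[]) emits no length prefix and ReadChars(int) counts characters rather than bytes, so by itself this does not match the length-prefixed format that Java's writeUTF produces.

```csharp
using System;
using System.IO;
using System.Text;

static class ReadCharsDemo
{
    // Writes text with BinaryWriter.Write(char[]) and reads it back
    // with BinaryReader.ReadChars(int), both using UTF-8.
    public static string RoundTrip(string text)
    {
        var stream = new MemoryStream();

        // leaveOpen: true keeps the MemoryStream usable after the writer is disposed.
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
        {
            writer.Write(text.ToCharArray()); // encoded chars only, no length prefix
        }

        stream.Position = 0;
        using (var reader = new BinaryReader(stream, Encoding.UTF8))
        {
            // The argument is a character count, not a byte count.
            return new string(reader.ReadChars(text.Length));
        }
    }

    static void Main()
    {
        Console.WriteLine(RoundTrip("h\u00E9llo") == "h\u00E9llo"); // True
    }
}
```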

If for some reason that doesn't work, you can at the very least use an Encoding instance directly. Encoding.GetString(byte[]) converts binary to text, and Encoding.GetBytes(string) converts text to binary. Again, use the specific Encoding instance that matches the text encoding you're actually dealing with.
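Applied to the stream in the question, that could look something like the following sketch. It assumes the wire format is a 2-byte big-endian length followed by that many payload bytes, and it decodes with standard UTF-8. Java's encoding differs for U+0000 and for characters outside the Basic Multilingual Plane, so strings containing those would not decode correctly this way. JavaUtfReader and ReadJavaUtf are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;

static class JavaUtfReader
{
    // Reads one length-prefixed string: a 2-byte big-endian length,
    // then that many encoded bytes, decoded here as standard UTF-8.
    public static string ReadJavaUtf(BinaryReader reader)
    {
        // ReadByte returns raw bytes and throws EndOfStreamException at end of stream.
        int hi = reader.ReadByte();
        int lo = reader.ReadByte();
        int utflen = (hi << 8) | lo;

        // ReadBytes may return fewer bytes than requested if the stream ends early.
        byte[] payload = reader.ReadBytes(utflen);
        if (payload.Length != utflen)
            throw new EndOfStreamException();

        return Encoding.UTF8.GetString(payload);
    }

    static void Main()
    {
        // Length 6, then the UTF-8 bytes of "héllo" (é is 0xC3 0xA9).
        byte[] wire = { 0x00, 0x06, 0x68, 0xC3, 0xA9, 0x6C, 0x6C, 0x6F };
        using (var reader = new BinaryReader(new MemoryStream(wire)))
        {
            Console.WriteLine(ReadJavaUtf(reader) == "h\u00E9llo"); // True
        }
    }
}
```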

I have written a C# conversion of Java's DataInputStream and DataOutputStream; you can get it here:

https://bitbucket.org/CTucker1327/c-datastreams/src

To construct these classes, pass a BinaryWriter or BinaryReader to the constructor.

To construct a DataOutputStream:

DataOutputStream output = new DataOutputStream(new BinaryWriter(stream));

To construct a DataInputStream:

DataInputStream input = new DataInputStream(new BinaryReader(stream));
