
Sending unsigned bytes from C#, being received as signed bytes in Java

My program sends compressed data from C# to Java. In C#, the compressed data is returned as a byte[] by the Ionic.Zlib.GZipStream.CompressBuffer method of DotNetZip. This byte[] is used to construct a MemoryStream, which is sent from C#.

In Java, the stream is read with the GZIPInputStream.read method into another byte[], which is finally used to reconstruct the original, uncompressed data (which is text for now, but will eventually become images). In my testing I found that for datasets containing very large numbers I was not getting all my data, which led me to learn about the difference between signed and unsigned types. From my Googling, everyone seems to recommend converting to int on the Java side so that the information can be retrieved correctly, but as far as I know there is no way to use an int[] to construct a suitable stream for GZIPInputStream. I tried (stupidly) to convert from byte to sbyte in C#, but MemoryStream does not accept an sbyte[] in its constructor.

What can I do? It seems I am SOL. I am actively researching this, but I figured I would ask on Stack as well.

For reference, here is an earlier post from me relating to the same project. I managed to work through the problems in that post: Sending gzipped data over a network from C# to Java

EDIT: To clarify, I mentioned that the data is just text, but that it's not working for very large numbers. The numbers are read in as text by C# from Access (the number fields are text fields), sent across as binary compressed data, and then reconstructed as text in Java.

Can you help clarify? You mention that the original, uncompressed data is text (for now), but later you say that it doesn't work for very large numbers. Without more context it's hard to help.

I think it would be worth looking into basic data types a little more. Having looked at your other question, I think that's what is confusing matters for you.

byte is the basic unit of currency for binary data. It's 8 bits and can have 256 distinct values. Virtually all functions you come across for dealing with binary data will deal with byte[] (e.g. gzip compress or uncompress).
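To make the signed/unsigned point concrete: C#'s byte ranges over 0..255 and Java's byte over -128..127, but the 8 bits on the wire are the same, so gzip does not care which way they are displayed. Here is a minimal sketch of reading a Java byte as its unsigned value. The class and method names are made up for illustration.

```java
class UnsignedByteDemo {
    // Reinterpret a signed Java byte as the unsigned value (0..255)
    // that C# would display for the same 8 bits.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte fromWire = (byte) 0xC8; // C# would show this byte as 200

        System.out.println(fromWire);                      // -56
        System.out.println(toUnsigned(fromWire));          // 200
        System.out.println(Byte.toUnsignedInt(fromWire));  // 200 (Java 8+)
    }
}
```

The conversion only matters when you do arithmetic on or print individual bytes. Passing the byte[] straight to a stream needs no conversion at all.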

char is a single text character. In Java and C# it is two bytes and holds one UTF-16 code unit, which covers every character in the range U+0000 to U+FFFF (basically a letter in every alphabet you can think of). Characters outside that range take two chars.
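As a rough illustration of why "one char" and "one byte" are not interchangeable, the sketch below counts the same strings three ways. The helper names are hypothetical, and UTF-8 is just one possible encoding chosen for the comparison.

```java
import java.nio.charset.StandardCharsets;

class CharWidthDemo {
    // Number of 16-bit char values (UTF-16 code units) in the string.
    static int utf16Units(String s) {
        return s.length();
    }

    // Number of Unicode code points, i.e. what a reader would call characters.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Number of bytes once the string is encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String[] samples = { "A", "\u00e9", "\uD83D\uDE00" }; // A, e-acute, an emoji
        for (String s : samples) {
            System.out.println(utf16Units(s) + " chars, "
                    + codePoints(s) + " code point(s), "
                    + utf8Bytes(s) + " UTF-8 byte(s)");
        }
    }
}
```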

If you want to write a string to a binary file (to zip it or send it across a network), you need to choose how to encode that string. This article has some good background on string encoding: http://www.joelonsoftware.com/articles/Unicode.html. Your other code example glosses over some of that detail, but I think it's worth being explicit, and it will help to clarify your understanding.
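Being explicit about the encoding might look like the sketch below. It assumes both sides agree on UTF-8, which the question does not state; the C# sender would have to use the same encoding. The compress method only stands in for the C# side so the example runs by itself, and the class and method names are made up. Note that the read loop keeps going until read returns -1, since a single read call is not guaranteed to return all the data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipTextRoundTrip {
    // Stand-in for the sender: encode the text as UTF-8 (an assumption), then gzip it.
    static byte[] compress(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Receiver: gunzip everything, then decode with the same charset.
    static String decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                     new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            // read() may return fewer bytes than requested, so loop until end of stream.
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "12345678901234567890";
        System.out.println(original.equals(decompress(compress(original)))); // true
    }
}
```

Nothing here converts between signed and unsigned bytes: the byte[] goes into the stream unchanged, and only the final decoding step turns bytes back into text.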

Finally, an int is 4 bytes (or 32 bits) in both languages, but the order in which those bytes are written is, again, a choice (endianness).
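For a feel of what that choice means, here is a small sketch using java.nio.ByteBuffer, which lets you state the byte order explicitly. The class and method names are hypothetical. As far as I know, C#'s BitConverter uses the machine's native order (little-endian on typical x86 hardware), while Java's ByteBuffer and DataInputStream default to big-endian, so the two sides need to agree on one.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

class EndianDemo {
    // Write a 32-bit int as 4 bytes in the given byte order.
    static byte[] encode(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    // Read 4 bytes back into an int, interpreting them in the given byte order.
    static int decode(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    public static void main(String[] args) {
        int value = 0x01020304;
        byte[] big = encode(value, ByteOrder.BIG_ENDIAN);
        byte[] little = encode(value, ByteOrder.LITTLE_ENDIAN);

        System.out.println(Arrays.toString(big));    // [1, 2, 3, 4]
        System.out.println(Arrays.toString(little)); // [4, 3, 2, 1]

        // Decoding with the wrong order silently yields a different number.
        System.out.println(Integer.toHexString(decode(little, ByteOrder.BIG_ENDIAN))); // 4030201
    }
}
```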

Hope that helps.
