
Equivalent GetBytes function in Java like C#

I have a problem converting a string to bytes in Java while porting my C# library. The string is converted, but the resulting byte array does not look the same.

I use this code in C#:

string input = "Test ěščřžýáíé 1234";
Encoding encoding = Encoding.UTF8;
byte[] data = encoding.GetBytes(input);

And this code in Java:

String input = "Test ěščřžýáíé 1234";
String encoding = "UTF8";
byte[] data = input.getBytes(encoding);

The left one is the Java output and the right one is the C# output. How can I make the Java output the same as the C# one?

[image: screenshot of the two outputs]

In all likelihood, the byte arrays are the same. However, if you're formatting them to a string representation (e.g. to view them in a debugger), they will appear different, since the byte data type is unsigned in C# (values 0 to 255) but signed in Java (values -128 to 127). Refer to this question and my answer for an explanation.

Edit: Based on this answer, you can print unsigned values in Java using:

byte b = -60;
System.out.println((short)(b & 0xFF));   // output: 196
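
To check this on the whole array rather than a single byte, something like the following sketch may help. The class name UnsignedByteDump and the helper toUnsigned are made up for illustration; Byte.toUnsignedInt requires Java 8 or later and is equivalent to b & 0xFF.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, not part of the original question or answers.
class UnsignedByteDump {
    // Maps each signed Java byte (-128 to 127) to its unsigned value (0 to 255).
    static int[] toUnsigned(byte[] data) {
        int[] result = new int[data.length];
        for (int i = 0; i < data.length; i++) {
            result[i] = Byte.toUnsignedInt(data[i]); // same as data[i] & 0xFF
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "Test ěščřžýáíé 1234";
        byte[] data = input.getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.toString(data));             // signed view, as Java shows it
        System.out.println(Arrays.toString(toUnsigned(data))); // unsigned view, as C# shows it
    }
}
```

The second printed line should be directly comparable with the values C# displays for the same string.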

These arrays are very probably the same.

You have been hit by a big difference between C# and Java: in Java, byte is signed, whereas in C# it is unsigned.

To dump the array, try this:

public void dumpBytesToStdout(final byte[] array)
{
    for (final byte b: array)
        System.out.printf("%02X\n", b);
}

Then write an equivalent dump method in C# (no idea how, I don't do C#).

Alternatively, if your dump function involves integer types larger than byte, for instance an int, do:

i & 0xff

to remove the sign bits. Note that if you cast the byte -1, which reads:

1111 1111

to an int, this will NOT give:

0000 0000 0000 0000 0000 0000 1111 1111

but:

1111 1111 1111 1111 1111 1111 1111 1111

i.e., the sign bit is "carried" into the upper bits (sign extension); otherwise, the cast would yield the int value 255, which is not -1.
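
A small sketch to see both bit patterns for yourself. The class name SignExtensionDemo and the helper bits are hypothetical names chosen for this example; Integer.toBinaryString is the standard library call.

```java
// Hypothetical demo class, not part of the original answer.
class SignExtensionDemo {
    // Formats all 32 bits of an int, left-padded with zeros.
    static String bits(int value) {
        String s = Integer.toBinaryString(value);
        return String.format("%32s", s).replace(' ', '0');
    }

    public static void main(String[] args) {
        byte b = -1;
        int widened = b;        // implicit widening copies the sign bit into the upper 24 bits
        int masked = b & 0xFF;  // the mask keeps only the low 8 bits

        System.out.println(bits(widened) + " = " + widened);
        System.out.println(bits(masked) + " = " + masked);
    }
}
```

The first line shows the sign-extended value and the second the masked one, which is why the mask is needed whenever a byte is widened before printing.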
