Data type to write ASCII 0-255 in Java (ServletOutputStream)

I have an array "result" that contains values from 0-255. I originally declared it as byte[], but when I have to write 128, result[i] gets the value -128, and in the output file it is written as "€", which is finally read back as 8364.

Since byte only accepts values from -128 to 127, what data type should I use for values from 0-255 (without wasting memory)?

Should I also change the Content-Type or add a charset header?

Thanks

res.setContentType("application/octet-stream"); 
res.setHeader("Content-Disposition","attachment;filename=output.js");
ServletOutputStream os = res.getOutputStream();
byte[] result=encode(req.getParameter("originalScript")); // Result[i]=-128 (should be 128)
os.write(result,0,result.length); // result[i] on output.js is written as "€" (8364)

You're confused by mixing several concepts.

First of all, the int 128 and the byte -128 share the same low eight bits (int 255 == byte -1, 254 == -2, ..., 128 == -128). Bytes in Java are signed, and the sign information is in the highest bit. Your mistake here is that you didn't use the correct way to convert the byte value back to an int. To fix this, use this code:

byte b = (byte) 128;     // stored as the bit pattern 0x80
int i = b & 0xff;        // mask off the sign extension to get 0..255
System.out.println(b);
System.out.println(i);

gives -128 and 128.
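
On the memory question: a byte[] already stores each value from 0 to 255 in a single byte, and only the interpretation on read is signed. The following is a minimal sketch of that idea; the class and method names (UnsignedBytes, toUnsigned) are hypothetical and chosen only for illustration.

```java
class UnsignedBytes {
    // Widen a signed byte to its unsigned value in the range 0..255.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte[] result = { (byte) 0, (byte) 127, (byte) 128, (byte) 255 };
        for (byte b : result) {
            // Byte.toUnsignedInt (Java 8+) is the library equivalent of b & 0xFF.
            System.out.println(b + " -> " + toUnsigned(b) + " / " + Byte.toUnsignedInt(b));
        }
    }
}
```

The array itself never needs a wider element type; the mask is applied only at the point where a value is read as a number.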

Next: ASCII is only defined for values between 0 and 127. This means anything > 127 is garbage unless you treat it carefully.

The problem is when you read the output of your code. Since ASCII can't contain values > 127, what should the reading code do?

"output.js" sounds like you're using a web browser to read this data as a JavaScript file. The web browser will try to convert the byte stream into text using an "encoding". Since you don't specify one, the browser has to guess, and it gets it wrong. In Windows-1252, for example, the byte 0x80 is "€" (U+20AC, decimal 8364), which would explain the value you read back. application/octet-stream seems wrong, too; shouldn't that be text/javascript?

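How much the reader's choice of encoding matters can be seen by decoding the same byte three ways. This is only a sketch: the class name DecodeGuess and its decode helper are hypothetical, and it assumes the reader fell back to Windows-1252, which the question does not confirm.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeGuess {
    // Decode a single byte with the given charset and return the resulting code point.
    static int decode(byte b, Charset cs) {
        return new String(new byte[] { b }, cs).codePointAt(0);
    }

    public static void main(String[] args) {
        byte b = (byte) 128; // 0x80, the byte from the question
        // Assumption: the reader guessed Windows-1252, where 0x80 is the euro sign.
        System.out.println(decode(b, Charset.forName("windows-1252"))); // 8364
        // ISO-8859-1 maps every byte to the code point with the same number.
        System.out.println(decode(b, StandardCharsets.ISO_8859_1));     // 128
        // US-ASCII has no character for 0x80, so the decoder substitutes U+FFFD.
        System.out.println(decode(b, StandardCharsets.US_ASCII));       // 65533
    }
}
```
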
You have two options:

  1. Change encode() to return a properly encoded UTF-8 string (UTF-8 is a way to send Unicode as bytes) and set the charset to UTF-8 (which is usually the default, but better safe than sorry):

     response.setHeader("Content-Type", "text/javascript; charset=UTF-8"); 
  2. Set the charset to ISO-8859-1, which will preserve the bytes 1:1. This will fail if your script contains any Unicode characters > 255. Since the failure is silent (there won't be an error), you should not use this approach. I just mention it for completeness.
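
The difference between the two options can be sketched without a servlet container. The class ScriptEncoding and its helpers are hypothetical names; in a servlet, the resulting bytes would be passed to os.write(...) after setting the matching Content-Type header.

```java
import java.nio.charset.StandardCharsets;

class ScriptEncoding {
    // Option 1: UTF-8 can represent every Unicode character.
    static byte[] toUtf8(String script) {
        return script.getBytes(StandardCharsets.UTF_8);
    }

    // Option 2: ISO-8859-1 only covers U+0000..U+00FF; anything else
    // is replaced with '?' and no exception is thrown.
    static byte[] toLatin1(String script) {
        return script.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String script = "var price = '\u20AC5';"; // contains the euro sign, U+20AC
        String viaUtf8 = new String(toUtf8(script), StandardCharsets.UTF_8);
        String viaLatin1 = new String(toLatin1(script), StandardCharsets.ISO_8859_1);
        System.out.println(viaUtf8.equals(script));   // true
        System.out.println(viaLatin1.equals(script)); // false: the euro sign became '?'
    }
}
```

The second result is the silent failure mentioned for option 2: the data is altered, but nothing signals it.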

It is hard to believe that your application has such strict memory requirements in this day and age.

Without questioning your motives any further, here is what you can do:

byte[] result = encode(req.getParameter("originalScript"));
char[] tmp = new char[result.length];
for (int i = 0; i != result.length; i++) {
    // Widen each signed byte to a char in the range 0..255
    tmp[i] = (char) (result[i] & 0xFF);
}
os.print(new String(tmp));
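
The widening loop above produces the same string as decoding the bytes with ISO-8859-1, since that charset maps each byte to the code point with the same number. A small sketch to check this, with the hypothetical class name Latin1Widening:

```java
import java.nio.charset.StandardCharsets;

class Latin1Widening {
    // The loop from the answer: widen each byte to a char in the range 0..255.
    static String widen(byte[] data) {
        char[] tmp = new char[data.length];
        for (int i = 0; i < data.length; i++) {
            tmp[i] = (char) (data[i] & 0xFF);
        }
        return new String(tmp);
    }

    public static void main(String[] args) {
        byte[] data = { 0, 65, (byte) 128, (byte) 255 };
        String viaLoop = widen(data);
        String viaCharset = new String(data, StandardCharsets.ISO_8859_1);
        System.out.println(viaLoop.equals(viaCharset)); // true
        System.out.println((int) viaLoop.charAt(2));    // 128
    }
}
```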
