Java StringBuilder / String is '<Unreadable>'

Question

I have a byte array bytes containing UTF-8 encoded strings, which I want to convert to a String. bytes.length is about 130000.

String str = new String(bytes, StandardCharsets.UTF_8); should do the job. However, str gets the value '<Unreadable>'.

Converting the bytes line by line and printing them works nicely. However, appending the lines to a StringBuilder fails as well: again, the content of the StringBuilder r is '<Unreadable>'. So I thought there might be an unreadable byte in the array. But r.substring(60000, r.count) works well, and so does r.substring(1, 60000). Is there a problem with the size of the byte array? The maximum size of a String/StringBuilder is 2^31 - 1 characters, so there should be no problem.

      // Wrap the byte array in a reader that decodes it explicitly as UTF-8
      ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
      InputStreamReader reader = new InputStreamReader(bais, StandardCharsets.UTF_8);
      BufferedReader in = new BufferedReader(reader);
      // String readBuf = in.lines().collect(Collectors.joining()); gives '<Unreadable>'
      String line;
      StringBuilder r = new StringBuilder();
      // readLine() strips the line terminator, so the lines are joined without separators
      while ((line = in.readLine()) != null) {
          System.out.println(line); // works fine
          r.append(line);
      }

After the loop, r.toString() is '<Unreadable>'. Any ideas why I cannot convert the byte array to a String/StringBuilder?

Answer 1

I had exactly the same problem. The String was "<Unreadable>" whenever it was built from a file bigger than 50000 bytes. But String.length() reported a length of more than 50000! I found that the text "<Unreadable>" was produced by the IDE (NetBeans 11) in its debugging tools. So the String was OK, but NetBeans didn't show the right content. It showed "<Unreadable>" instead.
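One way to confirm that the String itself is fine, without relying on the debugger's variable view, is to check it in code. The following is only a sketch under assumptions: the class name RoundTripCheck, the helper buildSample and the generated sample text are hypothetical and stand in for the real data. It decodes a byte array of roughly the size mentioned in the question and checks that re-encoding gives back the same bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RoundTripCheck {

    // Hypothetical helper: builds at least `size` bytes of valid UTF-8 text made of short lines.
    static byte[] buildSample(int size) {
        StringBuilder sb = new StringBuilder();
        while (sb.length() < size) {
            // \u00e4\u00f6\u00fc are non-ASCII letters, so some characters need two bytes
            sb.append("line ").append(sb.length()).append(" \u00e4\u00f6\u00fc\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Returns true if decoding as UTF-8 and re-encoding yields the identical byte sequence.
    // Malformed input is replaced during decoding, so it does not round-trip.
    static boolean roundTrips(byte[] bytes) {
        String str = new String(bytes, StandardCharsets.UTF_8);
        return Arrays.equals(bytes, str.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] bytes = buildSample(130_000);
        String str = new String(bytes, StandardCharsets.UTF_8);

        // Print facts about the string instead of inspecting it in the debugger
        System.out.println("bytes: " + bytes.length + ", chars: " + str.length());
        System.out.println("round trip ok: " + roundTrips(bytes));
        System.out.println("first 40 chars: " + str.substring(0, 40));
    }
}
```

If the round trip succeeds and the printed length is plausible, the conversion worked and the "<Unreadable>" text is only a display issue in the tool that shows the variable.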

Answer 2

I tried to reproduce this behavior, and I have not been able to. We need a proper minimal reproducible example to make any real progress on this, along with full details of the Java version and vendor, and any other tools that may be implicated.

However, I do have one definite thing to report. I have copies of the OpenJDK source code for Java 6, 7, 8, 11 and 17 in a searchable form. When I search the source code for Unreadable, NONE of the hits I get are relevant. (Indeed, they are all in the respective test trees!) This is very odd.

My tentative conclusion is that this <Unreadable> string you are seeing is NOT coming from OpenJDK / Oracle Java. Either you are using a different vendor's Java, or it is coming from a tool such as your IDE.
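For illustration, a minimal reproducible example along these lines might look like the sketch below. The class name UnreadableRepro, the method joinLines and the generated input are assumptions made for the example, not the asker's actual code or data. It prints the Java vendor and version, runs the same line-by-line loop as in the question, and reports whether the resulting String really contains the text <Unreadable>.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class UnreadableRepro {

    // Joins all lines of UTF-8 input, dropping line terminators, like the loop in the question.
    static String joinLines(byte[] bytes) {
        StringBuilder r = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                r.append(line);
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked to keep callers simple
            throw new UncheckedIOException(e);
        }
        return r.toString();
    }

    public static void main(String[] args) {
        // Report the runtime, since vendor and version matter for the diagnosis
        System.out.println(System.getProperty("java.vendor") + " "
                + System.getProperty("java.version"));

        // Placeholder input (an assumption): replace with the real byte array
        StringBuilder src = new StringBuilder();
        for (int i = 0; i < 20_000; i++) {
            src.append("row ").append(i).append('\n');
        }
        byte[] bytes = src.toString().getBytes(StandardCharsets.UTF_8);

        String joined = joinLines(bytes);
        System.out.println("length: " + joined.length());
        System.out.println("contains <Unreadable>: " + joined.contains("<Unreadable>"));
    }
}
```

Running this from the command line, outside any IDE, helps separate what the JVM produces from what a debugger chooses to display.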
