Why do I get extra '^@' characters when writing chars to a large file with ByteBuffer in Java?

Question

I am trying to write chars to a file, and I am not sure why ^@ is being written:

^@1^@:^@1^@ ^@2^@ ^@3^@ ^@3^@0^@4^@

This is the expected output:

1:1 2 3 3 0 4

Interestingly, I do not see this strange behavior for smaller output files (a few hundred lines long).

I only notice it when the output reaches 100000+ lines.

Here is my code snippet:

final static int charByteSize = 2; // 1 char = 2 bytes

static void writeTofile(FileChannel fc, ResultClass result) throws IOException {

    int key = result.getKey();
    List<Integer> values = result.getValues();
    StringBuilder sb = new StringBuilder();
    sb.append(key + ":");
    for (int value : values) {
        sb.append(value + " "); // space-delimited value list
    }

    String stringToWrite = sb.toString().trim() + "\n"; // add newline char at the end
    char[] arrToWrite = stringToWrite.toCharArray();

    ByteBuffer buf = ByteBuffer.allocate(arrToWrite.length * charByteSize);

    for (char theChar : arrToWrite) {
        buf.putChar(theChar);
    }

    buf.flip();
    fc.write(buf);

}

Here is pseudocode for the calling function, in case it helps:

public static void main(String args[]) throws IOException
{
    RandomAccessFile bfc = new RandomAccessFile(theFile, "rw");
    FileChannel fc = bfc.getChannel();

    for() // run this loop 100000+ times
    {
        ResultClass result = getResultAfterSomeComplexCalculation();
        writeTofile(fc, result);
    }

    fc.close();
    bfc.close();
}

Answer 1

// 1 char =2 bytes

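No, it isn't! Storage-wise it is true, but in every other respect it is wrong. A char is only the basic storage unit for characters in Java; more precisely, it is a UTF-16 code unit. Note that supplementary Unicode characters (U+10000 and above) require two chars. Because putChar writes both bytes of each code unit, every ASCII character in your output is preceded by a zero byte, which many editors display as ^@.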
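To see where the zero bytes come from, here is a minimal sketch that prints what ByteBuffer.putChar emits for a single ASCII character and compares it with a UTF-8 encoding. The class name CharBytesDemo and the helper putCharBytes are made up for this illustration; they are not part of the question's code.

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class CharBytesDemo {
    // Hypothetical helper: returns the two bytes putChar writes for one char.
    // A ByteBuffer is big-endian by default, so the high byte comes first.
    static byte[] putCharBytes(char c) {
        ByteBuffer buf = ByteBuffer.allocate(2);
        buf.putChar(c);
        return buf.array();
    }

    public static void main(String[] args) {
        // For '1' (U+0031) the high byte is 0x00, the NUL shown as ^@.
        byte[] raw = putCharBytes('1');
        System.out.printf("putChar('1') -> 0x%02X 0x%02X%n", raw[0], raw[1]);

        // Encoded as UTF-8, the same character is a single byte.
        byte[] utf8 = "1".getBytes(StandardCharsets.UTF_8);
        System.out.println("UTF-8 byte count: " + utf8.length);

        // A supplementary character (here U+1F600) occupies two chars
        // but is a single code point.
        String supplementary = new String(Character.toChars(0x1F600));
        System.out.println("chars: " + supplementary.length());
        System.out.println("code points: "
                + supplementary.codePointCount(0, supplementary.length()));
    }
}

The NUL in front of the '1' matches the ^@1 pattern in the question's output.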

What you store in a file is not chars but bytes. This means you first need to encode the string into a byte array; for example:

final byte[] array = theString.getBytes("UTF-8");

Then write those bytes to the output file.
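Applied to the question's FileChannel code, that advice might look like the sketch below. It assumes UTF-8 is the desired file encoding, and the names Utf8LineWriter, formatLine, encode and writeLine are invented for this example. It uses StandardCharsets.UTF_8 rather than the "UTF-8" string so that no UnsupportedEncodingException has to be handled.

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

class Utf8LineWriter {
    // Builds "key:v1 v2 v3\n", the line format used in the question.
    static String formatLine(int key, List<Integer> values) {
        StringBuilder sb = new StringBuilder();
        sb.append(key).append(':');
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(values.get(i));
        }
        return sb.append('\n').toString();
    }

    // Encodes the line as UTF-8 instead of copying raw UTF-16 code units.
    static ByteBuffer encode(String line) {
        return ByteBuffer.wrap(line.getBytes(StandardCharsets.UTF_8));
    }

    // FileChannel.write may write fewer bytes than remain, so loop until done.
    static void writeLine(FileChannel fc, String line) throws IOException {
        ByteBuffer buf = encode(line);
        while (buf.hasRemaining()) {
            fc.write(buf);
        }
    }

    public static void main(String[] args) throws IOException {
        // A temp file stands in for the question's output file.
        Path out = Files.createTempFile("result", ".txt");
        try (FileChannel fc = FileChannel.open(out, StandardOpenOption.WRITE)) {
            writeLine(fc, formatLine(1, Arrays.asList(1, 2, 3, 3, 0, 4)));
        }
        // Read the file back to show that it contains no NUL bytes.
        System.out.print(new String(Files.readAllBytes(out), StandardCharsets.UTF_8));
    }
}

ByteBuffer.wrap sizes the buffer from the encoded byte array, so the charByteSize constant is no longer needed.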

Answer 1: accepted, 2014-04-17 21:39:03
