
Creating a new String using a byte array is giving a strange result

I am reading a file using the readFully method of the RandomAccessFile class, but the results are not exactly what I expected.

This is the simple function which reads the file and returns a new String using the byte array where all the bytes are stored:

public String read(int start)
{
    setFilePointer(start);//Sets the file pointer

    byte[] bytes = new byte[(int) (_file.length() - start)];

    try
    {
        _randomStream.readFully(bytes);
    }
    catch(IOException e)
    {
        e.printStackTrace();
    }

    return new String(bytes);
}

In the main:

public static void main(String[] args)
{
    String newline = System.getProperty("line.separator");

    String filePath = "C:/users/userZ/Desktop/myFile.txt";
    RandomFileManager rfmanager = new RandomFileManager(filePath, FileOpeningMode.READ_WRITE);

    String content = rfmanager.read(10);

    System.out.println("\n"+content);

    rfmanager.closeFile();
}

This function is called in the constructor of RandomFileManager. It creates the file if it doesn't already exist.

private void setRandomFile(String filePath, String mode)
{
    try
    {
        _file = new File(filePath);

        if(!_file.exists())
        {

            _file.createNewFile();// Throws IOException
            System.out.printf("New file created.");
        }
        else System.out.printf("A file already exists with that name.");

        _randomStream = new RandomAccessFile(_file, mode);

    }
    catch(IOException e)
    {
        e.printStackTrace();
    }
}

I write to the file using this write method:

public void write(String text)
{
    //You can also write
    if(_mode == FileOpeningMode.READ_WRITE)
    {
        try
        {
            _randomStream.writeChars(text);
        }
        catch(IOException e)
        {
            e.printStackTrace();
        }
    }
    else System.out.printf("%s", "Warning!");
}

Output: (screenshot of the output, image not available)

I used the writeChars method.

This writes every character as UTF-16, which is unlikely to be the platform default encoding. If you decode the bytes using the UTF-16BE character encoding, you will get the characters back. writeChars writes two bytes per char, high byte first.

If you only need characters between (char) 0 and (char) 255, I suggest using the ISO-8859-1 encoding, as the file will be half the size.
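To make the size difference concrete, here is a minimal sketch comparing the two encodings for the same text. The class name Latin1SizeDemo and its helper methods are made up for illustration and are not part of the question's RandomFileManager.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original RandomFileManager.
public class Latin1SizeDemo {

    // Number of bytes needed when each char is stored as two bytes (as writeChars does).
    static int utf16Size(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    // Number of bytes needed when each char is stored as one ISO-8859-1 byte.
    static int latin1Size(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    // Encodes and decodes with the same charset, so the text survives unchanged
    // as long as every char is in the range 0..255.
    static String latin1RoundTrip(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // 4 chars, all below 256
        System.out.println("UTF-16BE bytes:   " + utf16Size(text));
        System.out.println("ISO-8859-1 bytes: " + latin1Size(text));
        System.out.println("Round trip:       " + latin1RoundTrip(text));
    }
}
```

With a RandomAccessFile, the equivalent of writeChars(text) would then be write(text.getBytes(StandardCharsets.ISO_8859_1)), and the read side would decode with the same charset. Characters above 255 cannot be represented in ISO-8859-1 and would be replaced during encoding.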

The problem is that you are not specifying a Charset and so the "platform default" is being used. This is almost always a bad idea. Instead, use this constructor: String(byte[], Charset) and be explicit about the encoding the file was written with. Given the output you are showing, it appears to be a two-byte encoding, likely UTF-16BE.
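As a rough illustration of that advice, the sketch below writes text with writeChars, reads the raw bytes back with readFully, and decodes them with an explicit charset. The class WriteCharsDemo, its method names, and the temporary file are assumptions made for the example. It is not the asker's RandomFileManager.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original RandomFileManager.
public class WriteCharsDemo {

    // Writes text with writeChars to a temporary file and returns the raw bytes.
    static byte[] rawBytes(String text) throws IOException {
        File file = File.createTempFile("writechars-demo", ".bin");
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.writeChars(text);               // two bytes per char, high byte first
            byte[] bytes = new byte[(int) raf.length()];
            raf.seek(0);                        // rewind before reading
            raf.readFully(bytes);
            return bytes;
        } finally {
            file.delete();                      // best-effort cleanup of the temp file
        }
    }

    // Decodes the bytes with the encoding they were actually written in.
    static String roundTrip(String text) throws IOException {
        return new String(rawBytes(text), StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = rawBytes("Hello");
        System.out.println("Bytes written: " + bytes.length);
        // Platform default charset: result depends on the machine and is likely garbled.
        System.out.println("Default charset: " + new String(bytes));
        // Explicit charset matching writeChars: recovers the original text.
        System.out.println("UTF-16BE: " + new String(bytes, StandardCharsets.UTF_16BE));
    }
}
```

In the question's read method, the corresponding change would be to return new String(bytes, StandardCharsets.UTF_16BE) instead of new String(bytes), assuming the whole file was written with writeChars.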

Short answer: bytes are not characters
