RandomAccessFile reading Cyrillic UTF-8 java

Hi, folks!

I can't read Cyrillic text from a file using RandomAccessFile.

Here is a simple program that writes information (Cyrillic words) to a file in the following format:

keyLength, valueLength, key, value

The program then tries to read this information back, but my output is incorrect:

writing success
keyLength = 10, valueLength = 4
read: килло, гр

UPD: expected output:

writing success
keyLength = 10, valueLength = 4
read: киллограмм, сала

What is the problem? (Apart from my small brain.)

import java.io.FileNotFoundException;
import java.io.RandomAccessFile;
import java.io.IOException;

public class Main {

    public static void main(String[] args) {
        String fileName = "file.db";
        RandomAccessFile outputFile = null;

        try {
            outputFile = new RandomAccessFile(fileName, "rw");
        } catch (FileNotFoundException e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }

        String key = "киллограмм";
        String value = "сала";

        try {
            outputFile.writeInt(key.length());
            outputFile.writeInt(value.length());

            outputFile.write(key.getBytes("UTF-8"));
            outputFile.write(value.getBytes("UTF-8"));
        } catch (IOException e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }

        System.out.println("writing success");

        RandomAccessFile inputFile = null;

        try {
            inputFile = new RandomAccessFile(fileName, "r");
        } catch (FileNotFoundException e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }

        int keyLength = 0, valueLength = 0;

        try {
            keyLength = inputFile.readInt();
            valueLength = inputFile.readInt();
        } catch (IOException e) {
            System.err.println(e.getMessage());
        }

        System.out.println("keyLength = " + keyLength + ", valueLength = " + valueLength);
        if (keyLength <= 0 || valueLength <= 0) {
            System.err.println("key or value length is not positive");
            System.exit(1);
        }

        byte[] keyBytes = null, valueBytes = null;

        try {
            keyBytes = new byte[keyLength];
            valueBytes = new byte[valueLength];
        } catch (OutOfMemoryError e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }

        try {
            inputFile.read(keyBytes);
            inputFile.read(valueBytes);
        } catch (IOException e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }

        try {
            System.out.println("read: " + new String(keyBytes, "UTF-8") + ", " + new String(valueBytes, "UTF-8"));
        } catch (IOException e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }

    }
}

The problem is this line:

outputFile.writeInt(key.length());

String#length()

Returns the length of this string. The length is equal to the number of Unicode code units in the string.

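In this case it returns 10, which is not the number of bytes needed to represent this String. In UTF-8 each Cyrillic letter takes two bytes, so the key occupies 20 bytes and the value 8. The reader allocates only 10 and 4 bytes, so it gets the first five letters of the key ("килло") and then the next two letters of the key ("гр") in place of the value.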
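To make the difference concrete, here is a minimal sketch comparing the two counts for the strings from the question. The class name LengthDemo and the helper utf8Length are made up for illustration, and the sketch assumes the source file is compiled as UTF-8.

```java
import java.nio.charset.StandardCharsets;

public class LengthDemo {

    // Hypothetical helper: number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String key = "киллограмм";
        String value = "сала";

        // length() counts UTF-16 code units, not encoded bytes.
        System.out.println(key.length() + " chars, " + utf8Length(key) + " bytes");
        System.out.println(value.length() + " chars, " + utf8Length(value) + " bytes");
    }
}
```

With UTF-8 source encoding this prints "10 chars, 20 bytes" and "4 chars, 8 bytes", which matches the truncated output in the question.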

What you want is

key.getBytes("UTF-8").length

used as

byte[] keyBytes = key.getBytes("UTF-8");
outputFile.writeInt(keyBytes.length);

The same goes for value.
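Putting this together, a round trip might look like the sketch below. It is one possible arrangement, not the asker's final code: the class RecordDemo and the helpers writeRecord and readRecord are hypothetical names, and it swaps read(byte[]) for readFully(byte[]), since read may return fewer bytes than requested.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

public class RecordDemo {

    // Writes one record as: keyByteLength, valueByteLength, keyBytes, valueBytes.
    static void writeRecord(RandomAccessFile file, String key, String value) throws IOException {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        byte[] valueBytes = value.getBytes(StandardCharsets.UTF_8);
        file.writeInt(keyBytes.length);   // byte count, not key.length()
        file.writeInt(valueBytes.length);
        file.write(keyBytes);
        file.write(valueBytes);
    }

    // Reads one record from the current position and returns {key, value}.
    static String[] readRecord(RandomAccessFile file) throws IOException {
        int keyLength = file.readInt();
        int valueLength = file.readInt();
        byte[] keyBytes = new byte[keyLength];
        byte[] valueBytes = new byte[valueLength];
        // readFully either fills the whole array or throws EOFException.
        file.readFully(keyBytes);
        file.readFully(valueBytes);
        return new String[] {
            new String(keyBytes, StandardCharsets.UTF_8),
            new String(valueBytes, StandardCharsets.UTF_8)
        };
    }

    public static void main(String[] args) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile("file.db", "rw")) {
            file.setLength(0);            // start from an empty file
            writeRecord(file, "киллограмм", "сала");
            file.seek(0);                 // rewind before reading back
            String[] record = readRecord(file);
            System.out.println("read: " + record[0] + ", " + record[1]);
        }
    }
}
```

Using the StandardCharsets.UTF_8 constant instead of the "UTF-8" string also avoids the checked UnsupportedEncodingException.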
