Reading UTF-8 characters using Scanner

Question

public boolean isValid(String username, String password) {
    boolean valid = false;
    DataInputStream file = null;

    try {
        Scanner files = new Scanner(new BufferedReader(new FileReader("files/students.txt")));

        while (files.hasNext()) {
            System.out.println(files.next());
        }

    } catch (Exception e) {
        e.printStackTrace();
    }
    return valid;
}

When I read a file that was written in UTF-8 (by another Java program), why does the output show strange symbols in front of the string names?

I wrote it using this

    private static void  addAccount(String username,String password){
        File file = new File(file_name);
        try{
            DataOutputStream dos = new DataOutputStream(new FileOutputStream(file,true));
            dos.writeUTF((username+"::"+password+"\n"));
        }catch(Exception e){

        }
    }

Answer 1

Here is a simple way:

File words = new File(path);
Scanner s = new Scanner(words,"utf-8");

Answer 2

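From the FileReader Javadoc:

Convenience class for reading character files. The constructors of this class assume that the default character encoding and the default byte-buffer size are appropriate. To specify these values yourself, construct an InputStreamReader on a FileInputStream.

So, probably something like new InputStreamReader(new FileInputStream(file), "UTF-8")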
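A minimal sketch of how that reader could be handed to a Scanner. The class name Utf8ScannerExample, the helper readTokens, and the sample data are hypothetical and used only for illustration. Note that this only helps when the file is plain UTF-8 text; it does not remove the length bytes that writeUTF adds.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class Utf8ScannerExample {

    // Hypothetical helper: returns the whitespace-separated tokens of a file,
    // decoding its bytes as UTF-8 instead of the platform default charset.
    static List<String> readTokens(String path) throws IOException {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(new BufferedReader(
                new InputStreamReader(new FileInputStream(path), StandardCharsets.UTF_8)))) {
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data in a temporary file so the sketch runs on its own.
        Path sample = Files.createTempFile("students", ".txt");
        Files.write(sample, "alice::p\u00e4ssword\nbob::secret\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readTokens(sample.toString()));
        Files.delete(sample);
    }
}
```

Passing StandardCharsets.UTF_8 rather than the string "UTF-8" avoids having to handle UnsupportedEncodingException.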

Answer 3

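With DataOutput.writeUTF / DataInput.readUTF, the first 2 bytes form an unsigned 16-bit big-endian integer denoting the size of the string.

First, two bytes are read and used to construct an unsigned 16-bit integer in exactly the manner of the readUnsignedShort method. This integer value is called the UTF length and specifies the number of additional bytes to be read. These bytes are then converted to characters by considering them in groups. The length of each group is computed from the value of the first byte of the group. The byte following a group, if any, is the first byte of the next group.

These bytes are likely the cause of your problem. You would need to skip the first 2 bytes and then tell your Scanner to use UTF-8 in order to read the file properly.

That being said, I don't see any reason to use DataOutput / DataInput here. You can just use FileReader and FileWriter instead. These will use the default system encoding.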
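Because every writeUTF call writes its own 2-byte length prefix (and the payload is Java's modified UTF-8), another option is to read the file back with the matching DataInputStream.readUTF instead of skipping bytes by hand. The following is a sketch under that assumption; the class name ReadUtfRecords, the helper readRecords, and the sample records are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class ReadUtfRecords {

    // Hypothetical helper: reads every record that was written with writeUTF.
    // readUTF consumes the 2-byte length prefix and decodes the bytes that follow.
    static List<String> readRecords(String path) throws IOException {
        List<String> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(path)))) {
            while (true) {
                try {
                    // trim() drops the trailing "\n" the writer appended to each record
                    records.add(in.readUTF().trim());
                } catch (EOFException endOfFile) {
                    break; // no more records
                }
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data, written the same way as addAccount in the question.
        Path sample = Files.createTempFile("students", ".dat");
        try (DataOutputStream out = new DataOutputStream(
                new FileOutputStream(sample.toFile(), true))) {
            out.writeUTF("alice::secret\n");
            out.writeUTF("bob::hunter2\n");
        }
        System.out.println(readRecords(sample.toString()));
        Files.delete(sample);
    }
}
```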
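A sketch of the plain-text approach follows. It is a variation on the suggestion above: instead of FileWriter and FileReader with the default encoding, it names UTF-8 explicitly on both sides so the result does not depend on the platform. The class name PlainTextAccounts, the helper readAccounts, and the extra path parameter on addAccount are hypothetical.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class PlainTextAccounts {

    // Appends one "username::password" line as plain UTF-8 text (no length prefix).
    static void addAccount(String path, String username, String password) throws IOException {
        try (Writer out = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream(path, true), StandardCharsets.UTF_8))) {
            out.write(username + "::" + password + "\n");
        }
    }

    // Reads the file back line by line with a Scanner that decodes UTF-8.
    static List<String> readAccounts(String path) throws IOException {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(new File(path), "UTF-8")) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Made-up accounts in a temporary file so the sketch runs on its own.
        Path sample = Files.createTempFile("students", ".txt");
        addAccount(sample.toString(), "alice", "secret");
        addAccount(sample.toString(), "bob", "hunter2");
        System.out.println(readAccounts(sample.toString()));
        Files.delete(sample);
    }
}
```

The try-with-resources blocks also close the streams, which the addAccount method in the question never does.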

3 answers

Answer 1: 7, 2016-05-03 04:06:17
Answer 2: 0, 2012-08-15 03:07:53
Answer 3: 0, 2012-08-15 03:35:09
