Java: reading files that contain strings in different languages

Question

I made a program that reads different text files and combines them into a .csv file. It's a .csv file with translations into English, Dutch, French, Italian, Portuguese and Spanish.

Now here is my problem:

In the end I get a nicely filled .csv file with all the translations together. I read the files as UTF-8, and all the languages are shown correctly except for French. Some characters are shown as question marks, like this: "Mis ? jour", when it should be "Mis à jour".

Here is the method that reads the files for the different languages and makes objects from them, so I can sort them and put them in the right spot in the .csv file.

The files are filled like this:

To Airport;A l'aéroport

Today;Aujourd'hui

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public static Language getTranslations(String inputFileName) {
    Language language = new Language();

    // try-with-resources closes the reader and the underlying stream automatically
    try (BufferedReader br = new BufferedReader(
            new InputStreamReader(new FileInputStream(inputFileName), StandardCharsets.UTF_8))) {
        String strLine;
        // Read the file line by line
        while ((strLine = br.readLine()) != null) {
            // Each line has the form "key;translation"
            String[] values = strLine.split(";");
            if (values.length == 2) {
                language.putTranslationItem(values[0], values[1]);
            }
        }
    } catch (IOException e) {
        // Also covers FileNotFoundException; report it instead of silently ignoring it
        e.printStackTrace();
    }

    return language;
}

I hope somebody can help out!

Thanks
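A possible cause, which the post does not confirm, is that the French file was saved in a single-byte encoding such as ISO-8859-1 rather than UTF-8. The sketch below simulates that case with an in-memory byte array instead of a real file; the class name CharsetMismatchDemo and the helper decodeAsUtf8 are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class CharsetMismatchDemo {
    // Decodes raw bytes as UTF-8, the same charset the reader in the question uses.
    static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumption: the file content was written as ISO-8859-1, where
        // "\u00E0" (a with grave accent) is the single byte 0xE0.
        byte[] latin1Bytes = "Mis \u00E0 jour".getBytes(StandardCharsets.ISO_8859_1);

        // 0xE0 followed by a space is not valid UTF-8, so the decoder
        // substitutes the replacement character U+FFFD.
        String wrong = decodeAsUtf8(latin1Bytes);

        // Decoding with the charset the bytes were written in restores the text.
        String right = new String(latin1Bytes, StandardCharsets.ISO_8859_1);

        System.out.println(wrong.indexOf('\uFFFD') >= 0);      // true
        System.out.println(right.equals("Mis \u00E0 jour"));   // true
    }
}
```

If the real file behaves like this, reading it with the charset it was actually saved in (or re-saving it as UTF-8) would be the thing to try. U+FFFD is often rendered as "?" once it is written out or displayed.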

Answer 1

I am not completely sure about this, but you can try converting the values[0] and values[1] strings into byte arrays

byte[] value_0_utfString = values[0].getBytes("UTF-8");
byte[] value_1_utfString = values[1].getBytes("UTF-8");

and then converting them back into strings

String str_0 = new String(value_0_utfString, "UTF-8");
String str_1 = new String(value_1_utfString, "UTF-8");

I'm not sure this is the right or most efficient way, but since a single line contains both English and French, I thought splitting and re-encoding might help. I haven't tried this myself.
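Since the answer is untested, here is a small sketch that tries the round trip on its own. The class name RoundTripDemo and the helper roundTrip are made up for illustration, and the "damaged" input is an assumption about what a wrongly decoded line might look like. Encoding a String to UTF-8 and decoding the same bytes as UTF-8 gives back an equal String, so this step by itself would not be expected to change a string that was already decoded incorrectly when the file was read.

```java
import java.nio.charset.StandardCharsets;

class RoundTripDemo {
    // Encodes to UTF-8 bytes and decodes them again, as the answer proposes.
    static String roundTrip(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A correctly decoded value.
        String good = "Mis \u00E0 jour";
        // Hypothetical value after a bad decode: the accent is already
        // replaced by U+FFFD, so the original character is lost.
        String damaged = "Mis \uFFFD jour";

        System.out.println(roundTrip(good).equals(good));       // true
        System.out.println(roundTrip(damaged).equals(damaged)); // true
    }
}
```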
