Java read file with strings from different languages
I made a program that reads several text files and combines them into a .csv file. It is a .csv file with translations into English, Dutch, French, Italian, Portuguese and Spanish.
Now here is my problem:
In the end I get a nicely filled .csv file with all the translations together. I read the files as UTF-8, and all the languages are displayed correctly except for French. Some characters are shown as question marks, like this: "Mis ? jour", when it should be "Mis à jour".
Here is the method that reads the files for the different languages and creates objects from them, so I can sort them and put them in the right spot in the .csv file.
The files are filled like this:
To Airport;A l'aéroport
Today;Aujourd'hui
public static Language getTranslations(String inputFileName) {
    Language language = new Language();
    // Decode the file as UTF-8; try-with-resources closes the reader
    try (BufferedReader br = new BufferedReader(
            new InputStreamReader(new FileInputStream(inputFileName), "UTF-8"))) {
        String strLine;
        // Read the file line by line
        while ((strLine = br.readLine()) != null) {
            // Each line has the form "key;translation"
            String[] values = strLine.split(";");
            if (values.length == 2) {
                language.putTranslationItem(values[0], values[1]);
            }
        }
    } catch (FileNotFoundException e) {
        // Report the problem instead of silently ignoring it
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return language;
}
I hope somebody can help out!
Thanks
I am not completely sure about this, but you can try converting the values[0] and values[1] strings into byte arrays:
byte[] value_0_utfString = values[0].getBytes("UTF-8");
byte[] value_1_utfString = values[1].getBytes("UTF-8");
and then convert them back into strings:
String str_0 = new String(value_0_utfString, "UTF-8");
String str_1 = new String(value_1_utfString, "UTF-8");
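For anyone who wants to try this, here is a minimal runnable sketch of that round trip. The class name `Utf8RoundTrip` and the `roundTrip` helper are made up for illustration and are not part of the original program. Keep in mind that a Java `String` that has already been decoded holds Unicode characters, so encoding it to UTF-8 and decoding it again gives back an equal `String`; the sketch only checks that.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not from the original program.
final class Utf8RoundTrip {

    // Encode a String to UTF-8 bytes, then decode those bytes back to a String.
    static String roundTrip(String value) {
        byte[] utf8Bytes = value.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Same "key;translation" layout as the input files in the question.
        String[] values = "Today;Aujourd'hui".split(";");
        String str0 = roundTrip(values[0]);
        String str1 = roundTrip(values[1]);

        // The round trip leaves correctly decoded text unchanged.
        System.out.println(str0.equals(values[0]) && str1.equals(values[1])); // prints "true"
    }
}
```

If a character was already lost while the file was being read, this round trip would not be expected to bring it back.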
I am not sure whether this is the right or most efficient way, but since a single line consists of both English and French, I thought splitting and encoding might help. I haven't tried this myself.
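One possible explanation worth checking, which the question does not confirm, is that the French file is not actually stored as UTF-8 but in a single-byte encoding such as ISO-8859-1. The sketch below assumes exactly that and shows how "Mis à jour" could end up as "Mis ? jour": decoding the byte for "à" as UTF-8 produces the replacement character U+FFFD, and writing that character to an encoding that cannot represent it substitutes "?". `CharsetMismatchDemo` and `decode` are hypothetical names used only for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; assumes the French file is ISO-8859-1, which is unconfirmed.
final class CharsetMismatchDemo {

    // Decode raw file bytes with the given charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "Mis a-grave jour" as stored in an ISO-8859-1 file: the accented letter is the single byte 0xE0.
        byte[] latin1Bytes = "Mis \u00e0 jour".getBytes(StandardCharsets.ISO_8859_1);

        // Wrong charset: 0xE0 followed by a space is not valid UTF-8, so it becomes U+FFFD.
        String wrong = decode(latin1Bytes, StandardCharsets.UTF_8);
        // Matching charset: the text is recovered intact.
        String right = decode(latin1Bytes, StandardCharsets.ISO_8859_1);

        // Writing U+FFFD to an encoding that lacks it substitutes '?'.
        String asWritten = new String(
                wrong.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.ISO_8859_1);

        System.out.println(wrong.indexOf('\uFFFD') >= 0);     // true: replacement character present
        System.out.println(right.equals("Mis \u00e0 jour"));  // true: decoded correctly
        System.out.println(asWritten);                        // Mis ? jour
    }
}
```

If this turns out to be the cause, passing the file's real charset name to `InputStreamReader` in `getTranslations` instead of "UTF-8", or re-saving the French file as UTF-8, would be the thing to try.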