Here is the test.properties file.
mycharacters=ýþÿƛƸ
myotherchars=\u00FD\u00FE\u00FF\u019B\u01B8
Here is the code being used:
import java.awt.FlowLayout;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.ResourceBundle;
import javax.swing.*;

public class MultiByteTest2
{
    public MultiByteTest2()
    {
        // Loads test.properties from the classpath.
        ResourceBundle bundle = ResourceBundle.getBundle("test");
        JFrame frame = new JFrame("MultiByte Test");
        JPanel panel = new JPanel();
        panel.setLayout(new FlowLayout());
        // label1 uses the raw characters, label2 the u-escaped ones.
        JLabel label1 = new JLabel(bundle.getString("mycharacters"));
        JLabel label2 = new JLabel(" --- " + bundle.getString("myotherchars"));
        panel.add(label1);
        panel.add(label2);
        String defaultCharacterEncoding = System.getProperty("file.encoding");
        System.out.println("defaultCharacterEncoding by property: " + defaultCharacterEncoding);
        System.out.println("defaultCharacterEncoding by code: " + getDefaultCharEncoding());
        System.out.println("defaultCharacterEncoding by charSet: " + Charset.defaultCharset());
        frame.add(panel);
        frame.setSize(300, 300);
        frame.setLocationRelativeTo(null);
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.setVisible(true);
    }

    public static void main(String[] args)
    {
        new MultiByteTest2();
    }

    // Reports the encoding an InputStreamReader picks when none is specified.
    public static String getDefaultCharEncoding()
    {
        byte[] bArray = {'w'};
        InputStream is = new ByteArrayInputStream(bArray);
        InputStreamReader reader = new InputStreamReader(is);
        return reader.getEncoding();
    }
}
Here is the command used to run the code, and its output, which shows UTF-8 being used:
>java -Dfile.encoding=UTF-8 MultiByteTest2
Picked up _JAVA_OPTIONS: -Dfile.encoding=UTF-8
defaultCharacterEncoding by property: UTF-8
defaultCharacterEncoding by code: UTF8
defaultCharacterEncoding by charSet: UTF-8
Three questions:
Why does using the actual characters result in a mess of characters being output?
Why does using the Unicode representation work?
The output shows UTF-8 instead of cp1252, which indicates that file.encoding is being used. Why does that not help when the actual characters are in the properties file?
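*.properties files are read as ISO-8859-1 (Latin-1), regardless of file.encoding. This is a very old design decision. If the file is saved as UTF-8, each byte of a multi-byte character is decoded as a separate Latin-1 character, which produces the mess you see. The \uXXXX escapes are plain ASCII and are decoded by the properties parser itself, so any Unicode character can be read that way. (Since Java 9, ResourceBundle reads *.properties files as UTF-8 by default; Properties.load(InputStream) still uses ISO-8859-1.)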
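To make the difference concrete, here is a minimal sketch that loads the same bytes twice: once through Properties.load(InputStream), which always decodes as ISO-8859-1, and once through Properties.load(Reader) with an explicit UTF-8 reader. The class name PropertiesEncodingDemo and its helper methods are made up for illustration, and the sketch assumes the file was saved as UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class, not part of the original question.
public class PropertiesEncodingDemo {

    static final String EXPECTED = "\u00FD\u00FE\u00FF\u019B\u01B8";

    // Simulates test.properties saved as UTF-8 (assumption).
    static byte[] sampleFileBytes() {
        String content = "mycharacters=" + EXPECTED + "\n"
                       + "myotherchars=\\u00FD\\u00FE\\u00FF\\u019B\\u01B8\n";
        return content.getBytes(StandardCharsets.UTF_8);
    }

    // Properties.load(InputStream) decodes the bytes as ISO-8859-1.
    static String loadAsStream(byte[] fileBytes, String key) {
        try {
            Properties props = new Properties();
            props.load(new ByteArrayInputStream(fileBytes));
            return props.getProperty(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Properties.load(Reader) uses whatever charset the Reader was given.
    static String loadAsUtf8(byte[] fileBytes, String key) {
        try {
            Properties props = new Properties();
            props.load(new InputStreamReader(
                    new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8));
            return props.getProperty(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = sampleFileBytes();
        System.out.println("raw chars via InputStream matches: "
                + EXPECTED.equals(loadAsStream(bytes, "mycharacters")));
        System.out.println("escaped chars via InputStream matches: "
                + EXPECTED.equals(loadAsStream(bytes, "myotherchars")));
        System.out.println("raw chars via UTF-8 Reader matches: "
                + EXPECTED.equals(loadAsUtf8(bytes, "mycharacters")));
    }
}
```

With the stream variant, the five raw characters come back as ten Latin-1 characters, because each one occupies two bytes in UTF-8. The escaped key is unaffected by the encoding either way.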
I think the cleanest solution would be to use the Properties class directly, perhaps with XML properties (loadFromXML). The XML file could also be kept outside the application, which can be useful for internationalisation.
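As a rough sketch of the XML route: Properties.loadFromXML honours the encoding declared in the XML prolog, so a UTF-8 file can hold the characters directly. The class name XmlPropertiesDemo and the inline XML text are assumptions made for illustration; in practice the XML would live in a file. The characters are written as escapes in the Java source only so that the example compiles regardless of the compiler's source encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class, not part of the original question.
public class XmlPropertiesDemo {

    // Stand-in for an external XML properties file saved as UTF-8.
    static final String XML =
          "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<!DOCTYPE properties SYSTEM \"http://java.sun.com/dtd/properties.dtd\">\n"
        + "<properties>\n"
        + "  <entry key=\"mycharacters\">\u00FD\u00FE\u00FF\u019B\u01B8</entry>\n"
        + "</properties>\n";

    // Parses the XML and returns the value stored under the given key.
    static String load(String key) {
        try {
            Properties props = new Properties();
            props.loadFromXML(
                    new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
            return props.getProperty(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String value = load("mycharacters");
        System.out.println("characters read: " + value.length());
    }
}
```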
In a Maven build, one could also convert UTF-8 *.properties files to u-escaped *.properties files before packaging. This is a Maven copy with filtering.
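What such a conversion step has to do can be sketched in a few lines of Java. This is not the Maven configuration itself, just an illustration of the transformation; the class UnicodeEscaper and its method are hypothetical names.

```java
// Hypothetical helper, shown only to illustrate the conversion.
public class UnicodeEscaper {

    // Replaces every non-ASCII char with its backslash-u escape,
    // leaving ASCII text (keys, separators, line breaks) untouched.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04X", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String line = "mycharacters=\u00FD\u00FE\u00FF\u019B\u01B8";
        System.out.println(escapeNonAscii(line));
    }
}
```

Working on char values rather than code points means a character outside the Basic Multilingual Plane becomes two escapes (a surrogate pair), which is the form the properties parser expects.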
Instead of *.properties files (a PropertyResourceBundle), you could also use a ListResourceBundle: a Java class containing an array of texts. The resource path passed to ResourceBundle may differ slightly with respect to period versus slash, but this frees you from the properties encoding, because the class is compiled with the IDE's project encoding.
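A minimal sketch of the ListResourceBundle variant follows. The class name TestBundle is an assumption; it is looked up by its class name rather than by a file name. The strings are written as escapes here only so the example compiles under any source encoding; with the compiler's encoding set to match the source file, the literal characters could be typed directly.

```java
import java.util.ListResourceBundle;
import java.util.ResourceBundle;

// Hypothetical bundle class standing in for test.properties.
public class TestBundle extends ListResourceBundle {

    @Override
    protected Object[][] getContents() {
        // Each row is a key/value pair, like one line of a properties file.
        return new Object[][] {
            {"mycharacters", "\u00FD\u00FE\u00FF\u019B\u01B8"},
        };
    }

    public static void main(String[] args) {
        // Looked up by class name; no properties file is parsed.
        ResourceBundle bundle = ResourceBundle.getBundle(TestBundle.class.getName());
        System.out.println("characters read: "
                + bundle.getString("mycharacters").length());
    }
}
```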