In UTF-8 encoded code, use a string with accented characters taken from a file encoded in ISO-8859-1

Very similar questions have been asked but I couldn't find a solution to my problem.

I have a properties file, config.properties, encoded in ISO-8859-1, with the following content:

config1 = some value with âccénted characters

I have a class that loads the properties and exposes a method to get a property value:

import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class EnvConfig {
    private static final Properties properties = new Properties();

    static {
        initPropertiesFromFile();
    }

    private static void initPropertiesFromFile() {
        // try-with-resources closes the stream even if load() throws,
        // and skips close() when the resource was not found (stream == null)
        try (InputStream stream = EnvConfig.class.getResourceAsStream("/config/config.properties")) {
            properties.load(new InputStreamReader(stream, StandardCharsets.ISO_8859_1));
            // Tried that as well instead of the previous line: properties.load(stream);
        } catch (Exception e) {
            // Do something
        }
    }

    public static String getProperty(String key, String defaultValue) {
        try {
            System.out.println(Charset.defaultCharset()); // Prints UTF-8
            // return new String(properties.getProperty(key).getBytes("ISO-8859-1")); // Returns some value with �cc�nted characters
            // return new String(properties.getProperty(key).getBytes("UTF-8")); // Returns some value with �cc�nted characters
            // return new String(properties.getProperty(key).getBytes("ISO-8859-1"), "UTF-8"); // Returns some value with �cc�nted characters
            return properties.getProperty(key, defaultValue); // Returns some value with �cc�nted characters
        } catch (Exception e) {
            // Do something
            return defaultValue;
        }
    }
}

Other code consumes the property value as a String and needs it with the accents intact: some value with âccénted characters

public void doSomething() {
    ...
    EnvConfig.getProperty("config1", null); // I need the exact same value as configured in the properties file: some value with âccénted characters; currently I get some value with �cc�nted characters
    ...
}

The project is in UTF-8 (Java files are encoded in UTF-8) and project properties/settings (pom) are set to UTF-8.

What am I missing, and how can I achieve this? I know there is no such thing as a "String in UTF-8 format", since a String is just a sequence of UTF-16 code units. But how can I get the same usable value in my UTF-8 encoded code/project, that is, the String with its accents exactly as configured in the ISO-8859-1 encoded properties file?
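As a side note on why none of the getBytes variants above could help: once ISO-8859-1 bytes have been decoded as UTF-8 somewhere upstream, the accented characters have already been replaced by U+FFFD and the original bytes are gone. The following is a minimal sketch of that effect, not the actual code path in my project; the class and method names (MojibakeDemo, decodeLatin1BytesAsUtf8, attemptRepair) are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encode the text as ISO-8859-1, then decode those bytes with the wrong
    // charset (UTF-8), as a tool that assumes UTF-8 input would do.
    static String decodeLatin1BytesAsUtf8(String original) {
        byte[] latin1 = original.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    // Same shape as one of the commented-out attempts in getProperty():
    // re-encode the damaged String and decode it again.
    static String attemptRepair(String damaged) {
        byte[] bytes = damaged.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unicode escapes for "âccénted", so this file's own encoding does not matter
        String original = "\u00E2cc\u00E9nted";
        String damaged = decodeLatin1BytesAsUtf8(original);

        // The lone bytes 0xE2 and 0xE9 are not valid UTF-8, so the decoder
        // substitutes the replacement character U+FFFD for them.
        System.out.println(damaged.indexOf('\uFFFD') >= 0);

        // The original bytes were discarded during decoding, so re-encoding
        // cannot bring the accents back.
        System.out.println(original.equals(attemptRepair(damaged)));
    }
}
```

So the fix has to happen where the bytes are first misread, not in getProperty().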

After hours of searching, it turns out that my encoding issue is caused by resource filtering being set to true in the project's POM:

    <resources>
        <resource>
            <directory>src/main/resources</directory>
            <filtering>true</filtering>
        </resource>
    </resources>

Setting this to false fixes the issue. I still need to find a way to make it work with filtering enabled, so I'll try to figure that out. There are some clues in other questions/answers, such as "Wrong encoding after activating resource filtering". Thanks.
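One option that may work with filtering left on, which I have not verified in this project: tell the resources plugin to treat properties files as ISO-8859-1 instead of the project encoding. This sketch assumes maven-resources-plugin 3.2.0 or later, where the propertiesEncoding parameter is available; the version number and the surrounding build layout are assumptions, not taken from my actual POM.

```xml
<build>
    <resources>
        <resource>
            <directory>src/main/resources</directory>
            <filtering>true</filtering>
        </resource>
    </resources>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-resources-plugin</artifactId>
            <!-- Assumed version: propertiesEncoding requires 3.2.0 or later -->
            <version>3.2.0</version>
            <configuration>
                <!-- Read and write filtered *.properties files as ISO-8859-1 -->
                <propertiesEncoding>ISO-8859-1</propertiesEncoding>
            </configuration>
        </plugin>
    </plugins>
</build>
```

With this in place the rest of the project can stay on UTF-8, since the setting only affects properties files during filtering.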
