
How do I replace illegal characters in a filename?

I am trying to create a zip with folders inside it and I have to sanitize the folder names against any illegal characters. I did some googling around and found this method from http://www.rgagnon.com/javadetails/java-0662.html :

public static String sanitizeFilename(String name) {
    return name.replaceAll("[\\\\/:*?\"<>|]", "-");
}

However, upon testing I get some weird results. For example:

name = filename£/?e>"e

should return filename£--e--e from my understanding. But instead it returns filename-ú--e--e

Why is this so?

Please note that I am testing this by opening the downloaded zip file in WinZip and looking at the folder name that is created. I can't get the pound sign to appear. I've also tried this:

public static String sanitizeFilename(String name) {
    name = name.replaceAll("[£]", "\u00A3");
    return name.replaceAll("[\\\\/:*?\"<>|]", "-");
}

EDIT: After some more research I found this: http://illegalargumentexception.blogspot.co.uk/2009/04/i18n-unicode-at-windows-command-prompt.html It appears to have to do with the locale, the Windows version, and encoding. I'm not sure how I can overcome this within the code.

I think it depends on which encoding is used when the file name is actually read.

If that encoding differs from the one the name was written with, the £ symbol can get corrupted.

As an example that does not fit your case exactly: reading a UTF-8-encoded £ as ISO Latin 1 would return Â£, because the two bytes of the UTF-8 encoding (0xC2 0xA3) are each decoded as a separate character.
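To see that effect in isolation, here is a minimal sketch that encodes the pound sign with one charset and decodes the bytes with another. The class name MojibakeDemo and the helper reinterpret are made up for illustration; only the standard String and StandardCharsets APIs are used.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {

    // Hypothetical helper: encode text with one charset, then decode the same bytes with another.
    static String reinterpret(String text, Charset writtenAs, Charset readAs) {
        return new String(text.getBytes(writtenAs), readAs);
    }

    public static void main(String[] args) {
        String pound = "\u00A3"; // the pound sign
        String misread = reinterpret(pound, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        // Print code points instead of glyphs so the console's own encoding cannot interfere.
        for (char c : misread.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

This prints U+00C2 followed by U+00A3: one character went in, two came out. A tool that decodes the same two bytes with yet another code page would show a different pair of characters, which may be what is happening with the folder name in the zip.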

Check the file's encoding (ISO Latin 1 and UTF-8 are the most common), then pass the matching charset to your Reader.

As a snippet, you may want to consider this example:

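import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;

// Replace the placeholder with the file's actual charset name, e.g. "UTF-8" or "ISO-8859-1".
BufferedReader br = new BufferedReader(
    new InputStreamReader(
        new FileInputStream("yourTextFile"),
        "[your file's encoding]"
    )
);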
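The same idea applies on the writing side. Since Java 7, ZipOutputStream and ZipInputStream both have a constructor that takes a Charset for entry names, and the single-argument ZipOutputStream constructor uses UTF-8. The sketch below is only an illustration under assumptions: the class ZipNameCharsetDemo and its helpers are hypothetical, the zip is built in memory, and whether WinZip then displays the name correctly depends on which encoding that tool expects, which I have not verified.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ZipNameCharsetDemo {

    // Same character class as the sanitizeFilename method in the question.
    static String sanitizeFilename(String name) {
        return name.replaceAll("[\\\\/:*?\"<>|]", "-");
    }

    // Builds an in-memory zip with one folder entry whose name is encoded with the given charset.
    static byte[] zipWithFolder(String folderName, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes, charset)) {
            // A trailing slash marks the entry as a directory.
            zip.putNextEntry(new ZipEntry(sanitizeFilename(folderName) + "/"));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the first entry name back, decoding it with the given charset.
    static String firstEntryName(byte[] zipBytes, Charset charset) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes), charset)) {
            ZipEntry entry = zip.getNextEntry();
            return entry == null ? null : entry.getName();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Charset charset = StandardCharsets.UTF_8; // assumption: pick whatever the consuming tool expects
        byte[] zipBytes = zipWithFolder("filename\u00A3/?e>\"e", charset);
        String name = firstEntryName(zipBytes, charset);
        // Compare against the expected name instead of printing glyphs to the console.
        System.out.println(name.equals("filename\u00A3--e--e/"));
    }
}
```

When the writer and the reader agree on the charset, the pound sign survives and only the illegal characters are replaced, so the regex in sanitizeFilename is not what corrupts the name.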
