Can't use Latin characters correctly when creating new files from within Java: filenames get weird characters instead of the correct ones

I am currently saving each int[] from a HashMap to a file named after the key that maps to that int[]. This exact key must be reachable from another program, so I can't switch the file names to English-only characters. But even though I use ISO_8859_1 as the charset for the file names, the names come out garbled in the file tree. The English letters are correct, but the special ones are not.

        /**
        * Save array to file
        */
        public void saveStatus(){
            try {
                for(String currentKey : hmap.keySet()) {
                    byte[] currentKeyByteArray = currentKey.getBytes();
                    String bytesString = new String(currentKeyByteArray, StandardCharsets.ISO_8859_1);
                    String fileLocation = "/var/tmp/" + bytesString + ".dat";
                    FileOutputStream saveFile = new FileOutputStream(fileLocation);
                    ObjectOutputStream out = new ObjectOutputStream(saveFile);
                    out.writeObject(hmap.get(currentKey));
                    out.close();
                    saveFile.close();
                    System.out.println("Saved file at " + fileLocation);
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

Could this have to do with how Linux encodes characters, or is it more likely a problem in the Java code?

EDIT

I think the problem lies with the OS, because the problem is the same when looking at text files with cat, for example. However, vim is able to decode the letters correctly. In that case, would I perhaps have to change the language settings of the terminal?

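You have to specify the charset in the getBytes call as well. Without an argument, getBytes() uses the platform default charset, which may differ from the ISO_8859_1 you then decode with.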

currentKey.getBytes(StandardCharsets.ISO_8859_1);

Also, why are you using StandardCharsets.ISO_8859_1? To accept a wider range of characters, use StandardCharsets.UTF_8.
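To illustrate why the mismatch matters, here is a minimal sketch. It assumes the platform default charset is UTF-8 (common on Linux, but not guaranteed), and the class name, the recode helper, and the sample key are made up for the example. The text is encoded with one charset and decoded with another, which is what the question's code does when the default is not ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {

    /** Encodes text with one charset, then decodes the resulting bytes with another. */
    static String recode(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        // Hypothetical key containing three non-ASCII Latin letters (a-ring, a-umlaut, o-umlaut).
        String key = "\u00e5\u00e4\u00f6";

        // Mismatch: UTF-8 uses two bytes per letter here, and ISO-8859-1
        // decodes every byte as its own character, so the string doubles in length.
        String garbled = recode(key, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length()); // 6

        // Same charset on both sides: the key survives unchanged.
        String intact = recode(key, StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1);
        System.out.println(intact.equals(key)); // true
    }
}
```

The ASCII letters are encoded identically in both charsets, which would explain why only the special characters break.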

The valid characters of a filename or path vary depending on the file system used. While it should be possible to just use a Java string as a filename (as long as it does not contain characters that are invalid in the given file system), there might be interoperability issues and bugs.

In other words, leave out all the Charset magic, as @RealSkeptic recommends, and it should work. But changing the environment might result in unexpected behavior.
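A sketch of what leaving out the charset conversion could look like: the key is used unchanged as the file name. The class and method names and the directory parameter are assumptions for illustration, and the load method is only there to show the round trip. How the JVM maps non-ASCII names to bytes on disk still depends on the environment (for example the locale the JVM was started under), so this is not a guarantee that the names display correctly everywhere.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

public class StatusFiles {

    /** Writes each array to <dir>/<key>.dat, using the key unchanged as the file name. */
    static void saveAll(Map<String, int[]> hmap, Path dir) throws IOException {
        for (Map.Entry<String, int[]> entry : hmap.entrySet()) {
            // No getBytes / new String round trip: the String is passed straight through.
            Path file = dir.resolve(entry.getKey() + ".dat");
            // try-with-resources closes the stream even if writeObject throws.
            try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
                out.writeObject(entry.getValue());
            }
        }
    }

    /** Reads the array stored for the given key back from <dir>/<key>.dat. */
    static int[] load(String key, Path dir) throws IOException, ClassNotFoundException {
        Path file = dir.resolve(key + ".dat");
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (int[]) in.readObject();
        }
    }
}
```

With the question's setup, the call would be something like `StatusFiles.saveAll(hmap, Paths.get("/var/tmp"))`.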

Depending on your requirements, you might therefore want to encode the key so that the file name only uses a reduced character set. One variant of Base64 might work (assuming your file system is case sensitive!). You might even find a library (Apache Commons?) that offers a function to reduce a string to characters that are safe to use in a file name.
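As a rough sketch of the Base64 idea, the JDK's URL-safe encoder (java.util.Base64, available since Java 8) produces only letters, digits, '-' and '_', all of which are safe in file names on common file systems. The class name, the method names, and the ".dat" suffix handling are assumptions for illustration. The other program would need to apply the same encoding to find the file.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class KeyFileNames {

    private static final String SUFFIX = ".dat";

    /** Turns an arbitrary key into an ASCII-only file name. */
    static String toFileName(String key) {
        byte[] utf8 = key.getBytes(StandardCharsets.UTF_8);
        // URL-safe alphabet avoids '/' and '+'; padding '=' is dropped for tidier names.
        return Base64.getUrlEncoder().withoutPadding().encodeToString(utf8) + SUFFIX;
    }

    /** Recovers the original key from a file name produced by toFileName. */
    static String toKey(String fileName) {
        String encoded = fileName.substring(0, fileName.length() - SUFFIX.length());
        byte[] utf8 = Base64.getUrlDecoder().decode(encoded);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical key containing three non-ASCII Latin letters.
        String key = "\u00e5\u00e4\u00f6";
        String name = toFileName(key);
        System.out.println(name);
        System.out.println(toKey(name).equals(key)); // true
    }
}
```

Because Base64 distinguishes upper and lower case, two different keys could collide on a case-insensitive file system. A hex encoding of the UTF-8 bytes would avoid that at the cost of longer names.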
