
Read and write file with windows-1252

I'm trying to write a file containing some German characters to disk and read it back using Windows-1252 encoding. I don't understand why, but my output looks like this:

<title>W�hrend und im Anschluss an die Exkursion stehen Ihnen die Ansprechpartner f�r O-T�ne</title>

<p>Die Themen im �berblick</p>

Any thoughts? Here is my code. You'll need spring-core and commons-io to run it.

private static void write(String fileName, Charset charset) throws IOException {
    String html = "<html xmlns=\"http://www.w3.org/1999/xhtml\">" +
                  "<head>" +
                  "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=windows-1252\">" +
                  "<title>Während und im Anschluss an die Exkursion stehen Ihnen die Ansprechpartner für O-Töne</title>" +
                  "</head>" +
                  "<body>" +
                  "<p>Die Themen im Überblick</p>" +
                  "</body>" +
                  "</html>";

    byte[] bytes = html.getBytes(charset);
    FileOutputStream outputStream = new FileOutputStream(fileName);
    OutputStreamWriter writer = new OutputStreamWriter(outputStream, charset);
    IOUtils.write(bytes, writer);
    writer.close();
    outputStream.close();
}

private static void read(String file, Charset windowsCharset) throws IOException {
    ClassPathResource pathResource = new ClassPathResource(file);
    String string = IOUtils.toString(pathResource.getInputStream(), windowsCharset);
    System.out.println(string);
}

public static void main(String[] args) throws IOException {
    Charset windowsCharset = Charset.forName("windows-1252");
    String file = "test.txt";
    write(file, windowsCharset);
    read(file, windowsCharset);
}

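Your write method is wrong. You are using a writer to write bytes. A writer should be used for writing characters or strings. To pass a byte[] to a Writer, IOUtils.write(byte[], Writer) first has to decode the bytes back into characters, and it does that with the platform default charset, not with Windows-1252.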

You already encoded the string into bytes with the line:

byte[] bytes = html.getBytes(charset);

These bytes can simply be written to an output stream:

IOUtils.write(bytes, outputStream);

This makes the writer unnecessary (remove it), and you will now get the correct output.
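The difference between the two paths can be reproduced with the JDK alone. The following is only a sketch under stated assumptions: it assumes the platform default charset is UTF-8 and imitates the writer's byte-to-character step with new String(bytes, UTF_8). The class name WriterVersusStream and both method names are made up for this illustration and do not come from commons-io.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WriterVersusStream {
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Imitates pushing already-encoded bytes through a Writer:
    // the bytes are decoded with the (assumed) default charset UTF-8,
    // then the resulting characters are encoded again as windows-1252.
    static String throughWriter(String text) {
        byte[] encoded = text.getBytes(CP1252);
        String misdecoded = new String(encoded, StandardCharsets.UTF_8); // assumption: default is UTF-8
        byte[] reencoded = misdecoded.getBytes(CP1252);
        return new String(reencoded, CP1252);
    }

    // Writes the encoded bytes to a stream unchanged, as the answer suggests.
    static String throughStream(String text) throws IOException {
        byte[] encoded = text.getBytes(CP1252);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(encoded);
        return new String(out.toByteArray(), CP1252);
    }

    public static void main(String[] args) throws IOException {
        String text = "W\u00E4hrend"; // escape keeps the demo independent of source encoding
        System.out.println("via writer: " + throughWriter(text));
        System.out.println("via stream: " + throughStream(text));
    }
}
```

Only the stream path leaves the bytes untouched, so only that path returns the original text when the result is read back as windows-1252.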

First ensure that the compiler and editor use the same encoding. This can be checked by trying the (ugly) \uXXXX escaping:

während
w\u00E4hrend
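One possible way to run that check is to print the UTF-16 code units of both spellings and compare them. This is a sketch, not part of the original answer; SourceEncodingCheck and codeUnits are hypothetical names. If the compiler reads the source file with the wrong encoding, the literal line will differ from the escaped line.

```java
public class SourceEncodingCheck {
    // Returns the UTF-16 code units of s as four-digit hex values separated by spaces.
    static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String literal = "während";       // depends on how the compiler decodes this file
        String escaped = "w\u00E4hrend";  // independent of the source file encoding
        System.out.println("literal: " + codeUnits(literal));
        System.out.println("escaped: " + codeUnits(escaped));
        System.out.println("match:   " + literal.equals(escaped));
    }
}
```

If the two lines differ, passing the editor's encoding to the compiler (for example with javac's -encoding option) is the usual remedy.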

Then:

    "<meta http-equiv='Content-Type' content='text/html; charset="
    + charset.name() + "' />" +

    byte[] bytes = html.getBytes(charset);
    Files.write(Paths.get(fileName), bytes);

Also check that the file itself is in Windows-1252. A programmer's editor such as Notepad++ or jEdit lets you inspect and change a file's encoding.
