
How to generate a .txt file as UTF-8 encoded?

I want to write Java code that produces a UTF-8 encoded file. I am creating a file "a.txt" that contains only English characters. When I generate it, I get an ANSI-encoded version, but I need a UTF-8 encoded version.

Note: the file does not contain any special characters; it contains only ASCII values.

I have written the code below.

writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(file), "UTF-8")); 
writer.write(content);

There is no such thing as ANSI encoding (it is an imprecise term for single-byte character sets whose first 128 characters are ASCII), and if your file only contains ASCII (characters 0 - 127), then it doesn't matter whether you use UTF-8 or one of the 'ANSI' encodings.

Editors infer the character set (or rather: guess!), and UTF-8 that contains only ASCII 0 - 127 is indistinguishable from actual ASCII or from one of the 'ANSI' encodings, so this result is entirely expected.

This means that if you write a file containing, for example, only "ABC" in UTF-8, it is also valid ASCII, Windows-1252, ISO-8859-x, and any other character set that takes ASCII as its starting point. The editor cannot decide what the actual character set is, and just reports ANSI.
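The point about "ABC" can be checked directly. The following is a minimal sketch, not part of the original answer; the class name AsciiBytesDemo and the method sameBytesEverywhere are made up for illustration. It compares the bytes a string encodes to under the three ASCII-compatible character sets that every JDK is required to support (Windows-1252 is left out only because it is not in that guaranteed set).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: shows that ASCII-only text yields identical bytes
// in several ASCII-compatible character sets.
class AsciiBytesDemo {

    // Character sets that every Java platform must provide; all three extend ASCII.
    static final Charset[] CHARSETS = {
        StandardCharsets.US_ASCII, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8
    };

    // Returns true if the text encodes to exactly the same bytes in every charset above.
    public static boolean sameBytesEverywhere(String text) {
        byte[] reference = text.getBytes(CHARSETS[0]);
        for (Charset charset : CHARSETS) {
            if (!Arrays.equals(reference, text.getBytes(charset))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameBytesEverywhere("ABC"));       // ASCII only: prints true
        System.out.println(sameBytesEverywhere("caf\u00e9")); // contains an accented e: prints false
    }
}
```

Since the bytes on disk are the same, an editor has nothing to tell the encodings apart until the file contains at least one character outside the ASCII range.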

In other words: your code is working fine; it is just the heuristics of your editor that guess the wrong character set. In the end, text files are just a stream of bytes that only get meaning when the right character set is applied; the character set is not specified in the file itself.
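To see that the bytes only get meaning once a character set is applied, here is a rough sketch under stated assumptions: the class FileCharsetDemo and its two helper methods are hypothetical, and the write path only mirrors the OutputStreamWriter approach from the question (using a temporary file and StandardCharsets.UTF_8 instead of the "UTF-8" string). It writes text as UTF-8, reads the raw bytes back, and decodes them under two different character sets.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: the file stores bytes only, not the name of a character set.
class FileCharsetDemo {

    // Writes content as UTF-8 to a temporary file and returns the raw bytes found on disk.
    public static byte[] writeAndReadBack(String content) {
        try {
            Path file = Files.createTempFile("a", ".txt");
            try (Writer writer = new BufferedWriter(
                    new OutputStreamWriter(Files.newOutputStream(file), StandardCharsets.UTF_8))) {
                writer.write(content);
            }
            byte[] bytes = Files.readAllBytes(file);
            Files.delete(file);
            return bytes;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Interprets the given bytes under the given character set.
    public static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // ASCII-only content: both interpretations give the same text.
        byte[] ascii = writeAndReadBack("ABC");
        System.out.println(decode(ascii, StandardCharsets.UTF_8));
        System.out.println(decode(ascii, StandardCharsets.ISO_8859_1));

        // Non-ASCII content: the interpretations diverge.
        byte[] accented = writeAndReadBack("caf\u00e9");
        System.out.println(decode(accented, StandardCharsets.UTF_8).length());      // 4 characters
        System.out.println(decode(accented, StandardCharsets.ISO_8859_1).length()); // 5 characters
    }
}
```

For "ABC" the file holds the three bytes 65, 66, 67 and reads the same either way. For the accented text, UTF-8 stores the last character as two bytes, so decoding the same file as ISO-8859-1 produces two characters in its place. Nothing in the file says which reading is the intended one.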

PS: The code in your question has a typo in that it references UTF-18, which does not exist.

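Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please credit this site's URL or the original address. For any questions, contact yoyou2525@163.com.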
