How do I convert a file from ASCII to UTF-8?

Question

I'm trying to transcode a bunch of files from ASCII to UTF-8.

For that, I tried using iconv:

iconv -f US-ASCII -t UTF-8 infile > outfile

-f ENCODING the encoding of the input

-t ENCODING the encoding of the output

Still, the file wasn't converted to UTF-8. It is a .dat file.

Before posting this, I searched Google and found information like:

ASCII is a subset of UTF-8, so all ASCII files are already UTF-8 encoded. The bytes in the ASCII file and the bytes that would result from "encoding it to UTF-8" would be exactly the same bytes. There's no difference between them.

Force encode from US-ASCII to UTF-8 (iconv)

Best way to convert text files between character sets?

Still, the above links didn't help.

I understand that an ASCII file is already valid UTF-8, since UTF-8 is a superset of ASCII, but the other party who is going to receive the files from me needs the file encoding to be UTF-8. He just needs the file format to be UTF-8.

Any suggestions, please?

Answer 1

I'm a little confused by the question, because, as you indicated, ASCII is a subset of UTF-8, so all ASCII files are already UTF-8 encoded.
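A quick way to see this for yourself is sketched below. It assumes iconv and cmp are installed, and the file names ascii.txt and utf8.txt are made up for illustration.

```shell
# Write a small file that contains only ASCII characters (hypothetical name).
printf 'Stuff\n' > ascii.txt

# "Convert" it from US-ASCII to UTF-8.
iconv -f US-ASCII -t UTF-8 ascii.txt > utf8.txt

# cmp exits with status 0 when the two files are byte-for-byte identical.
if cmp ascii.txt utf8.txt; then
    echo "identical"
fi
```

If the input really is pure ASCII, the conversion has nothing to change, which is why the output can look as if iconv did nothing.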

If you're sending files containing only ASCII characters to the other party, but the other party is complaining that they're not 'UTF-8 Encoded', then I would guess that they're referring to the fact that the ASCII file has no byte order mark explicitly indicating that the contents are UTF-8.

If that is indeed the case, then you can add a byte order mark using the answer here:

iconv: Converting from Windows ANSI to UTF-8 with BOM
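As a rough sketch of what adding a BOM amounts to (not necessarily the approach taken in the linked answer), the three bytes EF BB BF can be written first and the original content appended after them. The names infile and outfile are placeholders, and the first line only creates sample input for the demonstration.

```shell
# Sample input for the sketch; in practice infile would be your existing file.
printf 'Stuff\n' > infile

# Write the UTF-8 BOM (EF BB BF) using octal escapes, which printf handles portably.
printf '\357\273\277' > outfile

# Append the original content after the BOM.
cat infile >> outfile

# Dump the first three bytes in hex; they should be ef bb bf.
head -c 3 outfile | od -An -tx1
```

This assumes infile does not already start with a BOM; running it on a file that has one would add a second copy.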

If the other party indicates that he does not need the 'BOM' (Byte Order Mark), but is still complaining that the files are not UTF-8, then another possibility is that your initial file is not actually ASCII, but rather contains characters that are encoded using a Windows ANSI code page or ISO-8859-1.
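One way to test that possibility is to let iconv validate the input, as in the sketch below. The file latin1.dat is a made-up sample, and ISO-8859-1 is only an assumed source encoding; you would need to know or find out the real one for your files.

```shell
# Build a sample file with one non-ASCII byte: 0xE9 (octal 351) is "é" in ISO-8859-1.
printf 'caf\351\n' > latin1.dat

# When told the input is US-ASCII, iconv fails on bytes above 0x7F,
# so a non-zero exit status suggests the file is not pure ASCII.
if iconv -f US-ASCII -t UTF-8 latin1.dat > /dev/null 2>&1; then
    echo "pure ASCII"
else
    echo "contains non-ASCII bytes"
fi

# Assuming the real source encoding is ISO-8859-1, convert from that instead.
iconv -f ISO-8859-1 -t UTF-8 latin1.dat > latin1.utf8.dat

# In the UTF-8 output, "é" is stored as the two bytes c3 a9.
od -An -tx1 latin1.utf8.dat
```

Declaring the wrong source encoding can still produce valid UTF-8 with the wrong characters in it, so it is worth checking the converted text by eye.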

Edited to add the following experiment, after a comment from Ram about the other party checking the file type with the 'file' command:

Tims-MacBook-Pro:~ tjohns$ echo 'Stuff' > deleteme
Tims-MacBook-Pro:~ tjohns$ cat deleteme
Stuff
Tims-MacBook-Pro:~ tjohns$ file -I deleteme
deleteme: text/plain; charset=us-ascii
Tims-MacBook-Pro:~ tjohns$ echo -ne '\xEF\xBB\xBF' > deleteme
Tims-MacBook-Pro:~ tjohns$ echo 'Stuff' >> deleteme
Tims-MacBook-Pro:~ tjohns$ cat deleteme
Stuff
Tims-MacBook-Pro:~ tjohns$ file -I deleteme
deleteme: text/plain; charset=utf-8

Answer 1
12 2015-02-07 09:02:16
