C# File.ReadAllText does strange things

Question

What I'm trying to do is read all the text in a file and, if it contains the word "Share", run a regex against it. Here is the code:

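using System;
using System.IO;
using System.Text.RegularExpressions;

DirectoryInfo dinfo = new DirectoryInfo(@"C:\Documents and Settings\g\Desktop\123");
FileInfo[] Files = dinfo.GetFiles("*.txt");
foreach (FileInfo filex in Files)
{
    string contents = File.ReadAllText(filex.FullName);
    string matchingcontants = "Share";
    if (contents.Contains(matchingcontants))
    {
        string sharename = Regex.Match(contents, @"\+(\S*)(.)(.*)(.)").Groups[3].Value;
        // Environment.NewLine is used here because the verbatim string @"\r\n"
        // would append a literal backslash-r-backslash-n, not a line break.
        File.AppendAllText(@"C:\sharename.txt", sharename + Environment.NewLine);
    }
}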

When I debug, I get:

contents = "\\r\\0\\n\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0=\\0\\r\\0\\n\\0+\\0S\\0h\\0a\\0r\\0e\\0 \\0\\\\0\\\\0j\\05\\02\\0\\\\0w\\0w\\0w\\0_\\0O\\0n\\0t\\

So the text contains \\0S\\0h\\0a\\0r\\0e\\, not Share. Any hints, tips, or suggestions?

Answer 1

Looks like you've got a file which is saved as UTF-16 (i.e. Encoding.Unicode). Read the file with the right encoding, and all should be well.

Fortunately, there's an overload of File.ReadAllText which takes an encoding:

string contents = File.ReadAllText(filex.FullName, Encoding.Unicode);
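
To make the difference between the two overloads concrete, here is a minimal sketch. It assumes the file is UTF-16 little-endian with no byte order mark; the class name EncodingDemo, the temp file, and the sample text are made up for illustration and are not from the question.

```csharp
using System;
using System.IO;
using System.Text;

static class EncodingDemo
{
    // Writes sample text as UTF-16 LE bytes with no BOM, then reads it back
    // once without an explicit encoding and once with Encoding.Unicode.
    public static (string wrong, string right) RoundTrip()
    {
        string path = Path.GetTempFileName();
        try
        {
            // Encoding.GetBytes does not emit a byte order mark.
            byte[] bytes = Encoding.Unicode.GetBytes("+Share demo");
            File.WriteAllBytes(path, bytes);

            // No BOM to detect, so the bytes are decoded as UTF-8.
            string wrong = File.ReadAllText(path);
            // Explicit UTF-16 LE decoding.
            string right = File.ReadAllText(path, Encoding.Unicode);
            return (wrong, right);
        }
        finally
        {
            File.Delete(path);
        }
    }

    static void Main()
    {
        var (wrong, right) = RoundTrip();
        Console.WriteLine(wrong.Contains("Share"));  // False
        Console.WriteLine(right.Contains("Share"));  // True
    }
}
```

The first read keeps a '\0' after every character, which is why Contains("Share") fails on it.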

Unfortunately, that will then do the wrong thing for files which aren't in UTF-16. While there are heuristic ways of guessing the encoding, ideally you should know the encoding before you open the file.
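
As one example of such a heuristic, the sketch below first checks for a byte order mark and then, if none is found, looks for the every-other-byte-is-zero pattern that mostly-Latin UTF-16 text produces. This is only an illustration: EncodingGuess, Guess, and ReadAllTextGuessing are hypothetical names, the 1024-byte sample size and the "more than half" threshold are arbitrary assumptions, and the zero-byte check will misjudge binary files or UTF-16 text that is mostly non-Latin.

```csharp
using System;
using System.IO;
using System.Text;

static class EncodingGuess
{
    // Hypothetical helper: guesses an encoding from a BOM, then from zero-byte positions.
    public static Encoding Guess(byte[] data, Encoding fallback)
    {
        // 1. Byte order marks.
        if (data.Length >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
            return Encoding.UTF8;
        if (data.Length >= 2 && data[0] == 0xFF && data[1] == 0xFE)
            return Encoding.Unicode;            // UTF-16 LE
        if (data.Length >= 2 && data[0] == 0xFE && data[1] == 0xFF)
            return Encoding.BigEndianUnicode;   // UTF-16 BE

        // 2. No BOM: count zero bytes at even and odd offsets in a sample.
        int sample = Math.Min(data.Length, 1024) & ~1; // round down to an even count
        int zerosAtEven = 0, zerosAtOdd = 0;
        for (int i = 0; i < sample; i++)
        {
            if (data[i] != 0) continue;
            if ((i & 1) == 0) zerosAtEven++; else zerosAtOdd++;
        }
        int pairs = sample / 2;
        // Zeros only in the high byte of each pair suggests UTF-16 LE; only in the low-offset byte, BE.
        if (pairs > 0 && zerosAtOdd * 2 > pairs && zerosAtEven == 0) return Encoding.Unicode;
        if (pairs > 0 && zerosAtEven * 2 > pairs && zerosAtOdd == 0) return Encoding.BigEndianUnicode;

        return fallback;
    }

    // Reads a whole file using the guessed encoding, falling back to UTF-8.
    public static string ReadAllTextGuessing(string path)
    {
        byte[] data = File.ReadAllBytes(path);
        Encoding encoding = Guess(data, Encoding.UTF8);
        // GetString keeps a leading BOM as U+FEFF, so trim it.
        return encoding.GetString(data).TrimStart('\uFEFF');
    }

    static void Main()
    {
        byte[] utf16NoBom = Encoding.Unicode.GetBytes("+Share");
        Console.WriteLine(Guess(utf16NoBom, Encoding.UTF8).CodePage);                    // 1200 (UTF-16 LE)
        Console.WriteLine(Guess(Encoding.UTF8.GetBytes("+Share"), Encoding.UTF8).CodePage); // 65001 (UTF-8)
    }
}
```

A guess like this is a fallback for files of unknown origin; when the producer of the files is known, passing its encoding explicitly remains the more reliable option.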

Answer 2

Looks like it's a Unicode file and you're trying to read it as plain ASCII.
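
The effect can be reproduced in memory without any file. This is a small sketch with a made-up class name (MisdecodeDemo); it encodes a string as UTF-16 LE and then decodes those bytes as ASCII, which yields the same interleaved \0 pattern the debugger shows.

```csharp
using System;
using System.Text;

static class MisdecodeDemo
{
    // Encodes text as UTF-16 LE, then decodes the same bytes as if they were ASCII.
    public static string DecodeUtf16AsAscii(string text)
    {
        byte[] utf16Bytes = Encoding.Unicode.GetBytes(text);
        return Encoding.ASCII.GetString(utf16Bytes);
    }

    static void Main()
    {
        string garbled = DecodeUtf16AsAscii("Share");
        Console.WriteLine(garbled.Length);                // 10
        Console.WriteLine(garbled.Replace("\0", "\\0"));  // S\0h\0a\0r\0e\0
        Console.WriteLine(garbled.Contains("Share"));     // False
    }
}
```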

Answer 3

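My guess is that the encoding is not set correctly; you probably need to use the ReadAllText(String, Encoding) overload, which lets you specify the encoding.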

Answer 1: score 7, accepted, 2010-02-10 20:10:01
Answer 2: score 2, 2010-02-10 20:09:31
Answer 3: score 2, 2010-02-10 20:10:56
