C# Registry to XML: invalid character issue

I have a problem when trying to create an XML file from the registry. On my laptop (Windows 7, 64-bit) it works fine and the XML file is generated, but on another computer (Windows XP, 32-bit) an exception is thrown: System.ArgumentException: '.', hexadecimal value 0x00, is an invalid character. I have read a few useful things about it, but I don't know how to solve it in this case. Here is the code:

        try
        {

            string regPath = "SOFTWARE\\IPS";
            XElement xRegRoot = new XElement("Root", new XAttribute("Registry", regPath));

            ReadRegistry(regPath, xRegRoot);

            string xmlStringReg = xRegRoot.ToString();

            XmlDocument docR = new XmlDocument();
            docR.LoadXml(xmlStringReg);

            docR.Save(AppDomain.CurrentDomain.BaseDirectory + "\\_RegistryList.xml");
        }
        catch (System.Exception ex)
        {
            Console.WriteLine(ex.ToString());
            LogToFile(ex.ToString());
        }

    private static void ReadRegistry(string keyPath, XElement xRegRoot)
    {
        string[] subKeys=null;
        RegistryKey HKLM = Registry.LocalMachine;
        RegistryKey RegKey = HKLM.OpenSubKey(keyPath);

        try
        {
            subKeys = RegKey.GetSubKeyNames();
            foreach (string subKey in subKeys)
            {
                string fullPath = keyPath + "\\" + subKey;                    
                Console.WriteLine("\r\nKey Name  | " + fullPath);
                LogToFile("Key Name  | " + fullPath);

                XElement xregkey = new XElement("RegKeyName", new XAttribute("FullName", fullPath), new XAttribute("Name", subKey));
                xRegRoot.Add(xregkey);
                ReadRegistry(fullPath, xRegRoot);
            }

            string[] subVals = RegKey.GetValueNames();
            foreach (string val in subVals)
            {
                string keyName = val;
                string keyType = RegKey.GetValueKind(val).ToString();
                string keyValue = RegKey.GetValue(val).ToString();

                Console.WriteLine("Key Value | " + keyType + " | " + keyName + " | " + keyValue);
                LogToFile("Key " + keyType + " | " + keyName + " | " + keyValue);
                XElement xregvalue = new XElement("RegKeyValue", new XAttribute("keyType", keyType), new XAttribute("keyName", keyName), new XAttribute("keyValue", keyValue));
                xRegRoot.Add(xregvalue);
            }
        }
        catch (System.Exception ex)
        {
            Console.WriteLine(ex.ToString());
            LogToFile(ex.ToString());
        }
    }

Thanks in advance.

I did some experiments:

  • new XElement("foo\0bar") throws on construction.
  • new XAttribute("foo\0bar", "baz") throws on construction.
  • new XText("foo\0bar") throws only when calling .ToString().

new XAttribute("foo", "bar\0baz") is equivalent to new XAttribute("foo", new XText("bar\0baz")), so it won't throw on construction.
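
The experiments above can be reproduced with a small sketch along these lines. The class and method names (NullCharDemo, Throws, Run) are made up for illustration; only XElement and XAttribute come from System.Xml.Linq. It shows that names are checked in the constructor, while an attribute value containing a null character is only rejected when the tree is serialized, which matches the exception raised by xRegRoot.ToString() in the question.

```csharp
using System;
using System.Xml.Linq;

static class NullCharDemo
{
    // Hypothetical helper: runs an action and reports whether it threw.
    public static bool Throws(Action action)
    {
        try { action(); return false; }
        catch (Exception) { return true; }
    }

    public static void Run()
    {
        // Element and attribute names are validated in the constructor.
        Console.WriteLine(Throws(() => new XElement("foo\0bar")));
        Console.WriteLine(Throws(() => new XAttribute("foo\0bar", "baz")));

        // Attribute values are not validated until the node is serialized.
        XAttribute attr = null;
        Console.WriteLine(Throws(() => attr = new XAttribute("foo", "bar\0baz")));
        Console.WriteLine(Throws(() => new XElement("Root", attr).ToString()));
    }
}
```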

I did not manage to make any of the registry methods return a string with null characters, but you should be able to find out for yourself where it is returned.
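
One way to track down the offending value is to check each string before it goes into the XML tree. The following is only a sketch: XmlCharDiagnostics and IndexOfIllegalXmlChar are hypothetical names, and the ranges are the XML 1.0 character ranges, with a valid surrogate pair treated as one legal character.

```csharp
using System;

static class XmlCharDiagnostics
{
    // Hypothetical helper: returns the index of the first UTF-16 unit that is
    // not a legal XML 1.0 character (including lone surrogates), or -1 if the
    // string is null or entirely legal.
    public static int IndexOfIllegalXmlChar(string s)
    {
        if (s == null) return -1;

        for (int i = 0; i < s.Length; i++)
        {
            char c = s[i];

            // A high surrogate followed by a low surrogate encodes a code
            // point >= 0x10000, which XML 1.0 allows; skip both units.
            if (char.IsHighSurrogate(c) && i + 1 < s.Length && char.IsLowSurrogate(s[i + 1]))
            {
                i++;
                continue;
            }

            bool legal = c == 0x9 || c == 0xA || c == 0xD ||
                         (c >= 0x20 && c <= 0xD7FF) ||
                         (c >= 0xE000 && c <= 0xFFFD);
            if (!legal) return i;
        }

        return -1;
    }
}
```

In ReadRegistry, calling XmlCharDiagnostics.IndexOfIllegalXmlChar on fullPath, keyName and keyValue and logging any result other than -1 together with the key path should point at the registry entry that triggers the exception.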

You can read more about it here: http://seattlesoftware.wordpress.com/2008/09/11/hexadecimal-value-0-is-an-invalid-character/

And more about it here: XElement & UTF-8 Issue

A list of valid XML characters is here: http://en.wikipedia.org/wiki/Valid_characters_in_XML

But essentially you can fix it by removing illegal characters before serializing:

/// <summary>
/// Remove illegal XML characters from a string.
/// </summary>
public string SanitizeXmlString(string xml)
{
    if (string.IsNullOrEmpty(xml))
    {
        return xml;
    }

    StringBuilder buffer = new StringBuilder(xml.Length);

    foreach (char c in xml)
    {
        if (IsLegalXmlChar(c))
        {
            buffer.Append(c);
        }
    }

    return buffer.ToString();
}

/// <summary>
/// Whether a given character is allowed by XML 1.0.
/// </summary>
public bool IsLegalXmlChar(int character)
{
    return
    (
         character == 0x9 /* == '\t' == 9   */          ||
         character == 0xA /* == '\n' == 10  */          ||
         character == 0xD /* == '\r' == 13  */          ||
        (character >= 0x20    && character <= 0xD7FF  ) ||
        (character >= 0xE000  && character <= 0xFFFD  ) ||
        (character >= 0x10000 && character <= 0x10FFFF)
    );
}

Here are a couple of small improvements: the version below a) compiles as written, and b) handles surrogate pairs:

    /// <summary>
    /// Remove illegal XML characters from a string.
    /// </summary>
    public static string SanitizeString(string s)
    {
        if (string.IsNullOrEmpty(s))
        {
            return s;
        }

        StringBuilder buffer = new StringBuilder(s.Length);

        for (int i = 0; i < s.Length; i++)
        {
            int code;
            try
            {
                code = Char.ConvertToUtf32(s, i);
            }
            catch (ArgumentException)
            {
                continue;
            }
            if (IsLegalXmlChar(code))
                buffer.Append(Char.ConvertFromUtf32(code));
            if (Char.IsSurrogatePair(s, i))
                i++;
        }

        return buffer.ToString();
    }

    /// <summary>
    /// Whether a given character is allowed by XML 1.0.
    /// </summary>
    private static bool IsLegalXmlChar(int codePoint)
    {
        // No upper bound (0x10FFFF) is checked on the last range: Char.ConvertToUtf32
        // either returns a valid code point or throws, so it can never exceed 0x10FFFF.
        return (codePoint == 0x9 ||
            codePoint == 0xA ||
            codePoint == 0xD ||
            (codePoint >= 0x20 && codePoint <= 0xD7FF) ||
            (codePoint >= 0xE000 && codePoint <= 0xFFFD) ||
            codePoint >= 0x10000
        );
    }
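
To apply this to the code in the question, every string taken from the registry would be sanitized before it is stored in an attribute. The sketch below is an assumption about how that could look, not the original poster's code: RegistryXml and BuildValueElement are hypothetical names, and the sanitizer is passed in as a delegate so that SanitizeString from above (or any equivalent) can be plugged in.

```csharp
using System;
using System.Xml.Linq;

static class RegistryXml
{
    // Hypothetical helper: builds the RegKeyValue element from the question,
    // passing every string through the supplied sanitizer first.
    public static XElement BuildValueElement(
        string keyType, string keyName, string keyValue,
        Func<string, string> sanitize)
    {
        // XAttribute rejects a null value, and the sanitizer returns null
        // for null input, so fall back to an empty string.
        return new XElement("RegKeyValue",
            new XAttribute("keyType", sanitize(keyType) ?? ""),
            new XAttribute("keyName", sanitize(keyName) ?? ""),
            new XAttribute("keyValue", sanitize(keyValue) ?? ""));
    }
}
```

Inside ReadRegistry this would be called as RegistryXml.BuildValueElement(keyType, keyName, keyValue, SanitizeString). The FullName and Name attributes of RegKeyName come from the registry as well, so they would need the same treatment.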
