
How to remove #text from an XML file using C#

I am trying to read an XML file, and while reading it I get #text as a node name. I have removed all whitespace, but #text still keeps appearing. What is the solution?

This is my original XML file:

<book genre='novel' ISBN='1-861003-78' misc='sale-item'>
  <title>The Handmaid's Tale</title>
  <price>14.95</price>
</book>

This is my new XML file after removing the whitespace:

<!--sample XML fragment--><book genre='novel' ISBN='1-861003-78' misc='sale-item'><title>The Handmaid's Tale</title><price>14.95</price></book>

I am trying to validate the two XML files against each other, and this is the code:

static bool structValidate(XmlNodeList xmlOldNode, XmlNodeList xmlNewNode)
{
    if (xmlOldNode.Count != xmlNewNode.Count) return true;

    for (var i = 0; i < xmlOldNode.Count; i++)
    {
        var nodeA = xmlOldNode[i];
        var nodeB = xmlNewNode[i];
        Console.WriteLine("\n" + nodeA.Name + ":");
        Console.WriteLine("\n" + nodeB.Name + ":");
        Console.ReadLine();

        if (nodeA.Attributes == null)
        {
            if (nodeB.Attributes != null)
                return true;
            else
                continue;
        }

        if (nodeA.Attributes.Count != nodeB.Attributes.Count
            || nodeA.Name != nodeB.Name) return true;

        for (var j = 0; j < nodeA.Attributes.Count; j++)
        {
            var attrA = nodeA.Attributes[j];
            var attrB = nodeB.Attributes[j];
            Console.WriteLine(attrA.Name);
            Console.WriteLine(attrB.Name);
            Console.ReadLine();
            if (attrA.Name != attrB.Name) return true;
        }

        if (nodeA.HasChildNodes && nodeB.HasChildNodes)
        {
            return structValidate(nodeA.ChildNodes, nodeB.ChildNodes);
        }
        else
        {
            return false;
        }
    }
    return false;
}

So while printing the node names I am getting #text.

The #text nodes are the whitespace returned by the parser for your old XML file: the indentation before the <title> and <price> nodes.

The fault is in the way you load the old XML file: it parses the whitespace as XML nodes.
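One way to keep such nodes out of the comparison, whatever the loader produced, is to look only at element nodes. The following is a minimal sketch, not the asker's method: the class name and the helpers ElementChildren, SameStructure and Load are hypothetical names introduced here, and it assumes the files are loaded with XmlDocument. Unlike structValidate above, SameStructure returns true when the two structures match.

```csharp
using System;
using System.Linq;
using System.Xml;

static class ElementOnlyCompare
{
    // Hypothetical helper: returns only the element children of a node,
    // skipping text, whitespace and comment nodes.
    public static XmlNode[] ElementChildren(XmlNode parent)
    {
        return parent.ChildNodes
            .Cast<XmlNode>()
            .Where(n => n.NodeType == XmlNodeType.Element)
            .ToArray();
    }

    // Hypothetical helper: true when both elements have the same name, the same
    // attribute names in the same order, and matching element children.
    public static bool SameStructure(XmlNode a, XmlNode b)
    {
        if (a.Name != b.Name) return false;
        if (a.Attributes.Count != b.Attributes.Count) return false;
        for (var i = 0; i < a.Attributes.Count; i++)
        {
            if (a.Attributes[i].Name != b.Attributes[i].Name) return false;
        }

        var kidsA = ElementChildren(a);
        var kidsB = ElementChildren(b);
        if (kidsA.Length != kidsB.Length) return false;
        for (var i = 0; i < kidsA.Length; i++)
        {
            if (!SameStructure(kidsA[i], kidsB[i])) return false;
        }
        return true;
    }

    // Hypothetical helper: parses an XML string and returns its root element.
    public static XmlElement Load(string xml, bool preserveWhitespace)
    {
        var doc = new XmlDocument();
        doc.PreserveWhitespace = preserveWhitespace;
        doc.LoadXml(xml);
        return doc.DocumentElement;
    }

    static void Main()
    {
        // Indented file, loaded so that the indentation is kept as nodes.
        var oldRoot = Load(
            "<book genre='novel' ISBN='1-861003-78' misc='sale-item'>\n" +
            "  <title>The Handmaid's Tale</title>\n" +
            "  <price>14.95</price>\n" +
            "</book>", true);

        // Compact file with a leading comment.
        var newRoot = Load(
            "<!--sample XML fragment--><book genre='novel' ISBN='1-861003-78' misc='sale-item'>" +
            "<title>The Handmaid's Tale</title><price>14.95</price></book>", false);

        Console.WriteLine(SameStructure(oldRoot, newRoot)); // True
    }
}
```

Because the filter is applied at every level, the text inside <title> and <price> (which XmlDocument also exposes as #text child nodes) is ignored as well, so only element names and attribute names take part in the comparison.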

With your way of parsing, these two XML files would be treated as the same file:

<book genre='novel' ISBN='1-861003-78' misc='sale-item'>
  <title>The Handmaid's Tale</title>
  <price>14.95</price>
</book>

<book genre='novel' ISBN='1-861003-78' misc='sale-item'>
someUnformatedText<title>The Handmaid's Tale</title>
someUnformatedText<price>14.95</price>
</book>

This is what the documentation for XmlNode.Name says:

The qualified name of the node. The name returned is dependent on the NodeType of the node:

Text -> #text
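To see how Name follows NodeType in practice, the short sketch below prints both for a few nodes of a small document. The class name and the Describe helper are hypothetical and exist only for this illustration; the XML string is a shortened variant of the sample above.

```csharp
using System;
using System.Xml;

static class NodeNameDemo
{
    // Hypothetical helper: formats a node as "NodeType -> Name".
    public static string Describe(XmlNode node)
    {
        return $"{node.NodeType} -> {node.Name}";
    }

    static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<book><!--note--><title>The Handmaid's Tale</title></book>");

        XmlNode book = doc.DocumentElement;
        XmlNode comment = book.ChildNodes[0];
        XmlNode title = book.ChildNodes[1];
        XmlNode text = title.FirstChild; // the character data inside <title>

        Console.WriteLine(Describe(book));    // Element -> book
        Console.WriteLine(Describe(comment)); // Comment -> #comment
        Console.WriteLine(Describe(title));   // Element -> title
        Console.WriteLine(Describe(text));    // Text -> #text
    }
}
```

Note that the content of <title> is itself a child node named #text, so code that recurses into ChildNodes and prints Name will show #text for every element that contains character data.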
