How do I take an XML file with nested elements and get a set of C# classes from it?

First off, I'm not terribly experienced in XML. I know the very basics of reading and writing it, but for the most part, things like schemas make my eyes cross really quickly. If it looks like I'm making incorrect assumptions about how XML works, there's a good chance that I am.

That disclaimer aside, this is a problem I've run into several times without finding an agreeable solution. I have an XML file which defines data, including nested entries (for example, a file might have a "Power" element with a child node "AlternatePowers", which in turn contains "Power" elements). Ideally, I would like to be able to quickly generate a set of classes from this XML file to store the data I'm reading in. The general solution I've seen is to use Microsoft's XSD.exe tool to generate an XSD file from the XML file and then use the same tool to convert the schema into classes. The catch is, the tool chokes if there are nested elements. Example:

- A column named 'Power' already belongs to this DataTable: cannot set 
a nested table name to the same name.

Is there a nice simple way to do this? I did a couple of searches for similar questions here, but the only questions I found about generating schemas with same-named nested elements were unanswered.

Alternatively, it's also possible that I am completely misunderstanding how XML and XSD work and it's not possible to have such nesting...

Update

As an example, one of the things I'd like to parse is the XML output of a particular character builder program. Fair warning, this is a bit wordy despite me removing everything but the powers section.

<?xml version="1.0" encoding="ISO-8859-1"?>
<document>
  <product name="Hero Lab" url="http://www.wolflair.com" versionmajor="3" versionminor="7" versionpatch=" " versionbuild="256">Hero Lab® and the Hero Lab logo are Registered Trademarks of LWD Technology, Inc. Free download at http://www.wolflair.com
    Mutants &amp; Masterminds, Second Edition is ©2005-2011 Green Ronin Publishing, LLC. All rights reserved.</product>
  <hero active="yes" name="Pretty Deadly" playername="">
    <size name="Medium"/>
    <powers>
      <power name="Enhanced Trait 16" info="" ranks="16" cost="16" range="" displaylevel="0" summary="Traits: Constitution +6 (18, +4), Dexterity +8 (20, +5), Charisma +2 (12, +1)" active="yes">
        <powerdesc>You have an enhancement to a non-effect trait, such as an ability (including saving throws) or skill (including attack or defense bonus). Since Toughness save cannot be increased on its own,use the Protection effect instead of Enhanced Toughness (see Protection later in this chapter).</powerdesc>
        <descriptors/>
        <elements/>
        <options/>
        <traitmods>
          <traitmod name="Constitution" bonus="+6"/>
          <traitmod name="Dexterity" bonus="+8"/>
          <traitmod name="Charisma" bonus="+2"/>
        </traitmods>
        <flaws/>
        <powerfeats/>
        <powerdrawbacks/>
        <usernotes/>
        <alternatepowers/>
        <chainedpowers/>
        <otherpowers/>
      </power>
      <power name="Sailor Suit (Device 2)" info="" ranks="2" cost="8" range="" displaylevel="0" summary="Hard to lose" active="yes">
        <powerdesc>A device that has one or more powers and can be equipped and un-equipped.</powerdesc>
        <descriptors/>
        <elements/>
        <options/>
        <traitmods/>
        <flaws/>
        <powerfeats/>
        <powerdrawbacks/>
        <usernotes/>
        <alternatepowers/>
        <chainedpowers/>
        <otherpowers>
          <power name="Protection 6" info="+6 Toughness" ranks="6" cost="10" range="" displaylevel="1" summary="+6 Toughness; Impervious [4 ranks only]" active="yes">
            <powerdesc>You're particularly resistant to harm. You gain a bonus on your Toughness saving throws equal to your Protection rank.</powerdesc>
            <descriptors/>
            <elements/>
            <options/>
            <traitmods/>
            <extras>
              <extra name="Impervious" info="" partialranks="2">Your Protection stops some damage completely. If an attack has a damage bonus less than your Protection rank, it inflicts no damage (you automatically succeed on your Toughness saving throw). Penetrating damage (see page 112) ignores this modifier; you must save against it normally.</extra>
            </extras>
            <flaws/>
            <powerfeats/>
            <powerdrawbacks/>
            <usernotes/>
            <alternatepowers/>
            <chainedpowers/>
            <otherpowers/>
          </power>
        </otherpowers>
      </power>
    </powers>
  </hero>
</document>

Yes, there are a number of unnecessary tags in there, but it's an example of the kind of XML that I'd like to be able to plug in and get something reasonable out of. This XML, when sent into XSD, generates the following error:

- A column named 'traitmods' already belongs to this DataTable: cannot set
a nested table name to the same name.

I just finished helping someone with that. Try reading this thread here: https://stackoverflow.com/a/8840309/353147

Taking from your example and my link, you'd have classes like this.

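using System.Linq;
using System.Xml.Linq;

// Thin wrappers around LINQ to XML elements. Element names are case-sensitive:
// the names below follow the question's prose ("Power", "AlternatePowers");
// the posted sample file uses lowercase ("power", "alternatepowers").
public class Power
{
    XElement self;

    public Power(XElement power) { self = power; }

    // Element() returns null if the child element is missing.
    public AlternatePowers AlternatePowers
    { get { return new AlternatePowers(self.Element("AlternatePowers")); } }
}

public class AlternatePowers
{
    XElement self;

    public AlternatePowers(XElement power) { self = power; }

    // One wrapper per nested Power element.
    public Power2[] Powers
    { 
        get 
        { 
            return self.Elements("Power").Select(e => new Power2(e)).ToArray();
        }
    }
}

public class Power2
{
    XElement self;

    public Power2(XElement power) { self = power; }
}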

Without knowing the rest of your XML, I cannot write the properties that make up each class/node level, but you should get the gist from here and from the link.

You'd then reference it like this:

Power power = new Power(XElement.Load("file"));
foreach(Power2 power2 in power.AlternatePowers.Powers)
{
    ...
}
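Since the nesting in the posted sample is recursive (a power can hold more powers), the same wrapper idea can also be written with a single class instead of `Power` and `Power2`. The following is only a sketch under assumptions: `PowerNode` and `SelfAndNested` are hypothetical names, and the lowercase element names (`power`, `otherpowers`, `alternatepowers`) are taken from the sample file in the question.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical single wrapper class; the name PowerNode is not from the thread.
public class PowerNode
{
    private readonly XElement self;

    public PowerNode(XElement power) { self = power; }

    // The explicit cast returns null when the attribute is missing.
    public string Name
    {
        get { return (string)self.Attribute("name"); }
    }

    public IEnumerable<PowerNode> OtherPowers
    {
        get { return Children("otherpowers"); }
    }

    public IEnumerable<PowerNode> AlternatePowers
    {
        get { return Children("alternatepowers"); }
    }

    // Wraps every <power> directly inside the given container element,
    // or yields nothing when the container is absent.
    private IEnumerable<PowerNode> Children(string container)
    {
        XElement c = self.Element(container);
        if (c == null)
        {
            return Enumerable.Empty<PowerNode>();
        }
        return c.Elements("power").Select(e => new PowerNode(e));
    }

    // Depth-first walk: this power, then everything nested below it.
    public IEnumerable<PowerNode> SelfAndNested()
    {
        yield return this;
        foreach (PowerNode child in OtherPowers.Concat(AlternatePowers))
        {
            foreach (PowerNode nested in child.SelfAndNested())
            {
                yield return nested;
            }
        }
    }
}
```

Because the class refers to itself, it copes with any nesting depth without needing a new class per level.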

Your error message implies that you are trying to generate a DataSet from the schema (/d switch), as opposed to a set of arbitrary classes decorated with XML Serializer attributes (/c switch).

I've not tried generating a DataSet like that myself, but I can see how it might fail. A DataSet is a collection of DataTables, which in turn contain a collection of DataRows. That's a fixed 3-level hierarchy. If your XML schema is more or less than 3 levels deep, then it won't fit into the required structure. Try creating a test DataSet in the designer and examine the generated .xsd file; that will show you what kind of schema structure will fit.

I can assure you from personal experience that if you convert the schema to a set of arbitrary classes instead, it will handle pretty much any schema structure that you care to throw at it.
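To give a feel for what classes decorated with XML Serializer attributes look like, here is a hand-written sketch for a small part of the sample file. It is not actual xsd.exe output, and the class and property names (`HeroDocument`, `Hero`, `Power`, `HeroLoader`) are assumptions made for illustration. The point is that a `Power` class may contain a list of `Power`, which the fixed DataSet hierarchy cannot express.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hand-written illustration, not generated by xsd.exe; all type names are hypothetical.
[XmlRoot("document")]
public class HeroDocument
{
    [XmlElement("hero")]
    public List<Hero> Heroes { get; set; } = new List<Hero>();
}

public class Hero
{
    [XmlAttribute("name")]
    public string Name { get; set; }

    // <powers> is a wrapper element holding repeated <power> items.
    [XmlArray("powers")]
    [XmlArrayItem("power")]
    public List<Power> Powers { get; set; } = new List<Power>();
}

public class Power
{
    [XmlAttribute("name")]
    public string Name { get; set; }

    // Kept as a string so that empty attribute values do not fail to parse.
    [XmlAttribute("ranks")]
    public string Ranks { get; set; }

    [XmlElement("powerdesc")]
    public string Description { get; set; }

    // Recursive nesting: a power can contain further powers.
    [XmlArray("otherpowers")]
    [XmlArrayItem("power")]
    public List<Power> OtherPowers { get; set; } = new List<Power>();

    [XmlArray("alternatepowers")]
    [XmlArrayItem("power")]
    public List<Power> AlternatePowers { get; set; } = new List<Power>();
}

public static class HeroLoader
{
    // Deserializes an XML string; elements and attributes that are not
    // mapped above are skipped by XmlSerializer by default.
    public static HeroDocument Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(HeroDocument));
        using (var reader = new StringReader(xml))
        {
            return (HeroDocument)serializer.Deserialize(reader);
        }
    }
}
```

Only the members you care about need to be declared, so the many empty tags in the sample can simply be left unmapped.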

So, it's not pretty, but the following is what I wound up with as a solution. I run processElement on the base node and then I go through extantElements and export the class code.

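using System;
using System.Collections;
using System.Collections.Generic;
using System.Text;
using System.Xml;

namespace XMLToClasses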
{
    public class Element
    {
        public string Name;
        public HashSet<string> attributes;
        public HashSet<string> children;

        public bool hasText;

        public Element()
        {
            Name = "";

            attributes = new HashSet<string>();
            children = new HashSet<string>();

            hasText = false;
        }

        public string getSource()
        {
            StringBuilder sourceSB = new StringBuilder();

            sourceSB.AppendLine("[Serializable()]");
            sourceSB.AppendLine("public class cls_" + Name);
            sourceSB.AppendLine("{");

            sourceSB.AppendLine("\t// Attributes" );

            if (hasText)
            {
                sourceSB.AppendLine("\tstring InnerText;");
            }

            foreach(string attribute in attributes)
            {
                sourceSB.AppendLine("\tpublic string atr_" + attribute + ";");
            }
            sourceSB.AppendLine("");
            sourceSB.AppendLine("\t// Children");
            foreach (string child in children)
            {
                sourceSB.AppendLine("\tpublic List<cls_" + child + "> list" + child + ";");
            }

            sourceSB.AppendLine("");
            sourceSB.AppendLine("\t// Constructor");
            sourceSB.AppendLine("\tpublic cls_" + Name + "()");
            sourceSB.AppendLine("\t{");
            foreach (string child in children)
            {
                sourceSB.AppendLine("\t\tlist" + child + " = new List<cls_" + child + ">()" + ";");
            }
            sourceSB.AppendLine("\t}");

            sourceSB.AppendLine("");
            sourceSB.AppendLine("\tpublic cls_" + Name + "(XmlNode xmlNode) : this ()");
            sourceSB.AppendLine("\t{");

            if (hasText)
            {
                sourceSB.AppendLine("\t\t\tInnerText = xmlNode.InnerText;");
                sourceSB.AppendLine("");
            }            

            foreach (string attribute in attributes)
            {
                sourceSB.AppendLine("\t\tif (xmlNode.Attributes[\"" + attribute + "\"] != null)");
                sourceSB.AppendLine("\t\t{");
                sourceSB.AppendLine("\t\t\tatr_" + attribute + " = xmlNode.Attributes[\"" + attribute + "\"].Value;");
                sourceSB.AppendLine("\t\t}");
            }

            sourceSB.AppendLine("");

            foreach (string child in children)
            {
                sourceSB.AppendLine("\t\tforeach (XmlNode childNode in xmlNode.SelectNodes(\"./" + child + "\"))");
                sourceSB.AppendLine("\t\t{");
                sourceSB.AppendLine("\t\t\tlist" + child + ".Add(new cls_" + child + "(childNode));");
                sourceSB.AppendLine("\t\t}");
            }

            sourceSB.AppendLine("\t}");

            sourceSB.Append("}");

            return sourceSB.ToString();
        }
    }

    public class XMLToClasses
    {
        public Hashtable extantElements;

        public XMLToClasses()
        {
            extantElements = new Hashtable();
        }

        public Element processElement(XmlNode xmlNode)
        {
            Element element;

            if (extantElements.Contains(xmlNode.Name))
            {
                element = (Element)extantElements[xmlNode.Name];
            }
            else
            {
                element = new Element();
                element.Name = xmlNode.Name;

                extantElements.Add(element.Name, element);
            }            

            if (xmlNode.Attributes != null)
            {
                foreach (XmlAttribute attribute in xmlNode.Attributes)
                {
                    if (!element.attributes.Contains(attribute.Name))
                    {
                        element.attributes.Add(attribute.Name);
                    }
                }
            }


            if (xmlNode.ChildNodes != null)
            {
                foreach (XmlNode node in xmlNode.ChildNodes)
                {
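                    // Text and CDATA mark the element as having inner text; only real
                    // elements are recursed into, so comments and the like are skipped.
                    if (node.NodeType == XmlNodeType.Text || node.NodeType == XmlNodeType.CDATA)
                    {
                        element.hasText = true;
                    }
                    else if (node.NodeType == XmlNodeType.Element)
                    {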
                        Element childNode = processElement(node);

                        if (!element.children.Contains(childNode.Name))
                        {
                            element.children.Add(childNode.Name);
                        }
                    }
                }
            }

            return element;
        }
    }
}

I'm sure there are ways to make this look prettier or work better, but it's sufficient for me.

Edit: Added ugly but functional deserialization code that takes an XmlNode containing the object and decodes it.

Later Thoughts: Two years later, I had an opportunity to re-use this code. Not only have I not kept it up to date here (I'd made changes to better normalize the names of the items), but I think the commenters who said I was going about this the wrong way were right. I still think this could be a handy way of generating template classes for an XML file where a given type of element can show up at different depths, but it's inflexible (you have to rerun the code and re-extract the classes every time) and doesn't handle versioning changes nicely. Between when I first created this code to let me quickly write a character file converter and now, the format changed, so I had people complaining that it had stopped working. In retrospect, it would have made more sense to search for the correct elements using XPaths and then pull the data from there.

Still, it was a valuable experience, and I suspect I'm probably going to come back to this code from time to time for quickly roughing out XML data, at least until I find something better.
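For what it's worth, the XPath idea mentioned above might look roughly like the sketch below. It assumes the lowercase `power` element and `name` attribute from the sample file; `PowerFinder` and `PowerNames` are made-up names, and this is just one possible shape rather than the code I actually ended up with.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

// Hypothetical helper; the names here are illustrative only.
public static class PowerFinder
{
    // Returns "name (depth N)" for every <power> element, wherever it appears.
    // Depth 0 is a top-level power; each enclosing <power> adds one.
    public static List<string> PowerNames(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        // "//power" matches power elements at any depth, in document order.
        return doc.XPathSelectElements("//power")
                  .Select(p => (string)p.Attribute("name")
                               + " (depth " + p.Ancestors("power").Count() + ")")
                  .ToList();
    }
}
```

Because the query only names the elements it needs, extra or rearranged tags elsewhere in the file do not require regenerating any classes.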
