How do I take an XML file with nested elements and get a set of C# classes from it?

First off, I'm not terribly experienced in XML. I know the very basics of reading in and writing it, but for the most part, things like schemas start to make my eyes cross really quickly. If it looks like I'm making incorrect assumptions about how XML works, there's a good chance that I am.

That disclaimer aside, this is a problem I've run into several times without finding an agreeable solution. I have an XML file that defines data, including nested entries (for example, a file might have a "Power" element with an "AlternatePowers" child node, which in turn contains "Power" elements). Ideally, I would like to generate a quick set of classes from this XML file to store the data I'm reading in. The usual solution I've seen is to use Microsoft's xsd.exe tool to generate an XSD file from the XML file and then use the same tool to convert the schema into classes. The catch is that the tool chokes if there are nested elements. Example:

- A column named 'Power' already belongs to this DataTable: cannot set 
a nested table name to the same name.

Is there a nice simple way to do this? I did a couple of searches for similar questions here, but the only questions I found dealing with generating schemas with nested elements with the same name were unanswered.

Alternately, it's also possible that I am completely misunderstanding how XML and XSD work and it's not possible to have such nesting...

Update

As an example, one of the things I'd like to parse is the XML output of a particular character builder program. Fair warning, this is a bit wordy despite me removing anything but the powers section.

<?xml version="1.0" encoding="ISO-8859-1"?>
<document>
  <product name="Hero Lab" url="http://www.wolflair.com" versionmajor="3" versionminor="7" versionpatch=" " versionbuild="256">Hero Lab® and the Hero Lab logo are Registered Trademarks of LWD Technology, Inc. Free download at http://www.wolflair.com
    Mutants &amp; Masterminds, Second Edition is ©2005-2011 Green Ronin Publishing, LLC. All rights reserved.</product>
  <hero active="yes" name="Pretty Deadly" playername="">
    <size name="Medium"/>
    <powers>
      <power name="Enhanced Trait 16" info="" ranks="16" cost="16" range="" displaylevel="0" summary="Traits: Constitution +6 (18, +4), Dexterity +8 (20, +5), Charisma +2 (12, +1)" active="yes">
        <powerdesc>You have an enhancement to a non-effect trait, such as an ability (including saving throws) or skill (including attack or defense bonus). Since Toughness save cannot be increased on its own,use the Protection effect instead of Enhanced Toughness (see Protection later in this chapter).</powerdesc>
        <descriptors/>
        <elements/>
        <options/>
        <traitmods>
          <traitmod name="Constitution" bonus="+6"/>
          <traitmod name="Dexterity" bonus="+8"/>
          <traitmod name="Charisma" bonus="+2"/>
        </traitmods>
        <flaws/>
        <powerfeats/>
        <powerdrawbacks/>
        <usernotes/>
        <alternatepowers/>
        <chainedpowers/>
        <otherpowers/>
      </power>
      <power name="Sailor Suit (Device 2)" info="" ranks="2" cost="8" range="" displaylevel="0" summary="Hard to lose" active="yes">
        <powerdesc>A device that has one or more powers and can be equipped and un-equipped.</powerdesc>
        <descriptors/>
        <elements/>
        <options/>
        <traitmods/>
        <flaws/>
        <powerfeats/>
        <powerdrawbacks/>
        <usernotes/>
        <alternatepowers/>
        <chainedpowers/>
        <otherpowers>
          <power name="Protection 6" info="+6 Toughness" ranks="6" cost="10" range="" displaylevel="1" summary="+6 Toughness; Impervious [4 ranks only]" active="yes">
            <powerdesc>You're particularly resistant to harm. You gain a bonus on your Toughness saving throws equal to your Protection rank.</powerdesc>
            <descriptors/>
            <elements/>
            <options/>
            <traitmods/>
            <extras>
              <extra name="Impervious" info="" partialranks="2">Your Protection stops some damage completely. If an attack has a damage bonus less than your Protection rank, it inflicts no damage (you automatically succeed on your Toughness saving throw). Penetrating damage (see page 112) ignores this modifier; you must save against it normally.</extra>
            </extras>
            <flaws/>
            <powerfeats/>
            <powerdrawbacks/>
            <usernotes/>
            <alternatepowers/>
            <chainedpowers/>
            <otherpowers/>
          </power>
        </otherpowers>
      </power>
    </powers>
  </hero>
</document>

Yes, there are a number of unnecessary tags in there, but it's an example of the kind of XML that I'd like to be able to plug in and get something reasonable out of. This XML, when passed to xsd.exe, generates the following error:

- A column named 'traitmods' already belongs to this DataTable: cannot set
a nested table name to the same name.

I just finished helping someone with that. Try reading this thread here: https://stackoverflow.com/a/8840309/353147

Taking from your example and my link, you'd have classes like this.

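using System.Linq;
using System.Xml.Linq;

// Thin wrappers over XElement. Element names are case-sensitive, so they
// use the lowercase names that appear in the posted XML.
public class Power
{
    XElement self;

    public Power(XElement power) { self = power; }

    // Wraps the <alternatepowers> child of this <power>.
    public AlternatePowers AlternatePowers
    { get { return new AlternatePowers(self.Element("alternatepowers")); } }
}

public class AlternatePowers
{
    XElement self;

    public AlternatePowers(XElement power) { self = power; }

    // One wrapper per nested <power> element.
    public Power2[] Powers
    { 
        get 
        { 
            return self.Elements("power").Select(e => new Power2(e)).ToArray();
        }
    }
}

public class Power2
{
    XElement self;

    public Power2(XElement power) { self = power; }
}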

Without knowing the rest of your xml, I cannot make the properties that make up each class/node level, but you should get the gist from here and from the link.

You'd then reference it like this:

Power power = new Power(XElement.Load("file"));
foreach(Power2 power2 in power.AlternatePowers.Powers)
{
    ...
}
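
Because the posted XML nests power elements inside otherpowers to any depth, a single recursive wrapper may be enough instead of separate Power and Power2 types. The following is only a sketch of that idea: the PowerNode and WrapperDemo names and the inline sample XML are made up for illustration, and only the name and ranks attributes are mapped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical wrapper: one class covers <power> at any nesting depth.
public class PowerNode
{
    private readonly XElement self;

    public PowerNode(XElement power) { self = power; }

    // The explicit string conversion returns null when the attribute is absent.
    public string Name { get { return (string)self.Attribute("name"); } }
    public string Ranks { get { return (string)self.Attribute("ranks"); } }

    // <power> children of <otherpowers>; an empty sequence when the
    // container is missing or has no children.
    public IEnumerable<PowerNode> OtherPowers
    {
        get
        {
            return self.Elements("otherpowers")
                       .Elements("power")
                       .Select(e => new PowerNode(e));
        }
    }
}

public static class WrapperDemo
{
    public static void Main()
    {
        // Trimmed-down stand-in for the <powers> section of the posted file.
        XElement root = XElement.Parse(
            "<powers>" +
            "<power name='Sailor Suit (Device 2)' ranks='2'><otherpowers>" +
            "<power name='Protection 6' ranks='6'><otherpowers/></power>" +
            "</otherpowers></power>" +
            "</powers>");

        foreach (PowerNode power in root.Elements("power").Select(e => new PowerNode(e)))
        {
            Console.WriteLine(power.Name);
            foreach (PowerNode inner in power.OtherPowers)
            {
                Console.WriteLine("  " + inner.Name + " (ranks " + inner.Ranks + ")");
            }
        }
    }
}
```

The same pattern would extend to alternatepowers and chainedpowers by adding one property per container element.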

Your error message implies that you are trying to generate a DataSet from the schema (/d switch), as opposed to a set of arbitrary classes decorated with XML Serializer attributes (/c switch).

I've not tried generating a DataSet like that myself, but I can see how it might fail. A DataSet is a collection of DataTables, which in turn contain a collection of DataRows. That's a fixed 3-level hierarchy. If your XML schema is more or less than 3 levels deep, then it won't fit into the required structure. Try creating a test DataSet in the designer and examine the generated .xsd file; that will show you what kind of schema structure will fit.

I can assure you from personal experience, if you convert the schema to a set of arbitrary classes instead, then it will handle pretty much any schema structure that you care to throw at it.
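
To give a rough idea of what "classes decorated with XML Serializer attributes" look like for a self-nesting element, here is a hand-written sketch. It is not the output of xsd.exe; the PowerData and SerializerDemo names are invented for illustration, and only a few of the attributes and child elements from the posted XML are mapped.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hand-written stand-in for a serializer class; names are illustrative only.
[XmlRoot("power")]
public class PowerData
{
    [XmlAttribute("name")]
    public string Name { get; set; }

    [XmlAttribute("ranks")]
    public string Ranks { get; set; }

    [XmlElement("powerdesc")]
    public string Description { get; set; }

    // <otherpowers> wraps zero or more nested <power> elements of this same type,
    // so the class simply refers to itself.
    [XmlArray("otherpowers")]
    [XmlArrayItem("power")]
    public List<PowerData> OtherPowers { get; set; }
}

public static class SerializerDemo
{
    // Deserializes a single <power> element from an XML string.
    public static PowerData Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(PowerData));
        using (var reader = new StringReader(xml))
        {
            return (PowerData)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        PowerData device = Load(
            "<power name='Sailor Suit (Device 2)' ranks='2'>" +
            "<powerdesc>A device.</powerdesc>" +
            "<otherpowers>" +
            "<power name='Protection 6' ranks='6'><otherpowers/></power>" +
            "</otherpowers>" +
            "</power>");

        Console.WriteLine(device.Name + " contains " + device.OtherPowers.Count + " nested power(s)");
    }
}
```

A class that refers to itself causes no trouble here, which is the case the DataSet generator rejects.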

So, it's not pretty, but the following is what I wound up with as a solution. I run processElement on the base node and then I go through extantElements and export the class code.

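using System;
using System.Collections;
using System.Collections.Generic;
using System.Text;
using System.Xml;

namespace XMLToClasses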
{
    public class Element
    {
        public string Name;
        public HashSet<string> attributes;
        public HashSet<string> children;

        public bool hasText;

        public Element()
        {
            Name = "";

            attributes = new HashSet<string>();
            children = new HashSet<string>();

            hasText = false;
        }

        public string getSource()
        {
            StringBuilder sourceSB = new StringBuilder();

            sourceSB.AppendLine("[Serializable()]");
            sourceSB.AppendLine("public class cls_" + Name);
            sourceSB.AppendLine("{");

            sourceSB.AppendLine("\t// Attributes" );

            if (hasText)
            {
                sourceSB.AppendLine("\tpublic string InnerText;");
            }

            foreach(string attribute in attributes)
            {
                sourceSB.AppendLine("\tpublic string atr_" + attribute + ";");
            }
            sourceSB.AppendLine("");
            sourceSB.AppendLine("\t// Children");
            foreach (string child in children)
            {
                sourceSB.AppendLine("\tpublic List<cls_" + child + "> list" + child + ";");
            }

            sourceSB.AppendLine("");
            sourceSB.AppendLine("\t// Constructor");
            sourceSB.AppendLine("\tpublic cls_" + Name + "()");
            sourceSB.AppendLine("\t{");
            foreach (string child in children)
            {
                sourceSB.AppendLine("\t\tlist" + child + " = new List<cls_" + child + ">()" + ";");
            }
            sourceSB.AppendLine("\t}");

            sourceSB.AppendLine("");
            sourceSB.AppendLine("\tpublic cls_" + Name + "(XmlNode xmlNode) : this ()");
            sourceSB.AppendLine("\t{");

            if (hasText)
            {
                sourceSB.AppendLine("\t\tInnerText = xmlNode.InnerText;");
                sourceSB.AppendLine("");
            }            

            foreach (string attribute in attributes)
            {
                sourceSB.AppendLine("\t\tif (xmlNode.Attributes[\"" + attribute + "\"] != null)");
                sourceSB.AppendLine("\t\t{");
                sourceSB.AppendLine("\t\t\tatr_" + attribute + " = xmlNode.Attributes[\"" + attribute + "\"].Value;");
                sourceSB.AppendLine("\t\t}");
            }

            sourceSB.AppendLine("");

            foreach (string child in children)
            {
                sourceSB.AppendLine("\t\tforeach (XmlNode childNode in xmlNode.SelectNodes(\"./" + child + "\"))");
                sourceSB.AppendLine("\t\t{");
                sourceSB.AppendLine("\t\t\tlist" + child + ".Add(new cls_" + child + "(childNode));");
                sourceSB.AppendLine("\t\t}");
            }

            sourceSB.AppendLine("\t}");

            sourceSB.Append("}");

            return sourceSB.ToString();
        }
    }

    public class XMLToClasses
    {
        public Hashtable extantElements;

        public XMLToClasses()
        {
            extantElements = new Hashtable();
        }

        public Element processElement(XmlNode xmlNode)
        {
            Element element;

            if (extantElements.Contains(xmlNode.Name))
            {
                element = (Element)extantElements[xmlNode.Name];
            }
            else
            {
                element = new Element();
                element.Name = xmlNode.Name;

                extantElements.Add(element.Name, element);
            }            

            if (xmlNode.Attributes != null)
            {
                foreach (XmlAttribute attribute in xmlNode.Attributes)
                {
                    if (!element.attributes.Contains(attribute.Name))
                    {
                        element.attributes.Add(attribute.Name);
                    }
                }
            }


            if (xmlNode.ChildNodes != null)
            {
                foreach (XmlNode node in xmlNode.ChildNodes)
                {
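                    if (node.NodeType == XmlNodeType.Text || node.NodeType == XmlNodeType.CDATA)
                    {
                        element.hasText = true;
                    }
                    else if (node.NodeType == XmlNodeType.Element)
                    {
                        // Only elements become classes. Comments and other node types
                        // are skipped; they would yield invalid names such as cls_#comment.
                        Element childNode = processElement(node);

                        if (!element.children.Contains(childNode.Name))
                        {
                            element.children.Add(childNode.Name);
                        }
                    }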
                }
            }

            return element;
        }
    }
}

I'm sure there are ways to make this prettier or work better, but it's sufficient for me.

Edit: Added ugly but functional deserialization code that takes an XmlNode containing the object and decodes it.

Later Thoughts: Two years later, I had an opportunity to reuse this code. Not only had I not kept it up to date here (I'd made changes to better normalize the names of the items), but I think the commenters who said I was going about this the wrong way were right. I still think this could be a handy way of generating template classes for an XML file where a given type of element can show up at different depths, but it's inflexible (you have to rerun the code and re-extract the classes every time) and doesn't handle format changes well. Between when I first wrote this code to quickly create a character file converter and now, the format changed, and people complained that the converter had stopped working. In retrospect, it would have made more sense to search for the correct elements using XPath and then pull the data from there.

Still, it was a valuable experience, and I suspect I'm probably going to come back to this code from time to time for quickly roughing out XML data, at least until I find something better.
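
For anyone curious what the XPath approach mentioned above might look like, here is a minimal sketch. The XPathDemo class and the trimmed-down inline XML are stand-ins invented for illustration; a real converter would load the exported file instead.

```csharp
using System;
using System.Xml;

public static class XPathDemo
{
    public static void Main()
    {
        var doc = new XmlDocument();
        // Trimmed-down stand-in for the exported character file.
        doc.LoadXml(
            "<document><hero name='Pretty Deadly'><powers>" +
            "<power name='Sailor Suit (Device 2)' cost='8'><otherpowers>" +
            "<power name='Protection 6' cost='10'><otherpowers/></power>" +
            "</otherpowers></power>" +
            "</powers></hero></document>");

        // "//power" matches <power> at any depth, so nesting needs no special handling.
        foreach (XmlNode power in doc.SelectNodes("//power"))
        {
            XmlAttribute name = power.Attributes["name"];
            XmlAttribute cost = power.Attributes["cost"];
            Console.WriteLine(
                (name != null ? name.Value : "?") + ": " +
                (cost != null ? cost.Value : "?"));
        }

        // A more specific path selects only one hero's top-level powers.
        XmlNodeList topLevel = doc.SelectNodes(
            "/document/hero[@name='Pretty Deadly']/powers/power");
        Console.WriteLine("Top-level powers: " + topLevel.Count);
    }
}
```

Since the queries name only the elements and attributes they need, tags added in a later format version would simply be ignored rather than breaking generated classes.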
