Parsing XML into a list

I have a fairly elaborate XML document, and I have been able to parse most of it. However, I have come across one subtree that has me stumped, and I'm afraid I'm making this harder than it needs to be. Here is the XML I'm referring to.

<Codes>
            <CustomFieldValueSet name="Account" label="Account" distributionType="PercentOfPrice">
                <CustomFieldValue distributionValue="10.00" splitindex="0">
                    <Value>7200</Value>
                    <Description>General Supplies</Description>
                </CustomFieldValue>
                <CustomFieldValue distributionValue="45.00" splitindex="1">
                    <Value>7200</Value>
                    <Description>General Supplies</Description>
                </CustomFieldValue>
                <CustomFieldValue distributionValue="45.00" splitindex="2">
                    <Value>7200</Value>
                    <Description>General Supplies</Description>
                </CustomFieldValue>
            </CustomFieldValueSet>
            <CustomFieldValueSet name="Activity" label="Activity" distributionType="PercentOfPrice" />
            <CustomFieldValueSet name="Chart" label="Chart" distributionType="PercentOfPrice">
                <CustomFieldValue distributionValue="10.00" splitindex="0">
                    <Value>T</Value>
                    <Description>University</Description>
                </CustomFieldValue>
                <CustomFieldValue distributionValue="45.00" splitindex="1">
                    <Value>T</Value>
                    <Description>University</Description>
                </CustomFieldValue>
                <CustomFieldValue distributionValue="45.00" splitindex="2">
                    <Value>T</Value>
                    <Description>University</Description>
                </CustomFieldValue>
            </CustomFieldValueSet>
            <CustomFieldValueSet name="Fund" label="Fund" distributionType="PercentOfPrice">
                <CustomFieldValue distributionValue="10.00" splitindex="0">
                    <Value>360806</Value>
                    <Description>National Institutes of Health</Description>
                </CustomFieldValue>
                <CustomFieldValue distributionValue="45.00" splitindex="1">
                    <Value>360903</Value>
                    <Description>National  Institutes of Health</Description>
                </CustomFieldValue>
                <CustomFieldValue distributionValue="45.00" splitindex="2">
                    <Value>360957</Value>
                    <Description>National Institutes of Health</Description>
                </CustomFieldValue>
            </CustomFieldValueSet>
            <CustomFieldValueSet name="Program" label="Program" distributionType="PercentOfPrice">
                <CustomFieldValue distributionValue="10.00" splitindex="0">
                    <Value>02</Value>
                    <Description>Research</Description>
                </CustomFieldValue>
                <CustomFieldValue distributionValue="45.00" splitindex="1">
                    <Value>02</Value>
                    <Description>Research</Description>
                </CustomFieldValue>
                <CustomFieldValue distributionValue="45.00" splitindex="2">
                    <Value>02</Value>
                    <Description>Research</Description>
                </CustomFieldValue>
            </CustomFieldValueSet>
            <CustomFieldValueSet name="Location" label="Location" distributionType="PercentOfPrice">
                <CustomFieldValue distributionValue="10.00" splitindex="0">
                    <Value>015</Value>
                    <Description>Biology - Life Science</Description>
                </CustomFieldValue>
                <CustomFieldValue distributionValue="45.00" splitindex="1">
                    <Value>015</Value>
                    <Description>Biology - Life Science</Description>
                </CustomFieldValue>
                <CustomFieldValue distributionValue="45.00" splitindex="2">
                    <Value>015</Value>
                    <Description>Biology - Life Science</Description>
                </CustomFieldValue>
            </CustomFieldValueSet>
            <CustomFieldValueSet name="Organization" label="Organization" distributionType="PercentOfPrice">
                <CustomFieldValue distributionValue="10.00" splitindex="0">
                    <Value>04400</Value>
                    <Description>TUSM:Neuroscience</Description>
                </CustomFieldValue>
                <CustomFieldValue distributionValue="45.00" splitindex="1">
                    <Value>04400</Value>
                    <Description>TUSM:Neuroscience</Description>
                </CustomFieldValue>
                <CustomFieldValue distributionValue="45.00" splitindex="2">
                    <Value>04400</Value>
                    <Description>TUSM:Neuroscience</Description>
                </CustomFieldValue>
            </CustomFieldValueSet>
        </Codes>

I'm trying to end up with a list that looks something like this.

Account distributionType   Activity   distributionValue  Fund
7200     PercentOfPrice     ""        10                 360806
7200     PercentOfPrice     ""        45                 360903
7200     PercentOfPrice     ""        45                 360957

etc.

I have written code that looks something like this. Here is a snippet. Mind you, I think I have overcomplicated it.

if (tagName == "Codes")
                                {
                                  // Create another reader that contains just the accounting elements.
                                    XmlReader inner = reader.ReadSubtree();
                                    //inner.ReadToDescendant("Codes");
                                    //printOutXML(inner);
                                    while (inner.Read())
                                    {
                                        switch (inner.NodeType)
                                        {       
                                            // Walk down the XML hierarchy and simply fill in the values.
                                            case XmlNodeType.Element:

                                                switch (inner.Name)
                                                {
                                                    case "CustomFieldValueSet":
                                                        // Get the name attribute of the set we are currently working with, such as Account.
                                                        innerTagName=inner.GetAttribute("name");

                                                        // Activity and Location can be blank, so check for that here and,
                                                        // if so, immediately fill the list with empty strings.
                                                        if (innerTagName == "Activity")
                                                        {
                                                            if (inner.IsEmptyElement)
                                                            {   //quickly put fillers in .
                                                                for (int i = 0; i < thisInvoice.account.Count; i++)
                                                                {
                                                                    thisInvoice.activity.Add("");
                                                                }
                                                            }         
                                                        }

                                                        if (innerTagName == "Location")
                                                        {
                                                            if (inner.IsEmptyElement)
                                                            {   //quickly put fillers in .
                                                                for (int i = 0; i < thisInvoice.account.Count; i++)
                                                                {
                                                                    thisInvoice.location.Add("");
                                                                }
                                                                //thisInvoice.activity.Add("");
                                                            }
                                                        }

                                                        if (null == inner.GetAttribute("distributionType"))
                                                        {
                                                            distType = null;
                                                        }
                                                       else if
                                                       (distributionSwitch == false)
                                                        {
                                                            thisInvoice.distributionType.Add(inner.GetAttribute("distributionType") ?? "");
                                                            distType = inner.GetAttribute("distributionType") ?? "";
                                                       }
                                                        //Console.WriteLine(inner.Value);
                                                        //Console.WriteLine(inner.Name);
                                                        break;

                                                    case "CustomFieldValue":
                                                        if(null == inner.GetAttribute("distributionValue"))
                                                        //thisInvoice.distributionValue.Add(inner.GetAttribute("distributionValue") ?? "");
                                                        {/*do nothing*/}
                                                    else if
                                                        (distributionSwitch == false)
                                                        {
                                                            thisInvoice.distributionValue.Add(inner.GetAttribute("distributionValue") ?? "");
                                                        }
                                                        // If distributionType has fewer entries than distributionValue,
                                                        // pad it with the current distribution type.
                                                        if (thisInvoice.distributionValue.Count > thisInvoice.distributionType.Count)
                                                        {
                                                            for (int i = 0; i < thisInvoice.distributionValue.Count - thisInvoice.distributionType.Count; i++)
                                                            {
                                                                thisInvoice.distributionType.Add(distType);
                                                            }



                                                        }

                                                        break;

                                                    case "Value":
                                                         // XmlNodeType.Text
                                                        if (innerTagName == "Account"/*&& inner.NodeType ==XmlNodeType.Text*/)
                                                        {
                                                            inner.MoveToContent();// move to the text 
                                                            inner.Read();
                                                            thisInvoice.account.Add(inner.Value);
                                                        }


                                                        if (innerTagName == "Activity")
                                                        {
                                                            // Activity is not a mandatory field, so it could be empty. Check whether
                                                            // it is a self-closing tag and, if it is, add an empty string.
                                                            if (inner.IsEmptyElement)
                                                            {
                                                                thisInvoice.activity.Add("");
                                                            }
                                                            else
                                                            {
                                                                inner.MoveToContent();// move to the text 
                                                                inner.Read();
                                                                thisInvoice.activity.Add(inner.Value);
                                                            }
                                                        }

                                                        if (innerTagName == "Location")
                                                        {
                                                            if (inner.IsEmptyElement)
                                                            {
                                                                thisInvoice.location.Add("");
                                                            }
                                                            else
                                                            {
                                                                inner.MoveToContent();// move to the text 
                                                                inner.Read();
                                                                thisInvoice.location.Add(inner.Value);
                                                            }
                                                        }

                                                        if (innerTagName == "Fund")
                                                        {
                                                            inner.MoveToContent();// move to the text 
                                                            inner.Read();
                                                            thisInvoice.fund.Add(inner.Value);
                                                        }

                                                        if (innerTagName == "Organization")
                                                        {
                                                            inner.MoveToContent();// move to the text 
                                                            inner.Read();
                                                            thisInvoice.org.Add(inner.Value);
                                                        }

                                                        if (innerTagName == "Program")
                                                        {
                                                            inner.MoveToContent();// move to the text 
                                                            inner.Read();
                                                            thisInvoice.prog.Add(inner.Value);
                                                        }

                                                       break;



                                                }//end switch
                                                break; // break out of the outer case.
                                            case XmlNodeType.EndElement:
                                                if (inner.Name == "CustomFieldValueSet" || inner.Value == "CustomFieldValueSet")
                                                {
                                                    distributionSwitch = true;
                                                    Console.WriteLine(reader.Value);
                                                    Console.WriteLine(reader.Name);
                                                }
                                                if (inner.Name == "Codes")
                                                {
                                                    distributionSwitch = false;
                                                    distType = null;
                                                    inner.Close();
                                                }

                                                break;
                                        }//end switch
                                    }//end while
                                }//end the if;

In the case of the distributionType attribute, I need to make its list as long as the account list. In other words, once I have the value in a variable, I need to use it as a filler until the distributionType list is as long as the account list. I can't imagine there isn't an easier way to do this. I keep looking at LINQ to XML, but it doesn't make much sense to me. I would love to hear how some of you experts would tackle this one. I'm trying to put together an elegant solution with a little less code. Any help would be greatly appreciated.

You can use LINQ to XML for this.

using System;
using System.IO;
using System.Linq;
using System.Xml.Linq;

class Program
{
    static void Main(string[] args)
    {
        // This text file contains your XML.
        var xmlSample = File.ReadAllText("xml_sample.txt");
        var doc = XDocument.Parse(xmlSample);

        // Get every <CustomFieldValueSet> whose label attribute is "Account".
        // Casting the attribute to string yields null when it is missing.
        var accounts = from item in doc.Descendants("Codes").Descendants("CustomFieldValueSet")
                       where (string)item.Attribute("label") == "Account"
                       select item;

        // Project each <CustomFieldValue> into an anonymous type holding the
        // distributionValue attribute and the text of its <Value> child.
        var accountValues = from el in accounts.Descendants("CustomFieldValue")
                            let distAttribute = el.Attribute("distributionValue")
                            select new
                            {
                                distValue = distAttribute != null ? distAttribute.Value : "0",
                                value = el.Descendants("Value").First().Value,
                            };

        // Print the results just to make sure we got it right.
        foreach (var el in accounts)
            Console.WriteLine(el.Name + " " + (string)el.Attribute("distributionType"));

        foreach (var el in accountValues)
            Console.WriteLine(el.distValue + ":" + el.value);
    }
}

You should be able to use these ideas to parse your XML file as needed.
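
As a rough illustration of how the same ideas might extend to the whole <Codes> block, the sketch below builds one row per splitindex. It assumes that rows are matched across sets by the splitindex attribute and that every CustomFieldValue carries one. CodesPivot, ToRows, and PivotDemo are names made up for this example, and the sample XML is a shortened copy of the one in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CodesPivot
{
    // Hypothetical helper (not from the original answer): builds one row per
    // splitindex, keyed by the name of each CustomFieldValueSet.
    public static List<Dictionary<string, string>> ToRows(XElement codes)
    {
        var sets = codes.Elements("CustomFieldValueSet").ToList();

        // Assumption: every CustomFieldValue has a numeric splitindex attribute.
        var indexes = sets.Elements("CustomFieldValue")
                          .Select(v => (int)v.Attribute("splitindex"))
                          .Distinct()
                          .OrderBy(i => i)
                          .ToList();

        var rows = new List<Dictionary<string, string>>();
        foreach (var index in indexes)
        {
            var row = new Dictionary<string, string>();
            foreach (var set in sets)
            {
                var match = set.Elements("CustomFieldValue")
                               .FirstOrDefault(v => (int)v.Attribute("splitindex") == index);

                // Empty sets such as Activity contribute an empty string.
                var name = (string)set.Attribute("name") ?? "";
                row[name] = match == null ? "" : ((string)match.Element("Value") ?? "");

                // Take the distribution columns from the first set that has this index.
                if (match != null && !row.ContainsKey("distributionValue"))
                {
                    row["distributionType"] = (string)set.Attribute("distributionType") ?? "";
                    row["distributionValue"] = (string)match.Attribute("distributionValue") ?? "";
                }
            }
            rows.Add(row);
        }
        return rows;
    }
}

public static class PivotDemo
{
    public static void Main()
    {
        // Shortened sample; single quotes keep the verbatim string readable.
        const string xml = @"<Codes>
  <CustomFieldValueSet name='Account' distributionType='PercentOfPrice'>
    <CustomFieldValue distributionValue='10.00' splitindex='0'><Value>7200</Value></CustomFieldValue>
    <CustomFieldValue distributionValue='45.00' splitindex='1'><Value>7200</Value></CustomFieldValue>
  </CustomFieldValueSet>
  <CustomFieldValueSet name='Activity' distributionType='PercentOfPrice' />
  <CustomFieldValueSet name='Fund' distributionType='PercentOfPrice'>
    <CustomFieldValue distributionValue='10.00' splitindex='0'><Value>360806</Value></CustomFieldValue>
    <CustomFieldValue distributionValue='45.00' splitindex='1'><Value>360903</Value></CustomFieldValue>
  </CustomFieldValueSet>
</Codes>";

        foreach (var row in CodesPivot.ToRows(XElement.Parse(xml)))
            Console.WriteLine(string.Join(" | ", row["Account"], row["distributionType"],
                                          row["Activity"], row["distributionValue"], row["Fund"]));
    }
}
```

Matching on splitindex rather than on element position means an empty set such as Activity simply yields empty strings, so no filler loops are needed.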

As mentioned in the comments, an alternative to Mihai's LINQ to XML solution is to deserialize your XML into a predefined structure of typed classes and properties.

The benefit is that you then have an object that represents your XML (hopefully faithfully), which makes it easier to work with the data that was inside the XML.

With the supplied XML sample and the Edit -> Paste Special -> Paste XML as Classes menu option in Visual Studio, you will get a class structure similar to the one below (this one has been tidied up a bit for easier reading).

using System.Collections.Generic;
using System.Xml.Serialization;

[XmlTypeAttribute(AnonymousType = true)]
[XmlRootAttribute(Namespace = "", IsNullable = false)]
public partial class Codes
{
  [XmlElementAttribute("CustomFieldValueSet")]
  public List<CodesCustomFieldValueSet> CustomFieldValueSet { get; set; }
}

[XmlTypeAttribute(AnonymousType = true)]
public partial class CodesCustomFieldValueSet
{
  [XmlElementAttribute("CustomFieldValue")]
  public List<CodesCustomFieldValueSetCustomFieldValue> CustomFieldValue { get; set; }

  [XmlAttributeAttribute(AttributeName="name")]
  public string Name { get; set; }

  [XmlAttributeAttribute(AttributeName = "label")]
  public string Label { get; set; }

  [XmlAttributeAttribute(AttributeName = "distributionType")]
  public string DistributionType { get; set; }
}

[XmlTypeAttribute(AnonymousType = true)]
public partial class CodesCustomFieldValueSetCustomFieldValue
{
  public string Value { get; set; }

  public string Description { get; set; }

  [XmlAttributeAttribute(AttributeName = "distributionValue")]
  public decimal DistributionValue { get; set; }

  [XmlAttributeAttribute(AttributeName = "splitindex")]
  public byte SplitIndex { get; set; }
}

With this class structure, you can deserialize your XML with the lines below (where txtInput.Text is a TextBox I used to hold the sample XML data).

// Requires System.IO (StringReader) and System.Xml.Serialization (XmlSerializer).
XmlSerializer serializer = new XmlSerializer(typeof(Codes));
Codes codesInput = serializer.Deserialize(new StringReader(txtInput.Text)) as Codes;

if (codesInput != null)
{
  // Do something with the data
}

NOTE:
Given your desired output and the structure of the sample XML you supplied, you will need to transform the information in the deserialized object into the shape you want. For that I would recommend creating an additional class structure, combined with a List<T>, to hold all the information as shown in your desired output.
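
A minimal sketch of what that transformation could look like, assuming rows are grouped by splitindex. SplitRow, SplitRowBuilder, and SplitRowDemo are hypothetical names, and the serialization classes are trimmed, renamed copies of the ones above so that the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

// Trimmed copies of the classes above, included so the sketch is self-contained.
[XmlRoot("Codes")]
public class Codes
{
    [XmlElement("CustomFieldValueSet")]
    public List<FieldValueSet> Sets { get; set; }
}

public class FieldValueSet
{
    [XmlAttribute("name")] public string Name { get; set; }
    [XmlAttribute("distributionType")] public string DistributionType { get; set; }
    [XmlElement("CustomFieldValue")] public List<FieldValue> Values { get; set; }
}

public class FieldValue
{
    [XmlAttribute("distributionValue")] public decimal DistributionValue { get; set; }
    [XmlAttribute("splitindex")] public byte SplitIndex { get; set; }
    public string Value { get; set; }
}

// Hypothetical row type: one instance per splitindex.
public class SplitRow
{
    public byte SplitIndex { get; set; }
    public string DistributionType { get; set; }
    public decimal DistributionValue { get; set; }
    public Dictionary<string, string> Fields { get; } = new Dictionary<string, string>();
}

public static class SplitRowBuilder
{
    public static List<SplitRow> Build(Codes codes)
    {
        var rows = new SortedDictionary<byte, SplitRow>();
        var sets = codes.Sets ?? new List<FieldValueSet>();

        foreach (var set in sets)
        {
            // Guard against a missing list for empty sets such as Activity.
            foreach (var value in set.Values ?? new List<FieldValue>())
            {
                if (!rows.TryGetValue(value.SplitIndex, out var row))
                {
                    // The first set that mentions a splitindex supplies the distribution columns.
                    row = new SplitRow
                    {
                        SplitIndex = value.SplitIndex,
                        DistributionType = set.DistributionType,
                        DistributionValue = value.DistributionValue
                    };
                    rows.Add(value.SplitIndex, row);
                }
                row.Fields[set.Name ?? ""] = value.Value ?? "";
            }
        }

        // Pad sets that had no value for a given row with an empty string.
        foreach (var row in rows.Values)
            foreach (var set in sets)
                if (!row.Fields.ContainsKey(set.Name ?? ""))
                    row.Fields[set.Name ?? ""] = "";

        return rows.Values.ToList();
    }
}

public static class SplitRowDemo
{
    public static void Main()
    {
        const string xml = @"<Codes>
  <CustomFieldValueSet name='Account' distributionType='PercentOfPrice'>
    <CustomFieldValue distributionValue='10.00' splitindex='0'><Value>7200</Value></CustomFieldValue>
    <CustomFieldValue distributionValue='45.00' splitindex='1'><Value>7200</Value></CustomFieldValue>
  </CustomFieldValueSet>
  <CustomFieldValueSet name='Activity' distributionType='PercentOfPrice' />
</Codes>";

        var serializer = new XmlSerializer(typeof(Codes));
        using (var reader = new StringReader(xml))
        {
            var codes = (Codes)serializer.Deserialize(reader);
            foreach (var row in SplitRowBuilder.Build(codes))
                Console.WriteLine(row.SplitIndex + ": " + row.Fields["Account"] + " "
                                  + row.DistributionType + " " + row.DistributionValue);
        }
    }
}
```

Keeping the per-set values in a dictionary keyed by set name avoids hard-coding one property per field; a class with fixed Account, Fund, and similar properties would work just as well if the set names are known in advance.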

Even better would be if you controlled the XML's structure and could make it more self-explanatory than it currently is. As it stands, the only link between the CustomFieldValueSet elements seems to be splitindex, which is an attribute of the child nodes, and that complicates things a lot.

Further reading on XML serialization:
MSDN: Introducing XML Serialization
The XmlSerializer Class
