Parsing XML with C#

I have an XML file, uploaded here: http://dl.dropbox.com/u/10773282/2011/result.xml . It is machine-generated XML, so you might need an XML viewer/editor to read it.

I use this C# code to get the elements in CoverageDSPriv/Module/* .

using System;
using System.Xml;
using System.Xml.Linq;

namespace HIR {
  class Dummy {

    static void Main(String[] argv) {

      XDocument doc = XDocument.Load("result.xml");

      var coveragePriv = doc.Descendants("CoverageDSPriv"); //.First();
      var cons = coveragePriv.Elements("Module");

      foreach (var con in cons)
      {
        var id = con.Value;
        Console.WriteLine(id);
      }
    }
  }
}

Running the code, I get this result.

hello.exe6144008016161810hello.exehello.exehello.exe81061hello.exehello.exe!17main_main40030170170010180180011190190012200200013hello.exe!107testfunctiontestfunction(int)40131505001460600158080216120120017140140018AA

I expect to get

hello.exe
61440
...

However, I get just one long string on a single line.

  • Q1 : What might be wrong?
  • Q2 : How do I get the number of elements in cons? I tried cons.Count , but it doesn't work.
  • Q3 : If I need to get the nested value of <CoverageDSPriv><Module><ModuleName>, I use this code:

    var coveragePriv = doc.Descendants("CoverageDSPriv"); //.First();
    var cons = coveragePriv.Elements("Module").Elements("ModuleName");

I can live with this, but if the elements are deeply nested, I would like a more direct way to get at them. Are there any other ways to do that?

ADDED

var cons = coveragePriv.Elements("Module").Elements();

solves this issue, but for NamespaceTable it again prints all the elements on one line.

hello.exe
61440
0
8
0
1
6
1
61810hello.exehello.exehello.exe81061hello.exehello.exe!17main_main40030170170010180180011190190012200200013hello.exe!107testfunctiontestfunction(int)40131505001460600158080216120120017140140018

Or LINQ to XML may be a better solution, as this post suggests.

It looks to me like you only have one element named Module, so .Value is simply returning the concatenated inner text of that entire element. Did you intend this instead?

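// Elements (plural) compiles whether coveragePriv is a single XElement or the sequence returned by Descendants()
coveragePriv.Elements("Module").Elements();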

This would return all the child elements of the Module element, which seems to be what you're after.

Update:

<NamespaceTable> is a child of <Module>, but you appear to want to handle it the same way as <Module>, writing out each of its child elements. One brute-force approach is to add another loop for <NamespaceTable> :

foreach (var con in cons)
{
    if (con.Name == "NamespaceTable") 
    {
        foreach (var nsElement in con.Elements()) 
        {
            var nsId = nsElement.Value;
            Console.WriteLine(nsId);
        }
    }
    else
    {
        var id = con.Value;
        Console.WriteLine(id);
    }
}

Alternatively, perhaps you'd rather just flatten them altogether via .Descendants() :

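// Descendants() also returns container elements such as NamespaceTable,
// so keep only the leaves (Where needs "using System.Linq;").
var cons = coveragePriv.Elements("Module").Descendants().Where(e => !e.HasElements);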

foreach (var con in cons)
{
    var id = con.Value;
    Console.WriteLine(id);
}
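
Putting the pieces together, here is a minimal, self-contained sketch of the flattening approach, which also touches on Q2. The inline XML is a made-up stand-in for result.xml; the element names ImageSize and NamespaceName are assumptions for illustration, not taken from the real file.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class LeafValuesSketch
{
    static void Main()
    {
        // Hypothetical sample standing in for result.xml (element names are assumed).
        var doc = XDocument.Parse(@"
<CoverageDSPriv>
  <Module>
    <ModuleName>hello.exe</ModuleName>
    <ImageSize>61440</ImageSize>
    <NamespaceTable>
      <ModuleName>hello.exe</ModuleName>
      <NamespaceName>demo</NamespaceName>
    </NamespaceTable>
  </Module>
</CoverageDSPriv>");

        // Walk every element under each Module and keep only the leaves,
        // so a container such as NamespaceTable is not printed as one long string.
        var leaves = doc.Descendants("Module")
                        .Descendants()
                        .Where(e => !e.HasElements)
                        .ToList();

        foreach (var leaf in leaves)
        {
            Console.WriteLine($"{leaf.Name}: {leaf.Value}");
        }

        // On a List<T>, Count is a property; on a bare IEnumerable<T>, call Count().
        Console.WriteLine(leaves.Count);
    }
}
```

As for cons.Count not working: the query returns an IEnumerable<XElement>, which has no Count property. Count() is an extension method that becomes available with using System.Linq; so cons.Count() should work once that using directive is added.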

XElement.Value can give unexpected results: for an element that contains child elements, it returns the concatenated text of all its descendants. With XML in .NET you are really in charge of traversing the tree yourself. If the element holds only text, Value may return what you want; if it holds other elements, not so much.
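
A small sketch of that behavior, using a made-up two-element fragment (the ImageSize element name is an assumption for illustration):

```csharp
using System;
using System.Xml.Linq;

class ValueSketch
{
    static void Main()
    {
        // Hypothetical fragment; not taken from the real result.xml.
        var module = XElement.Parse(
            "<Module><ModuleName>hello.exe</ModuleName><ImageSize>61440</ImageSize></Module>");

        // Value on a container concatenates the text of every descendant.
        Console.WriteLine(module.Value);                       // hello.exe61440

        // Value on a leaf element returns just that element's text.
        Console.WriteLine(module.Element("ModuleName").Value); // hello.exe

        // XElement also supports explicit casts to common types.
        Console.WriteLine((int)module.Element("ImageSize"));   // 61440
    }
}
```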

I have done a lot of XML parsing, and I find there are much better ways to handle XML, depending on what you are doing with the data.

1) You can look into XSLT transforms if you plan on outputting this data as text, more XML, or HTML. This is a great way to convert the data to some other readable format. We use this when we want to display our metadata on our website in HTML.

2) Look into XML serialization. C# makes this very easy, and it is great to use because you can then work with a regular C# object when consuming the data. MS even has tools to create the serialization class from the XML. I usually start with that, clean it up, and add my own tweaks to make it work as I wish. The best check is to serialize the object to XML and see if that matches what you have.

3) Try LINQ to XML. It lets you query the XML as if it were a database. It is generally a little slower, but unless you need absolute performance it works very well for minimizing your work.
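
Options 2 and 3 can be sketched roughly as follows. This is only an illustration under stated assumptions: the classes CoverageData and ModuleInfo are hypothetical hand-written types (not generated by any tool), and the inline XML, including the ImageSize element and the other.dll module, is made up rather than taken from result.xml.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml.Linq;
using System.Xml.Serialization;

// Hypothetical classes mirroring an assumed subset of the XML.
[XmlRoot("CoverageDSPriv")]
public class CoverageData
{
    [XmlElement("Module")]
    public ModuleInfo[] Modules { get; set; }
}

public class ModuleInfo
{
    public string ModuleName { get; set; }
    public int ImageSize { get; set; }
}

public static class OptionsSketch
{
    // Made-up sample data standing in for result.xml.
    const string Xml =
        "<CoverageDSPriv>" +
        "<Module><ModuleName>hello.exe</ModuleName><ImageSize>61440</ImageSize></Module>" +
        "<Module><ModuleName>other.dll</ModuleName><ImageSize>4096</ImageSize></Module>" +
        "</CoverageDSPriv>";

    public static void Main()
    {
        // Option 2: deserialize into plain C# objects.
        var serializer = new XmlSerializer(typeof(CoverageData));
        using (var reader = new StringReader(Xml))
        {
            var data = (CoverageData)serializer.Deserialize(reader);
            foreach (var m in data.Modules)
            {
                Console.WriteLine($"{m.ModuleName}: {m.ImageSize}");
            }
        }

        // Option 3: query the tree with LINQ to XML.
        var largeModules =
            from m in XDocument.Parse(Xml).Descendants("Module")
            where (int)m.Element("ImageSize") > 10000
            select (string)m.Element("ModuleName");

        foreach (var name in largeModules)
        {
            Console.WriteLine(name);
        }
    }
}
```

The serialization route pays off when the same shape is consumed in many places, since the rest of the code then deals with typed properties. The query route needs no extra classes and tends to suit one-off extraction.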
