Parsing an XML file: options?

I'm developing a system to pick up XML attachments from emails, via Exchange Web Services, and enter them into a DB via a custom DAL object that I've created.

I've managed to extract the XML attachment and have it ready as a stream. The question is how to parse this stream and populate a DAL object.

I can create an XmlTextReader and iterate through each element. I don't see any problems with this, other than that I suspect there is a much slicker way. The reader seems to treat the opening tag, the content of the tag and the closing tag as separate nodes (using reader.NodeType). I expected something like <tag>myValue</tag> to be considered one element rather than three. Like I said, I can get round this problem, but I'm sure there must be a better way.
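To illustrate the behaviour described above, here is a minimal sketch (not the asker's code) that records what a reader reports for a single element. It uses XmlReader.Create rather than XmlTextReader, and the ReaderDemo class and its method names are made up for this example. For <tag>myValue</tag> the reader reports three nodes (Element, Text, EndElement), while ReadElementContentAsString collapses them into one call.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper class, for illustration only.
public static class ReaderDemo
{
    // Collects the node type reported by each call to Read().
    public static List<XmlNodeType> NodeTypes(string xml)
    {
        var types = new List<XmlNodeType>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                types.Add(reader.NodeType);
            }
        }
        return types;
    }

    // Moves to the first element with the given name and returns its text
    // content in a single call, or null if no such element exists.
    public static string ReadValue(string xml, string elementName)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            if (reader.ReadToFollowing(elementName))
            {
                return reader.ReadElementContentAsString();
            }
        }
        return null;
    }
}
```

With a stream instead of a string, XmlReader.Create(stream) should work the same way. This still means writing one read per field, which is why the serializer-based answers below tend to be less code.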

I came across the idea of using an XML serializer (completely new to me), but a quick look suggested that these can't handle ArrayList and List<T> (I'm using List<T>).

Again, I'm new to LINQ. LINQ-to-XML has also been mentioned, but the examples I've seen seem rather complex, though that may simply be my lack of familiarity.
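For comparison, a LINQ-to-XML version can be fairly short. The sketch below is only an illustration against the sample XML in this question; the LinqToXmlDemo class and its method names are assumptions made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper class, for illustration only.
public static class LinqToXmlDemo
{
    // Returns the text of <companyname> under the root element.
    public static string CompanyName(string xml)
    {
        XElement enquiry = XDocument.Parse(xml).Root;
        return (string)enquiry.Element("companyname");
    }

    // Returns the <vehiclemake> of every <vehicle> under <vehicles>.
    public static List<string> VehicleMakes(string xml)
    {
        XElement enquiry = XDocument.Parse(xml).Root;
        return enquiry
            .Elements("vehicles")
            .Elements("vehicle")
            .Select(v => (string)v.Element("vehiclemake"))
            .ToList();
    }
}
```

When the XML arrives as a stream, XDocument.Load(stream) can replace XDocument.Parse. Casting an XElement to string yields null when the element is missing, which avoids a null check per field.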

Basically, I don't want a kludged system, but I also don't want to use a complicated technique with a learning curve just because it's 'cool'.

What is the simplest and most effective way of translating this XML/stream into my DAL objects?

XML sample:

<?xml version="1.0" encoding="UTF-8"?>
<enquiry>
    <enquiryno>100001</enquiryno>
    <companyname>myco</companyname>
    <typeofbusiness>dunno</typeofbusiness>
    <companyregno>ABC123</companyregno>
    <postcode>12345</postcode>
    <contactemail>me@example.com</contactemail>
    <firstname>My</firstname>
    <lastname>Name</lastname>
    <vehicles>
        <vehicle>
            <vehiclereg>54321</vehiclereg>
            <vehicletype>Car</vehicletype>
            <vehiclemake>Ford</vehiclemake>
            <cabtype>n/a</cabtype>
            <powerbhp>130</powerbhp>
            <registrationdate>01/01/2003</registrationdate>
        </vehicle>
    </vehicles>
</enquiry>

Update 1: I'm trying to deserialize, based on Graham's example. I think I've set up the DAL for serialization, including specifying [XmlElement("whatever")] for each property, and I've tried to deserialize using the following:

SalesEnquiry enquiry = null;
XmlSerializer serializer = new XmlSerializer(typeof(SalesEnquiry));
enquiry = (SalesEnquiry)serializer.Deserialize(stream);

However, I get an exception: 'There is an error in XML document (2, 2)'. The InnerException states {"<enquiry xmlns=''> was not expected."}

Conclusion (updated):

My previous problem was that the root element in the XML file (enquiry) did not match the name of the class (SalesEnquiry). Rather than an [XmlElement] attribute, the class needs an [XmlRoot] attribute. For completeness, if you want a property in your class to be ignored during serialization, use the [XmlIgnore] attribute.
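The fix described above might look roughly like the following. This is a sketch rather than the actual DAL: the property names, the Vehicle class, the LoadedAt property and the EnquiryLoader helper are all assumptions made up for illustration, and only a few of the sample's elements are mapped. Elements without a matching property are skipped by XmlSerializer by default.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// The class name differs from the root element name, so XmlRoot supplies it.
[XmlRoot("enquiry")]
public class SalesEnquiry
{
    [XmlElement("enquiryno")]
    public int EnquiryNo { get; set; }

    [XmlElement("companyname")]
    public string CompanyName { get; set; }

    // <vehicles> wraps the list; each entry is a <vehicle> element.
    [XmlArray("vehicles")]
    [XmlArrayItem("vehicle")]
    public List<Vehicle> Vehicles { get; set; }

    // Hypothetical property that should never appear in the XML.
    [XmlIgnore]
    public DateTime LoadedAt { get; set; }
}

public class Vehicle
{
    [XmlElement("vehiclereg")]
    public string Registration { get; set; }

    [XmlElement("vehiclemake")]
    public string Make { get; set; }
}

// Hypothetical helper, for illustration only.
public static class EnquiryLoader
{
    // Deserializes a SalesEnquiry from the attachment stream.
    public static SalesEnquiry Load(Stream stream)
    {
        var serializer = new XmlSerializer(typeof(SalesEnquiry));
        return (SalesEnquiry)serializer.Deserialize(stream);
    }
}
```

Without the [XmlRoot("enquiry")] attribute, the serializer expects a root element named SalesEnquiry, which matches the '<enquiry xmlns=''> was not expected' error quoted earlier.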

I've successfully serialized my object, and have now successfully taken the incoming XML and deserialized it into a SalesEnquiry object.

This approach is far easier than manually parsing the XML. OK, there has been a steep learning curve, but it was worth it.

Thanks!

If your XML uses a schema (i.e. you always know which elements appear, and where they appear in the tree), you could use XmlSerializer to create your objects. You'd just need some attributes on your classes to tell the serializer which XML elements or attributes they correspond to. Then you load up your XML, create a new XmlSerializer with the type of the .NET object you want to create, and call the Deserialize method.

For example, say you have a class like this:

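using System;
using System.Collections.Generic;
using System.Xml.Serialization;

[Serializable]
public class Person
{
    // Maps to <PersonName>
    [XmlElement("PersonName")]
    public string Name { get; set; }

    // Maps to <PersonAge>
    [XmlElement("PersonAge")]
    public int Age { get; set; }

    // Each list entry becomes a <Child> inside <Children>
    [XmlArrayItem("Child")]
    public List<string> Children { get; set; }
}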

And input XML like this (saved in a file for this example):

<?xml version="1.0"?>
<Person>
  <PersonName>Bob</PersonName>
  <PersonAge>35</PersonAge>
  <Children>
    <Child>Chris</Child>
    <Child>Alice</Child>
  </Children>
</Person>

Then you create a Person instance like this:

Person person = null;
XmlSerializer serializer = new XmlSerializer(typeof(Person));
using (FileStream fs = new FileStream(GetFileName(), FileMode.Open))
{
    person = (Person)serializer.Deserialize(fs);
}

Update: Based on your last update, I would guess that you need to specify an XmlRoot attribute on the class that's acting as your root element (i.e. SalesEnquiry). Failing that, the XmlSerializer might be a bit confused that you're referencing an empty namespace in your XML (xmlns='' doesn't seem right).

XmlSerializer does support arrays and lists, as long as the contained type is serializable.

I have found Xsd2Code very helpful for this kind of thing: http://xsd2code.codeplex.com/

Basically, all you need to do is write an xsd file (an XML schema file) and specify a few command line switches. Xsd2Code will automatically generate a C# class file that contains all the classes and properties, plus everything needed to handle the serialization. It's not a perfect solution, as it doesn't support all aspects of XSD, but if your XML files are relatively simple collections of elements and attributes, it should be a nice shortcut for you.

There's another similar project on Codeplex called Linq to XSD ( http://linqtoxsd.codeplex.com/ ), which was designed to enforce the entire XSD specification, but last time I checked it was no longer being supported and was not really ready for prime time. I thought it was worth a mention, though.
