Parsing an XML file - options?
I'm developing a system to pick up XML attachments from emails, via Exchange Web Services, and enter them into a DB via a custom DAL object that I've created.
I've managed to extract the XML attachment and have it ready as a stream... the question is how to parse this stream and populate a DAL object.
I can create an XmlTextReader and iterate through each element. I don't see any problems with this, other than that I suspect there is a much slicker way. The reader seems to treat the opening tag, the content of the tag and the closing tag as three separate nodes (judging by reader.NodeType). I expected <myTag>myValue</myTag> to be considered one element rather than three. Like I said, I can get round this problem, but I'm sure there must be a better way.
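As a hedged sketch of how the reader approach can avoid juggling three nodes per value: XmlReader (the base class of XmlTextReader) offers ReadElementContentAsString, which consumes the start tag, the text and the end tag in a single call. The names ReaderSketch and ReadElementValue are made up for illustration; the element names are taken from the XML sample in this question.

```csharp
using System.IO;
using System.Xml;

// Hypothetical helper, not part of the original question.
public static class ReaderSketch
{
    // Returns the text of the first element called elementName, or null if none exists.
    public static string ReadElementValue(Stream stream, string elementName)
    {
        using (XmlReader reader = XmlReader.Create(stream))
        {
            // ReadToFollowing advances to the next start tag with this name.
            if (reader.ReadToFollowing(elementName))
            {
                // Reads start tag, text content and end tag as one unit.
                return reader.ReadElementContentAsString();
            }
        }
        return null;
    }
}
```

Called as ReaderSketch.ReadElementValue(stream, "companyname"), this reads a single value. Populating a whole object this way still means one call per field, which is why the serializer route discussed in the answers tends to be less work.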
I came across the idea of using an XML Serializer (completely new to me), but a quick look suggested that these can't handle ArrayLists and List (I'm using List).
Again, I'm new to LINQ, but LINQ-to-XML has also been mentioned. The examples I've seen seem rather complex, though that may simply be my lack of familiarity.
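For comparison, a minimal LINQ-to-XML sketch for the sample below might look like the following. It assumes .NET 4.0 or later (for XDocument.Load taking a Stream); the EnquirySummary and LinqToXmlSketch names are invented for illustration and only two of the fields are read.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

// Hypothetical holder for the values pulled out of the document.
public class EnquirySummary
{
    public string CompanyName { get; set; }
    public List<string> VehicleRegs { get; set; }
}

public static class LinqToXmlSketch
{
    public static EnquirySummary Read(Stream stream)
    {
        XDocument doc = XDocument.Load(stream);
        XElement root = doc.Root; // the <enquiry> element

        return new EnquirySummary
        {
            // Casting an XElement to string yields its text, or null if the element is absent.
            CompanyName = (string)root.Element("companyname"),
            // One entry per <vehicle>, wherever it sits below the root.
            VehicleRegs = root.Descendants("vehicle")
                              .Select(v => (string)v.Element("vehiclereg"))
                              .ToList()
        };
    }
}
```

Each field has to be mapped by hand here, so this suits cases where only a few values are needed or the XML shape does not match the classes.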
Basically, I don't want a kludged system, but I don't want to use any complicated technique with a learning curve just because it's 'cool'.
What is the simplest and most effective way of translating this XML/Stream into my DAL objects?
XML Sample:
<?xml version="1.0" encoding="UTF-8"?>
<enquiry>
    <enquiryno>100001</enquiryno>
    <companyname>myco</companyname>
    <typeofbusiness>dunno</typeofbusiness>
    <companyregno>ABC123</companyregno>
    <postcode>12345</postcode>
    <contactemail>me@example.com</contactemail>
    <firstname>My</firstname>
    <lastname>Name</lastname>
    <vehicles>
        <vehicle>
            <vehiclereg>54321</vehiclereg>
            <vehicletype>Car</vehicletype>
            <vehiclemake>Ford</vehiclemake>
            <cabtype>n/a</cabtype>
            <powerbhp>130</powerbhp>
            <registrationdate>01/01/2003</registrationdate>
        </vehicle>
    </vehicles>
</enquiry>
Update 1: I'm trying to deserialize, based on Graham's example. I think I've set up the DAL for serialization, including specifying [XmlElement("whatever")] for each property. And I've tried to deserialize using the following:
SalesEnquiry enquiry = null;
XmlSerializer serializer = new XmlSerializer(typeof(SalesEnquiry));
enquiry = (SalesEnquiry)serializer.Deserialize(stream);
However, I get an exception: 'There is an error in XML document (2, 2)'. The InnerException states:
{"<enquiry xmlns=''> was not expected."}
Conclusion (updated):
My previous problem was that the name of the root element in the XML file (enquiry) != the name of the class (SalesEnquiry). Rather than an [XmlElement] attribute on the class, we need an [XmlRoot] attribute instead. For completeness, if you want a property in your class to be ignored during serialization, you use the [XmlIgnore] attribute.
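A minimal sketch of what that conclusion amounts to, assuming a cut-down SalesEnquiry with only two of the fields from the sample. The property names, the IsDirty flag and the EnquiryLoader helper are hypothetical; the real DAL class is not shown in this thread.

```csharp
using System.IO;
using System.Xml.Serialization;

// [XmlRoot] maps the class to the <enquiry> root element, whose name differs from the class name.
[XmlRoot("enquiry")]
public class SalesEnquiry
{
    [XmlElement("enquiryno")]
    public int EnquiryNo { get; set; }

    [XmlElement("companyname")]
    public string CompanyName { get; set; }

    // Hypothetical property: [XmlIgnore] keeps it out of the XML in both directions.
    [XmlIgnore]
    public bool IsDirty { get; set; }
}

// Hypothetical helper wrapping the deserialization call from Update 1.
public static class EnquiryLoader
{
    public static SalesEnquiry Load(Stream stream)
    {
        var serializer = new XmlSerializer(typeof(SalesEnquiry));
        return (SalesEnquiry)serializer.Deserialize(stream);
    }
}
```

Without the [XmlRoot("enquiry")] line, the serializer looks for a root element named after the class, which is consistent with the "<enquiry xmlns=''> was not expected" error above.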
I've successfully serialized my object, and have now successfully taken the incoming XML and deserialized it into a SalesEnquiry object.
This approach is far easier than manually parsing the XML. OK, there has been a steep learning curve, but it was worth it.
Thanks!
If your XML uses a schema (i.e. you always know what elements appear, and where they appear in the tree), you could use XmlSerializer to create your objects. You'd just need some attributes on your classes to tell the serializer what XML elements or attributes they correspond to. Then you just load up your XML, create a new XmlSerializer with the type of the .NET object you want to create, and call the Deserialize method.
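For example, you have a class like this:
using System;
using System.Collections.Generic;
using System.Xml.Serialization;

[Serializable] // not required by XmlSerializer, but harmless
public class Person
{
    [XmlElement("PersonName")] // maps to <PersonName>
    public string Name { get; set; }

    [XmlElement("PersonAge")] // maps to <PersonAge>
    public int Age { get; set; }

    [XmlArrayItem("Child")] // each list entry becomes a <Child> inside <Children>
    public List<string> Children { get; set; }
}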
And input XML like this (saved in a file for this example):
<?xml version="1.0"?>
<Person>
    <PersonName>Bob</PersonName>
    <PersonAge>35</PersonAge>
    <Children>
        <Child>Chris</Child>
        <Child>Alice</Child>
    </Children>
</Person>
Then you create a Person instance like this:
Person person = null;
XmlSerializer serializer = new XmlSerializer(typeof(Person));
using (FileStream fs = new FileStream(GetFileName(), FileMode.Open))
{
    person = (Person)serializer.Deserialize(fs);
}
Update: Based on your last update, I would guess that either you need to specify an XmlRoot attribute on the class that's acting as your root element (i.e. SalesEnquiry), or the XmlSerializer might be a bit confused that you're referencing an empty namespace in your XML (xmlns='' doesn't seem right).
XmlSerializer does support arrays and lists... as long as the contained type is serializable.
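To illustrate with the vehicles list from the question, here is a hedged sketch in which a List of a custom type is filled by the serializer. The Vehicle, EnquiryWithVehicles and VehicleListSketch names and the choice of properties are assumptions made for the example.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical item type: a public class with a parameterless constructor is all that is needed here.
public class Vehicle
{
    [XmlElement("vehiclereg")]
    public string Reg { get; set; }

    [XmlElement("vehiclemake")]
    public string Make { get; set; }
}

[XmlRoot("enquiry")]
public class EnquiryWithVehicles
{
    // <vehicles> is the wrapper element, each <vehicle> inside it becomes one list entry.
    [XmlArray("vehicles")]
    [XmlArrayItem("vehicle")]
    public List<Vehicle> Vehicles { get; set; }
}

public static class VehicleListSketch
{
    public static EnquiryWithVehicles Load(Stream stream)
    {
        var serializer = new XmlSerializer(typeof(EnquiryWithVehicles));
        return (EnquiryWithVehicles)serializer.Deserialize(stream);
    }
}
```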
I have found Xsd2Code very helpful for this kind of thing: http://xsd2code.codeplex.com/
Basically, all you need to do is write an xsd file (an XML schema file) and specify a few command line switches. Xsd2Code will automatically generate a C# class file that contains all the classes and properties plus everything needed to handle the serialization. It's not a perfect solution as it doesn't support all aspects of XSD, but if your XML files are relatively simple collections of elements and attributes, it should be a nice short-cut for you.
There's another similar project on Codeplex called Linq to XSD (http://linqtoxsd.codeplex.com/), which was designed to enforce the entire XSD specification, but last time I checked, it was no longer being supported and not really ready for prime time. Thought it was worth a mention, though.