Deserializing select elements from XML with multiple namespaces

I am trying to deserialize an XML document that contains mostly extraneous information. The classes I have created reflect only the information I am trying to pick out of the document.

When I actually do the deserialization, it appears to succeed, but the CraigslistChannel and CraigslistItem variables always end up being null, even though the document clearly contains those elements.

The XML document that I am trying to deserialize can be found here: https://limaohio.craigslist.org/search/ctd?format=rss&s=25

It looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns="http://purl.org/rss/1.0/"
 xmlns:enc="http://purl.oclc.org/net/rss_2.0/enc#"
 xmlns:ev="http://purl.org/rss/1.0/modules/event/"
 xmlns:content="http://purl.org/rss/1.0/modules/content/"
 xmlns:dcterms="http://purl.org/dc/terms/"
 xmlns:syn="http://purl.org/rss/1.0/modules/syndication/"
 xmlns:dc="http://purl.org/dc/elements/1.1/"
 xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/"
 xmlns:admin="http://webns.net/mvcb/"
>

<channel rdf:about="https://sfbay.craigslist.org/search/sfc/ctd?format=rss">
<title>craigslist SF bay area | cars &#x26; trucks - by dealer search </title>
<link>https://sfbay.craigslist.org/search/sfc/ctd</link>
<description></description>
<dc:language>en-us</dc:language>
<dc:rights>copyright 2016 craiglist</dc:rights>
<dc:publisher>robot@craigslist.org</dc:publisher>
<dc:creator>robot@craigslist.org</dc:creator>
<dc:source>https://sfbay.craigslist.org/search/sfc/ctd?format=rss</dc:source>
<dc:title>craigslist SF bay area | cars &#x26; trucks - by dealer search </dc:title>
<dc:type>Collection</dc:type>
<syn:updateBase>2016-08-29T07:55:51-07:00</syn:updateBase>
<syn:updateFrequency>1</syn:updateFrequency>
<syn:updatePeriod>hourly</syn:updatePeriod>
<items>
 <rdf:Seq>
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755852598.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755847263.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755845763.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755841763.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755839975.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755836851.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755833170.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755831622.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755807313.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755759606.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755718561.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755713440.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755710804.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755708355.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755706051.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755706053.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755689225.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755671023.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755669710.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755668546.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755667302.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755666084.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755664804.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755663619.html" />
  <rdf:li rdf:resource="http://sfbay.craigslist.org/sfc/ctd/5755662504.html" />
 </rdf:Seq>
</items>
</channel>
<item rdf:about="http://sfbay.craigslist.org/sfc/ctd/5755852598.html">
<title><![CDATA[2013 *Ford* *Mustang* *2dr Coupe* -$26,899 (Certified Pre-Owned, Financing Available) &#x0024;26899]]></title>
<link>http://sfbay.craigslist.org/sfc/ctd/5755852598.html</link>
<description><![CDATA[*2013* *Ford* *Mustang* *2dr Coupe* - $26,899* (2013* *Ford* *Mustang* *2dr Coupe*) 
Phone: 
 <a href="/fb/sfo/ctd/5755852598" class="showcontact" title="click to show contact info">show contact info</a>

Price: $26,899 
Vehicle Information Overview 
Year: 2013 
Make: Ford 
Model: Mustang 
Trim: 2dr Coupe 
VIN: 1ZVBP8CF0D5271908 
Mileage: 15500 
B [...]]]></description>
<dc:date>2016-08-29T06:53:28-07:00</dc:date>
<dc:language>en-us</dc:language>
<dc:rights>copyright 2016 craiglist</dc:rights>
<dc:source>http://sfbay.craigslist.org/sfc/ctd/5755852598.html</dc:source>
<dc:title><![CDATA[2013 *Ford* *Mustang* *2dr Coupe* -$26,899 (Certified Pre-Owned, Financing Available) &#x0024;26899]]></dc:title>
<dc:type>text</dc:type>
<enc:enclosure resource="https://images.craigslist.org/00404_h6HxiFEMymk_300x300.jpg" type="image/jpeg"/>
<dcterms:issued>2016-08-29T06:53:28-07:00</dcterms:issued>
</item>
</rdf:RDF>

Here is my code:

String requestURL = "https://limaohio.craigslist.org/search/ctd?format=rss&s=25";
IHttpWebResponse response = (new HttpRequesterWrapper(4000)).GetWebResponse(requestURL);
if (!String.IsNullOrEmpty(response.HTML))
{
    XmlSerializer serializer = new XmlSerializer(typeof(CraigslistRDF));
    using (TextReader sr = new StringReader(response.HTML))
    {
        CraigslistRDF rss = (CraigslistRDF)serializer.Deserialize(sr);
        if (rss != null && rss.Channel != null)
        {
            var a = 1;
        }
    }
}

And most importantly, here are the classes that I am trying to deserialize into:

[XmlRoot(ElementName = "RDF", Namespace = "http://www.w3.org/1999/02/22-rdf-syntax-ns#")]
public class CraigslistRDF
{
    [XmlElement("channel")]
    public CraigslistChannel Channel;

    [XmlElement("item")]
    public CraigslistItem[] Items;
}

[XmlRoot("channel")]
public class CraigslistChannel
{
    [XmlAttribute(AttributeName = "about", Namespace = "http://www.w3.org/1999/02/22-rdf-syntax-ns#")]
    public String About;

    [XmlElement("title")]
    public String Title;

    [XmlElement("link")]
    public String Link;

    [XmlElement("description")]
    public String Description;
}

[XmlRoot("item")]
public class CraigslistItem
{
    [XmlAttribute(AttributeName = "about", Namespace = "http://www.w3.org/1999/02/22-rdf-syntax-ns#")]
    public String About;

    [XmlElement("title")]
    public String Title;

    [XmlElement("link")]
    public String Link;

    [XmlElement("description")]
    public String Description;

    [XmlElement(ElementName = "source", Namespace = "http://purl.org/dc/elements/1.1/")]
    public String Source;
}

Does anyone have any insight into why the CraigslistChannel and CraigslistItem fields always end up being null?

Any help would be greatly appreciated.

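Because your channel and item elements do have a namespace. The default namespace is declared on the root element as http://purl.org/rss/1.0/, so unprefixed elements such as channel and item belong to it. Without an explicit Namespace on the XmlElement attributes, XmlSerializer looks for those elements in the namespace given in XmlRoot (the rdf one), finds no match, and leaves the fields null. Change your attributes to match the default namespace: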

[XmlRoot(ElementName = "RDF", Namespace = "http://www.w3.org/1999/02/22-rdf-syntax-ns#")]
public class CraigslistRDF
{
    [XmlElement("channel", Namespace = "http://purl.org/rss/1.0/")]
    public CraigslistChannel Channel;

    [XmlElement("item", Namespace = "http://purl.org/rss/1.0/")]
    public CraigslistItem[] Items;
}

As an aside, the [XmlRoot("item")] and [XmlRoot("channel")] attributes aren't used and can be removed (these classes aren't used as the root).
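
To check the namespace fix end to end without hitting the live feed, here is a minimal sketch that deserializes a trimmed-down inline copy of the document. The `Ns` and `Demo` classes, the `Parse` helper, and the shortened sample XML are assumptions made for illustration; the model classes mirror the ones in the question, with the corrected attributes applied.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Namespace URIs taken from the feed's root element.
public static class Ns
{
    public const string Rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    public const string Rss = "http://purl.org/rss/1.0/";
    public const string Dc = "http://purl.org/dc/elements/1.1/";
}

[XmlRoot(ElementName = "RDF", Namespace = Ns.Rdf)]
public class CraigslistRDF
{
    // channel and item are unprefixed, so they live in the default
    // (RSS 1.0) namespace rather than in the root's rdf namespace.
    [XmlElement("channel", Namespace = Ns.Rss)]
    public CraigslistChannel Channel;

    [XmlElement("item", Namespace = Ns.Rss)]
    public CraigslistItem[] Items;
}

public class CraigslistChannel
{
    [XmlAttribute(AttributeName = "about", Namespace = Ns.Rdf)]
    public String About;

    [XmlElement("title")]
    public String Title;

    [XmlElement("link")]
    public String Link;
}

public class CraigslistItem
{
    [XmlAttribute(AttributeName = "about", Namespace = Ns.Rdf)]
    public String About;

    [XmlElement("title")]
    public String Title;

    [XmlElement(ElementName = "source", Namespace = Ns.Dc)]
    public String Source;
}

public static class Demo
{
    // Hypothetical, shortened stand-in for the real feed.
    public const string SampleXml = @"<rdf:RDF
  xmlns:rdf=""http://www.w3.org/1999/02/22-rdf-syntax-ns#""
  xmlns=""http://purl.org/rss/1.0/""
  xmlns:dc=""http://purl.org/dc/elements/1.1/"">
  <channel rdf:about=""https://sfbay.craigslist.org/search/sfc/ctd?format=rss"">
    <title>craigslist SF bay area</title>
    <link>https://sfbay.craigslist.org/search/sfc/ctd</link>
    <dc:language>en-us</dc:language>
  </channel>
  <item rdf:about=""http://sfbay.craigslist.org/sfc/ctd/5755852598.html"">
    <title>2013 Ford Mustang</title>
    <dc:source>http://sfbay.craigslist.org/sfc/ctd/5755852598.html</dc:source>
  </item>
</rdf:RDF>";

    // Deserializes an RDF/RSS string into the model classes above.
    public static CraigslistRDF Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(CraigslistRDF));
        using (var reader = new StringReader(xml))
        {
            return (CraigslistRDF)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        CraigslistRDF rss = Parse(SampleXml);
        Console.WriteLine(rss.Channel.Title);
        Console.WriteLine(rss.Items.Length);
    }
}
```

Elements that have no matching member, such as dc:language here, are simply skipped by XmlSerializer, which is why the original code ran without an exception while still leaving Channel and Items null.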

Try LINQ to XML:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace ConsoleApplication1
{
    class Program
    {
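        // Path to a local copy of the RSS feed.
        const string FILENAME = @"c:\temp\test.xml";
        static void Main(string[] args)
        {
            XDocument doc = XDocument.Load(FILENAME);
            // doc.Root is the <rdf:RDF> element; unlike FirstNode it skips any leading comments.
            XElement rdf = doc.Root;
            // Resolve the namespaces from the prefixes declared on the root element.
            XNamespace rdfNs = rdf.GetNamespaceOfPrefix("rdf");
            XNamespace dcNs = rdf.GetNamespaceOfPrefix("dc");
            XNamespace defaultNs = rdf.GetDefaultNamespace();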

            CraigslistRDF craigslistRDF = rdf.Elements(defaultNs + "channel").Select(x => new CraigslistRDF
            {
                Channel = new CraigslistChannel() {
                    About = (string)x.Attribute(rdfNs + "about"),
                    Description = (string)x.Elements(defaultNs + "description").FirstOrDefault(),
                    Link =  (string)x.Element(defaultNs + "link"),
                    Title =  (string)x.Element(defaultNs + "title")
                },

            }).FirstOrDefault();

            craigslistRDF.Items = rdf.Elements(defaultNs + "item").Select(x => new CraigslistItem()
            {
                About = (string)x.Attribute(rdfNs + "about"),
                Description = (string)x.Elements(defaultNs + "description").FirstOrDefault(),
                Link = (string)x.Element(defaultNs + "link"),
                Source = (string)x.Element(dcNs + "source"),
                Title = (string)x.Element(defaultNs + "title")
            }).ToArray();

        }
    }

    public class CraigslistRDF
    {
        public CraigslistChannel Channel { get; set; }
        public CraigslistItem[] Items { get; set; }
    }

    public class CraigslistChannel
    {
        public String About { get; set; }
        public String Title { get; set; }
        public String Link { get; set; }
        public String Description { get; set; }
    }

    public class CraigslistItem
    {
        public String About { get; set; }
        public String Title { get; set; }
        public String Link { get; set; }
        public String Description { get; set; }
        public String Source { get; set; }
    }
}
