Parsing XML attributes with nested namespaces with lxml

I want to parse the country code attributes in RankByCountry. How can I do it?

That is, print the list ['GB', 'US', 'O'].

<aws:UrlInfoResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/"><aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11"><aws:OperationRequest><aws:RequestId>122bfdc6-ae8e-d2a2-580e-3841ab33b966</aws:RequestId></aws:OperationRequest><aws:UrlInfoResult><aws:Alexa>
 <aws:TrafficData>
  <aws:DataUrl type="canonical">androidjones.com/</aws:DataUrl>
  <aws:RankByCountry>
    <aws:Country Code="GB">
      <aws:Rank>80725</aws:Rank>
      <aws:Contribution>
        <aws:PageViews>30.6%</aws:PageViews>
        <aws:Users>41.3%</aws:Users>
      </aws:Contribution>
    </aws:Country>
    <aws:Country Code="US">
      <aws:Rank>354356</aws:Rank>
      <aws:Contribution>
        <aws:PageViews>39.1%</aws:PageViews>
        <aws:Users>28.9%</aws:Users>
      </aws:Contribution>
    </aws:Country>
    <aws:Country Code="O">
      <aws:Rank/>
      <aws:Contribution>
        <aws:PageViews>30.2%</aws:PageViews>
        <aws:Users>29.8%</aws:Users>
      </aws:Contribution>
    </aws:Country>
  </aws:RankByCountry>
 </aws:TrafficData>
</aws:Alexa></aws:UrlInfoResult><aws:ResponseStatus xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/"><aws:StatusCode>Success</aws:StatusCode></aws:ResponseStatus></aws:Response></aws:UrlInfoResponse>

I already tried:

namespaces = {"aws": "http://awis.amazonaws.com/doc/2005-07-11"}
RankByCountry = tree.xpath("//aws:Country/Code", namespaces=namespaces)

But no luck.

I also tried:

for country in tree.xpath('//Country'):
    for attrib in country.attrib:
        print('@' + attrib + '=' + country.attrib[attrib])

The document looks odd because it binds the aws namespace prefix twice. The inner declaration on <aws:Response> overrides the outer one for everything nested inside it, so the elements you are after belong to the more specific namespace, http://awis.amazonaws.com/doc/2005-07-11. Your namespaces mapping already uses that URI, so that part is correct.
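To make the prefix rebinding visible, here is a minimal sketch. It assumes a trimmed-down, hypothetical version of the response (same two namespace URIs, most elements dropped) and uses the standard library's xml.etree.ElementTree so it runs without extra packages. lxml reports element tags in the same {uri}name form.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened version of the response: the "aws" prefix is
# declared on the root and declared again, with a different URI, on Response.
XML = (
    '<aws:UrlInfoResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">'
    '<aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11">'
    '<aws:Country Code="GB"/>'
    '</aws:Response>'
    '</aws:UrlInfoResponse>'
)

root = ET.fromstring(XML)
country = root[0][0]  # UrlInfoResponse -> Response -> Country

# Tags come back as {namespace-uri}localname, so the two bindings show up directly.
outer_tag = root.tag
inner_tag = country.tag
print(outer_tag)
print(inner_tag)
```

The root element is reported under the alexa.amazonaws.com URI, while Country is reported under the awis.amazonaws.com URI, which is why the mapping passed to the query has to use the latter.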

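The problem is the XPath expression itself. Code is an attribute, not a child element, so it has to be selected with @Code, whereas aws:Country/Code looks for a child element named Code. It should look like this: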

for country in tree.xpath('//aws:RankByCountry/aws:Country/@Code', namespaces=namespaces):
    print(country) 

Note that <aws:RankByCountry> has no Code attribute, but <aws:Country> does.
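As an alternative to selecting @Code directly, one can select the <aws:Country> elements and read the attribute with get(). The sketch below assumes a shortened, hypothetical copy of the response. The try/except import is only there so the example also runs where lxml is not installed, since findall() and get() are available in both libraries.

```python
try:
    from lxml import etree  # the library used in the question
except ImportError:
    # Fallback so the sketch still runs without lxml.
    import xml.etree.ElementTree as etree

# Shortened, hypothetical copy of the response from the question.
XML = """<aws:UrlInfoResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
<aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11">
<aws:UrlInfoResult><aws:Alexa><aws:TrafficData>
  <aws:RankByCountry>
    <aws:Country Code="GB"><aws:Rank>80725</aws:Rank></aws:Country>
    <aws:Country Code="US"><aws:Rank>354356</aws:Rank></aws:Country>
    <aws:Country Code="O"><aws:Rank/></aws:Country>
  </aws:RankByCountry>
</aws:TrafficData></aws:Alexa></aws:UrlInfoResult>
</aws:Response>
</aws:UrlInfoResponse>"""

# Map the prefix to the inner (more specific) namespace URI.
namespaces = {"aws": "http://awis.amazonaws.com/doc/2005-07-11"}
root = etree.fromstring(XML)

# Select the Country elements, then read the Code attribute from each one.
countries = root.findall(".//aws:RankByCountry/aws:Country", namespaces)
codes = [country.get("Code") for country in countries]
print(codes)  # ['GB', 'US', 'O']
```

Keeping the elements around is handy if the rank or contribution values are needed alongside each code.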
