Parsing XML in Python with xml.dom.minidom: extracting items from a list

Question

I have a lengthy XML document, actually eBay listings retrieved through the eBay API, and I am trying to extract the following structure from its DOM:

I am only including the segment that I am having trouble with. Please let me know if you need to see the entire file; I could upload it somewhere or attach it as a picture.

<ItemSpecifics>
<NameValueList>
<Name>Room</Name>
<Value>Living Room</Value>
</NameValueList>
<NameValueList>
<Name>Type</Name>
<Value>Sofa Set</Value>
</NameValueList>
<NameValueList>...</NameValueList>
<NameValueList>
<Name>Upholstery Fabric</Name>
<Value>Microfiber</Value>
</NameValueList>
<NameValueList>
<Name>Color</Name>
<Value>Beiges</Value>
</NameValueList>
<NameValueList>
<Name>Style</Name>
<Value>Contemporary</Value>
</NameValueList>
<NameValueList>
<Name>MPN</Name>
<Value>F7615, F7616, F7617, F7618, F7619, F7620</Value>
</NameValueList>
</ItemSpecifics>

Here is the DOM structure for another eBay item:

<ItemSpecifics>
<NameValueList>
<Name>Brand</Name>
<Value>Nikon</Value>
</NameValueList>
<NameValueList>
<Name>Model</Name>
<Value>D3100</Value>
</NameValueList>
<NameValueList>
<Name>MPN</Name>
<Value>9798</Value>
</NameValueList>
<NameValueList>
<Name>Type</Name>
<Value>Digital SLR</Value>
</NameValueList>
<NameValueList>
<Name>Megapixels</Name>
<Value>14.2 MP</Value>
</NameValueList>
<NameValueList>
<Name>Optical Zoom</Name>
<Value>3.1x</Value>
</NameValueList>
<NameValueList>
<Name>Screen Size</Name>
<Value>3"</Value>
</NameValueList>
<NameValueList>
<Name>Color</Name>
<Value>Black</Value>
</NameValueList>
</ItemSpecifics>

But when I try to extract the elements above, I end up getting the following error:

   attID=att.attributes.getNamedItem('Name').nodeValue
AttributeError: 'NoneType' object has no attribute 'nodeValue'

This is what I get right after I parse the response:

[<DOM Element: NameValueList at 0x103398878>, <DOM Element: NameValueList at 0x103398ab8>, <DOM Element: NameValueList at 0x103398cf8>, <DOM Element: NameValueList at 0x103398f38>, <DOM Element: NameValueList at 0x1033b31b8>, <DOM Element: NameValueList at 0x1033b33f8>, <DOM Element: NameValueList at 0x1033b3638>, <DOM Element: NameValueList at 0x1033b3878>]

And this is what I get inside my for loop before getting the error:

<DOM Element: NameValueList at 0x103398878>

Here is my code:

  result = {}
  attributeSet=response.getElementsByTagName('NameValueList')
  print attributeSet
  attributes={}
  for att in attributeSet:
    print att
    attID=att.attributes.getNamedItem('Name').nodeValue
    attValue=getSingleValue(att,'Value')
    attributes[attID]=attValue
  result['attributes']=attributes
  return result

This is my XML request method:

def sendRequest(apicall,xmlparameters):
  connection = httplib.HTTPSConnection(serverUrl)
  connection.request("POST", '/ws/api.dll', xmlparameters, getHeaders(apicall))
  response = connection.getresponse()
  if response.status != 200:
    print "Error sending request:" + response.reason
  else: 
    data = response.read()
    connection.close()
  return data

Answer 1

attributes.getNamedItem() gives you the attributes of an element, not its children, and a <NameValueList> element has no Name attribute, only <Name> child elements. That is why the call returns None and the .nodeValue lookup fails. You'd have to loop over the elements contained in <NameValueList>, or use .getElementsByTagName('Name') and .getElementsByTagName('Value') to get the individual sub-nodes.
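If you want to stay with minidom, the second option could look roughly like the sketch below. The helper names child_text and extract_attributes are made up for illustration (the question's getSingleValue is not shown, so it is not used here), and the sample is a shortened copy of the first fragment from the question.

```python
from xml.dom import minidom

# Shortened copy of the first <ItemSpecifics> fragment from the question.
SAMPLE = """<ItemSpecifics>
<NameValueList>
<Name>Room</Name>
<Value>Living Room</Value>
</NameValueList>
<NameValueList>
<Name>Type</Name>
<Value>Sofa Set</Value>
</NameValueList>
</ItemSpecifics>"""


def child_text(parent, tag):
    """Hypothetical helper: text of the first <tag> descendant, or None."""
    nodes = parent.getElementsByTagName(tag)
    if not nodes or nodes[0].firstChild is None:
        return None
    return nodes[0].firstChild.nodeValue


def extract_attributes(dom):
    """Hypothetical helper: map each <Name> text to its <Value> text."""
    attributes = {}
    for nvl in dom.getElementsByTagName('NameValueList'):
        # <Name> is a child element here, not an XML attribute, so it is
        # looked up with getElementsByTagName rather than getNamedItem.
        name = child_text(nvl, 'Name')
        if name is not None:
            attributes[name] = child_text(nvl, 'Value')
    return attributes


attributes = extract_attributes(minidom.parseString(SAMPLE))
print(attributes)
```

The None checks are there because a <NameValueList> might lack a <Name> or have an empty <Value>; whether that ever happens in real responses is not something the question shows.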

Do yourself a big favour, though, and use the ElementTree API instead; that API is far more Pythonic and easier to use than the XML DOM API:

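from xml.etree import ElementTree as ET

# data is the XML string returned by sendRequest()
etree = ET.fromstring(data)
results = {}
# './/' searches the whole tree, so nested <NameValueList> elements are found too
for nvl in etree.findall('.//NameValueList'):
    name = nvl.find('Name').text
    value = nvl.find('Value').text
    results[name] = value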
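
One caveat with ElementTree: a plain tag name such as 'NameValueList' only matches elements that have no XML namespace. If the full response declares a default namespace on its root element (an assumption, since the question shows only a fragment), every tag is stored as '{namespace-uri}NameValueList' and the lookup silently finds nothing. The sketch below works around that by comparing local names only. The wrapper elements, the namespace URI, and the helper names local_name and item_specifics are all invented for illustration.

```python
from xml.etree import ElementTree as ET

# Assumption: <Response>, <Item> and the namespace URI are placeholders;
# check the root element of your own response for the real xmlns value.
SAMPLE = """<Response xmlns="urn:example:api">
<Item>
<ItemSpecifics>
<NameValueList><Name>Brand</Name><Value>Nikon</Value></NameValueList>
<NameValueList><Name>Model</Name><Value>D3100</Value></NameValueList>
</ItemSpecifics>
</Item>
</Response>"""


def local_name(tag):
    """Strip a leading '{namespace}' from an ElementTree tag, if present."""
    return tag.rsplit('}', 1)[-1]


def item_specifics(root):
    """Hypothetical helper: collect Name/Value pairs regardless of namespace."""
    results = {}
    for element in root.iter():
        if local_name(element.tag) != 'NameValueList':
            continue
        # Index the direct children by local name, e.g. {'Name': ..., 'Value': ...}.
        fields = {local_name(child.tag): child.text for child in element}
        if 'Name' in fields:
            results[fields['Name']] = fields.get('Value')
    return results


results = item_specifics(ET.fromstring(SAMPLE))
print(results)
```

If you know the namespace URI up front, spelling it out in the path, as in findall('.//{urn:example:api}NameValueList'), is an alternative; stripping it as above simply avoids hard-coding a value that this sketch can only guess at.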


Answer 1 (score 3, accepted), 2012-08-21 18:14:52
