Python XML: multiple child nodes into a data frame

Question

Hi guys, I have the following XML:

<Batch>
  <Date></Date>
  <Customer>
    <CustType>1</CustType>
    <CustomerId>123</CustomerId>
    <Address>1 abc st</Address>
    <Letters>
      <Letter>
        <LetterId>123456</LetterId>
        <LetterDate>1/1/2000</LetterDate>
      </Letter>
      <LetterId>98765</LetterId>
      <Letter>
        <LetterId>5675</LetterId>
        <LetterDate>1/1/2010</LetterDate>
      </Letter>
    </Letters>
  </Customer>
</Batch>

As you can see, each customer has multiple letters, and I need to get all of this into a data frame like the one below.

CustomerID | LetterId
123        | 123456
123        | 5675

I have tried Python's ElementTree library as follows.

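import xml.etree.ElementTree as ETree

# filename is the path to the XML file shown above
doc = ETree.parse(filename)

list2 = []
for item in doc.iterfind('Customer'):
    dict1 = {}
    # Tag names are case-sensitive: the element is <Address>, not <address>
    dict1['address'] = item.findtext('Address')
    list2.append(dict1)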

But when I try to get each of the letters, I can't join each letter back to the data on its parent node.

I can get either the customer data or all the letters, but not both.

I need a duplicate record of each customer for every letter that customer has.

Thanks.

Answer 1

I have actually worked this out.

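from lxml import etree

doc = etree.parse(filename)
root = doc.getroot()

list1 = []
# <Customer> sits under the root <Batch>, so the path is relative to root;
# an absolute '/Customer/...' path would match nothing.
for record in root.xpath('Customer/Letters/Letter'):
    # Walk back up from each letter to the customer that owns it
    for ancestor in record.iterancestors('Customer'):
        dict1 = {}
        # Tag names are case-sensitive: the XML uses <LetterId>
        dict1['LetterID'] = record.findtext('LetterId')
        dict1['CustomerID'] = ancestor.findtext('CustomerId')
        list1.append(dict1)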
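
For anyone without lxml, the same join can probably be done top-down with the standard library: loop over each Customer first, then over its Letter children, copying the customer fields into every row. This is only a minimal sketch; the SAMPLE_XML string and the letters_to_rows helper are names made up for illustration, and the sample values simply mirror the XML in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample mirroring the structure in the question.
SAMPLE_XML = """
<Batch>
  <Customer>
    <CustomerId>123</CustomerId>
    <Address>1 abc st</Address>
    <Letters>
      <Letter><LetterId>123456</LetterId><LetterDate>1/1/2000</LetterDate></Letter>
      <Letter><LetterId>5675</LetterId><LetterDate>1/1/2010</LetterDate></Letter>
    </Letters>
  </Customer>
</Batch>
"""


def letters_to_rows(xml_text):
    """Return one dict per <Letter>, repeating the parent customer's fields."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for customer in root.iterfind("Customer"):
        # Read the parent data once, then reuse it for every child letter.
        customer_id = customer.findtext("CustomerId")
        for letter in customer.iterfind("Letters/Letter"):
            rows.append({
                "CustomerID": customer_id,
                "LetterId": letter.findtext("LetterId"),
                "LetterDate": letter.findtext("LetterDate"),
            })
    return rows


rows = letters_to_rows(SAMPLE_XML)
for row in rows:
    print(row)
```

If pandas is installed, passing the list to pandas.DataFrame(rows) should give one data frame row per letter, with the customer ID repeated as in the table in the question.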

Hopefully this helps somebody else.

Answer 1
0 2016-07-05 03:36:10
