
Reading XML with pandas in Python

I have an XML file. I am trying to read it in the usual way, as shown below:

from xml.dom import minidom

def xmlfilereadread(self, path):
    # Parse the file and read the declared receiver count from sf:ReceiverSet
    doc = minidom.parse(path)
    Account = doc.getElementsByTagName("sf:ReceiverSet")[0]
    num = Account.getAttribute('totalNo')
    aList = []
    for i in range(int(num)):
        print(i)
        AccountReference = doc.getElementsByTagName("sf:Receiver")[i]

But I need to use pandas instead of this code. How can I read the data? My sample XML is:

<?xml version="1.0" encoding="UTF-8"?>
<sf:IFile xmlns:sf="http://www.canadapost.ca/smartflow" sequenceNo="10">   
<sf:ReceiverSet documentTypes="TAXBILL" organization="lincolntax" totalNo="3">  
<sf:Receiver sequenceNo="1" correlationID="1114567890123456789">   
<sf:AccountReference>11145678901234567891111</sf:AccountReference>   
<sf:SubscriptionAuth> <sf:ParamSet>   
<sf:Param name="auth1">1114567890123456789</sf:Param>   
<sf:Param name="auth2">CARTER, JOE</sf:Param> </sf:ParamSet>   
</sf:SubscriptionAuth>  
</sf:Receiver> <sf:Receiver sequenceNo="2" correlationID="2224567890123456789">   
<sf:AccountReference>22245678901234567892222</sf:AccountReference> <sf:SubscriptionAuth> <sf:ParamSet>  
<sf:Param name="auth1">2224567890123456789</sf:Param>   
<sf:Param name="auth2">DOE, JANE</sf:Param> </sf:ParamSet>   
</sf:SubscriptionAuth> </sf:Receiver> <sf:Receiver sequenceNo="3" correlationID="3334567890123456789">   
<sf:AccountReference>33345678901234567893333</sf:AccountReference> <sf:SubscriptionAuth> <sf:ParamSet>  
<sf:Param name="auth1">3334567890123456789</sf:Param> <sf:Param name="auth2">SOZE, KEYSER</sf:Param>  
</sf:ParamSet> </sf:SubscriptionAuth> </sf:Receiver> </sf:ReceiverSet> </sf:IFile>

XML is an inherently hierarchical data format, and the most natural way to represent it is with a tree. ET (xml.etree.ElementTree) has two classes for this purpose: ElementTree represents the whole XML document as a tree, and Element represents a single node in this tree. Interactions with the whole document (reading from and writing to files) are usually done on the ElementTree level. Interactions with a single XML element and its sub-elements are done on the Element level.

import xml.etree.ElementTree as ET
tree = ET.parse('country_data.xml')
root = tree.getroot()

Or you can use lxml:

from lxml import etree

# etree.parse returns an ElementTree for the whole document, not the root element
tree = etree.parse(r'local-path-to-.xml')
print(etree.tostring(tree))
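
Neither snippet above produces a pandas DataFrame yet. Here is a minimal sketch of that last step using the standard-library ElementTree. It assumes one row per sf:Receiver is wanted, with the attributes, the AccountReference text and each sf:Param (keyed by its name attribute) as columns. The names NS, SAMPLE, receivers_to_rows and to_dataframe are made up for illustration, and SAMPLE is a shortened copy of the question's XML so the sketch runs on its own.

```python
import xml.etree.ElementTree as ET

# Prefix map for namespaced lookups; the URI comes from the sample's xmlns:sf.
NS = {"sf": "http://www.canadapost.ca/smartflow"}

# Shortened copy of the sample document (two receivers), for illustration only.
SAMPLE = """<sf:IFile xmlns:sf="http://www.canadapost.ca/smartflow" sequenceNo="10">
<sf:ReceiverSet documentTypes="TAXBILL" organization="lincolntax" totalNo="2">
<sf:Receiver sequenceNo="1" correlationID="1114567890123456789">
<sf:AccountReference>11145678901234567891111</sf:AccountReference>
<sf:SubscriptionAuth><sf:ParamSet>
<sf:Param name="auth1">1114567890123456789</sf:Param>
<sf:Param name="auth2">CARTER, JOE</sf:Param>
</sf:ParamSet></sf:SubscriptionAuth>
</sf:Receiver>
<sf:Receiver sequenceNo="2" correlationID="2224567890123456789">
<sf:AccountReference>22245678901234567892222</sf:AccountReference>
<sf:SubscriptionAuth><sf:ParamSet>
<sf:Param name="auth1">2224567890123456789</sf:Param>
<sf:Param name="auth2">DOE, JANE</sf:Param>
</sf:ParamSet></sf:SubscriptionAuth>
</sf:Receiver>
</sf:ReceiverSet>
</sf:IFile>"""


def receivers_to_rows(root):
    """Return one flat dict per sf:Receiver element under root."""
    rows = []
    for receiver in root.iterfind(".//sf:Receiver", NS):
        # Start with the element's attributes (sequenceNo, correlationID).
        row = dict(receiver.attrib)
        row["AccountReference"] = receiver.findtext("sf:AccountReference", namespaces=NS)
        # Each sf:Param becomes a column named after its "name" attribute.
        for param in receiver.iterfind(".//sf:Param", NS):
            row[param.get("name")] = param.text
        rows.append(row)
    return rows


def to_dataframe(rows):
    """Build a DataFrame from the row dicts (pandas is imported only here)."""
    import pandas as pd
    return pd.DataFrame(rows)


# For a file on disk, ET.parse(path).getroot() would replace ET.fromstring.
root = ET.fromstring(SAMPLE)
rows = receivers_to_rows(root)
```

Calling to_dataframe(rows) should then give one row per receiver. All values are kept as strings, which seems the safer choice for the long account and correlation numbers. Recent pandas versions (1.3 and later) also offer pandas.read_xml, which accepts xpath and namespaces arguments, but it flattens only one level, so the nested sf:Param elements would probably still need handling like the above.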
