Iterate through all elements of XML file
I have an XML file like this:
<CustomerOrders>
<Customers>
<CustomerID>ALFKI</CustomerID>
<Orders>
<OrderID>10643</OrderID>
<CustomerID>ALFKI</CustomerID>
<OrderDate>1997-08-25</OrderDate>
</Orders>
<Orders>
<OrderID>10692</OrderID>
<CustomerID>ALFKI</CustomerID>
<OrderDate>1997-10-03</OrderDate>
</Orders>
<CompanyName>Alfreds Futterkiste</CompanyName>
</Customers>
<Customers>
<CustomerID>ANATR</CustomerID>
<Orders>
<OrderID>10308</OrderID>
<CustomerID>ANATR</CustomerID>
<OrderDate>1996-09-18</OrderDate>
</Orders>
<CompanyName>Ana Trujillo Emparedados y helados</CompanyName>
</Customers>
</CustomerOrders>
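I want to extract every element's text and convert it to lowercase. I know I can recursively walk all nodes and child nodes, but I am struggling to output the actual element values.
Right now my code only prints every tag with its attributes; I can also print a single element's text by indexing to it manually: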
import xml.etree.ElementTree as ET
tree = ET.parse('customer.xml')
root = tree.getroot()
# Walk every descendant element and show its tag and attributes
for descendant in root.findall(".//*"):
    print(descendant.tag, descendant.attrib)
print(root[0][1][0].text)  # prints 10643
What I want is to print every element of the file, with all of the values converted to lowercase.
Expected output:
CustomerID = alfki
OrderID = 10643
CustomerID = alfki
OrderDate = 1997-08-25
OrderID = 10692
CustomerID = alfki
OrderDate = 1997-10-03
CompanyName = alfreds futterkiste
and so on.
My attempt is as follows:
import lxml.etree as et
s="""
<CustomerOrders>
<Customers>
<CustomerID>ALFKI</CustomerID>
<Orders>
<OrderID>10643</OrderID>
<CustomerID>ALFKI</CustomerID>
<OrderDate>1997-08-25</OrderDate>
</Orders>
<Orders>
<OrderID>10692</OrderID>
<CustomerID>ALFKI</CustomerID>
<OrderDate>1997-10-03</OrderDate>
</Orders>
<CompanyName>Alfreds Futterkiste</CompanyName>
</Customers>
<Customers>
<CustomerID>ANATR</CustomerID>
<Orders>
<OrderID>10308</OrderID>
<CustomerID>ANATR</CustomerID>
<OrderDate>1996-09-18</OrderDate>
</Orders>
<CompanyName>Ana Trujillo Emparedados y helados</CompanyName>
</Customers>
</CustomerOrders>
"""
tree = et.fromstring(s)
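# Select the parent element of every text node and lowercase its leading text
for txt in tree.xpath('//text()/parent::*[1]'):
    txt.text = txt.text.lower()
# tostring() returns bytes, so decode before printing
print(et.tostring(tree, pretty_print=True).decode())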
It prints:
<CustomerOrders>
<Customers>
<CustomerID>alfki</CustomerID>
<Orders>
<OrderID>10643</OrderID>
<CustomerID>alfki</CustomerID>
<OrderDate>1997-08-25</OrderDate>
</Orders>
<Orders>
<OrderID>10692</OrderID>
<CustomerID>alfki</CustomerID>
<OrderDate>1997-10-03</OrderDate>
</Orders>
<CompanyName>alfreds futterkiste</CompanyName>
</Customers>
<Customers>
<CustomerID>anatr</CustomerID>
<Orders>
<OrderID>10308</OrderID>
<CustomerID>anatr</CustomerID>
<OrderDate>1996-09-18</OrderDate>
</Orders>
<CompanyName>ana trujillo emparedados y helados</CompanyName>
</Customers>
</CustomerOrders>
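The question asks for `Tag = value` lines rather than rewritten XML. A minimal sketch of that output using only the standard library follows; the helper name `lowercase_leaf_lines` and the shortened `SAMPLE` document are illustrative assumptions, not part of the original posts. It treats an element as a value holder when it has no child elements and non-blank text.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's document (assumption: same structure)
SAMPLE = """<CustomerOrders>
  <Customers>
    <CustomerID>ALFKI</CustomerID>
    <Orders>
      <OrderID>10643</OrderID>
      <CustomerID>ALFKI</CustomerID>
      <OrderDate>1997-08-25</OrderDate>
    </Orders>
    <CompanyName>Alfreds Futterkiste</CompanyName>
  </Customers>
</CustomerOrders>"""


def lowercase_leaf_lines(root):
    """Return 'Tag = value' lines for leaf elements, values lowercased."""
    lines = []
    # iter() visits the root and all descendants in document order
    for elem in root.iter():
        # Skip containers such as <Customers> and elements with blank text
        if len(elem) == 0 and elem.text and elem.text.strip():
            lines.append("%s = %s" % (elem.tag, elem.text.strip().lower()))
    return lines


root = ET.fromstring(SAMPLE)
lines = lowercase_leaf_lines(root)
print("\n".join(lines))
```

For a file on disk, `ET.parse('customer.xml').getroot()` could be passed to the same helper.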
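Consider using XSLT with its translate() function. For background, XSLT is a special-purpose programming language for transforming, styling, reformatting, and restructuring XML documents. With it you can avoid looping recursively over every node and text value in Python.
XSLT script (save as a .xsl or .xslt file to be loaded in Python)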
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:variable name="lowercase" select="'abcdefghijklmnopqrstuvwxyz'" />
<xsl:variable name="uppercase" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'" />
<!-- Identity Transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()">
<xsl:value-of select="translate(., $uppercase, $lowercase)"/>
</xsl:template>
</xsl:stylesheet>
Python script
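import os
import lxml.etree as ET
# `cd` was undefined in the original snippet; assumed here to be this script's directory
cd = os.path.dirname(os.path.abspath(__file__))
# Parse the input document and the stylesheet
dom = ET.parse('customer.xml')
xslt = ET.parse('XSLTscript.xsl')
# Compile the stylesheet and apply it to the document
transform = ET.XSLT(xslt)
newdom = transform(dom)
# Serialize to bytes, print a readable copy, and write the result to disk
tree_out = ET.tostring(newdom, encoding='UTF-8', pretty_print=True, xml_declaration=True)
print(tree_out.decode('utf-8'))
with open(os.path.join(cd, 'Output.xml'), 'wb') as xmlfile:
    xmlfile.write(tree_out)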
Output
<?xml version='1.0' encoding='UTF-8'?>
<CustomerOrders>
<Customers>
<CustomerID>alfki</CustomerID>
<Orders>
<OrderID>10643</OrderID>
<CustomerID>alfki</CustomerID>
<OrderDate>1997-08-25</OrderDate>
</Orders>
<Orders>
<OrderID>10692</OrderID>
<CustomerID>alfki</CustomerID>
<OrderDate>1997-10-03</OrderDate>
</Orders>
<CompanyName>alfreds futterkiste</CompanyName>
</Customers>
<Customers>
<CustomerID>anatr</CustomerID>
<Orders>
<OrderID>10308</OrderID>
<CustomerID>anatr</CustomerID>
<OrderDate>1996-09-18</OrderDate>
</Orders>
<CompanyName>ana trujillo emparedados y helados</CompanyName>
</Customers>
</CustomerOrders>
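One caveat about the stylesheet above: XSLT 1.0 `translate()` maps characters positionally, so only the letters listed in `$uppercase` are converted. The sketch below mimics that behavior in Python with `str.translate` as an analogy (it does not run the XSLT itself); the sample string and the name `ascii_lower` are made up for illustration. It shows that an accented capital passes through unchanged, whereas `str.lower()` handles it.

```python
# Same alphabets as the stylesheet's $uppercase and $lowercase variables
UPPERCASE = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LOWERCASE = "abcdefghijklmnopqrstuvwxyz"

# Positional character map, analogous to translate(., $uppercase, $lowercase)
ASCII_TABLE = str.maketrans(UPPERCASE, LOWERCASE)


def ascii_lower(text):
    """Lowercase only the 26 ASCII capitals, leaving other characters as-is."""
    return text.translate(ASCII_TABLE)


name = "ÉCOLE Ana Trujillo"      # hypothetical value with a non-ASCII capital
ascii_only = ascii_lower(name)   # 'É' is not in the table, so it is kept
full = name.lower()              # Unicode-aware lowercasing

print(ascii_only)
print(full)
```

If the data may contain accented capitals, extending both variables with the extra character pairs, or lowercasing in Python instead, are possible ways around this.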