Reading and parsing nested XML tags and saving to CSV using Python: unable to read the nested tag
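I have looked at and tried several approaches, but I cannot read the nested tags in my XML file. I can extract the outer tag values, but not the STREET and CITY tags nested under the ADDRESS tag. I am short on time and, after trying many things, still cannot read the nested tags. Please help!
The expected result I am trying to get is:
common botanical zone light price street city
bloodroot Sanguinaria canadensis 4 Mostly Shady 2.44 1 toronto
and so on.
However, I cannot retrieve the street and city columns, because my code does not pick up the nested tags.
By removing the code that involves the CITY and STREET tags, I have been able to produce the following output:
common botanical zone light price
bloodroot Sanguinaria canadensis 4 Mostly Shady 2.44
Below is my XML file, which has 2 entries for testing purposes only. I am trying to create a column for each piece of text under the PLANT tag, as described above. I am reading the file through the Databricks file system (DBFS). I open and create a CSV file, write to it, and then close it. The indentation in my code is correct; I may have broken it while copying and pasting.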
<?xml version="1.0" encoding="UTF-8"?>
<CATALOG>
<PLANT>
<COMMON>Bloodroot</COMMON>
<BOTANICAL>Sanguinaria canadensis</BOTANICAL>
<ZONE>4</ZONE>
<LIGHT>Mostly Shady</LIGHT>
<PRICE>$2.44</PRICE>
<ADDRESS>
<STREET>1</STREET>
<CITY>toronto</CITY>
</ADDRESS>
<AVAILABILITY>031599</AVAILABILITY>
</PLANT>
<PLANT>
<COMMON>Columbine</COMMON>
<BOTANICAL>Aquilegia canadensis</BOTANICAL>
<ZONE>3</ZONE>
<LIGHT>Mostly Shady</LIGHT>
<PRICE>$9.37</PRICE>
<ADDRESS>
<STREET>2</STREET>
<CITY>montreal</CITY>
</ADDRESS>
<AVAILABILITY>030699</AVAILABILITY>
</PLANT>
</CATALOG>
-----------This is the code I have used ---------------
from xml.etree import ElementTree
import csv
import os

xml = ElementTree.parse("/dbfs/mnt/ods-outbound/xml_test/plant_catalog.xml")

# creating a file
csvfile = open("/dbfs/mnt/ods-outbound/xml_test/plant_catalog.csv", 'w', encoding='utf-8')
csvfile_writer = csv.writer(csvfile)

# ADD THE HEADER TO CSV FILE
csvfile_writer.writerow(["common","botanical","zone","light","price","availability","street","city"])

# FOR EACH PLANT
for plant in xml.findall("PLANT"):
    if(plant):
        # EXTRACT PLANT DETAILS
        common = plant.find("COMMON")
        botanical = plant.find("BOTANICAL")
        zone = plant.find("ZONE")
        light = plant.find("LIGHT")
        price = plant.find("PRICE")
        availability = plant.find("AVAILABILITY")
        street = plant.find("STREET")
        city = plant.find("CITY")
        csv_line = [common.text, botanical.text, zone.text, light.text, price.text, availability.text,street.text,city.text]
        # ADD A NEW ROW TO CSV FILE
        csvfile_writer.writerow(csv_line)
csvfile.close()
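According to the XML file, the STREET and CITY values are inside the ADDRESS tag. plant.find("STREET") only searches the direct children of PLANT, so it returns None for the nested tags.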
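To make the difference concrete, here is a minimal sketch of how ElementTree lookups behave on a nested element. The SAMPLE string is an assumption: a trimmed-down copy of one PLANT entry from the question, inlined so the snippet runs on its own.

```python
from xml.etree import ElementTree

# Assumption: a trimmed-down copy of one PLANT entry from the question.
SAMPLE = """<PLANT>
  <COMMON>Bloodroot</COMMON>
  <ADDRESS>
    <STREET>1</STREET>
    <CITY>toronto</CITY>
  </ADDRESS>
</PLANT>"""

plant = ElementTree.fromstring(SAMPLE)

# A bare tag name only matches direct children of PLANT,
# so the nested STREET element is not found and None is returned.
direct = plant.find("STREET")

# A path relative to PLANT walks through ADDRESS to reach the nested tags.
street = plant.find("ADDRESS/STREET").text
city = plant.findtext("ADDRESS/CITY")

# ".//" searches all descendants, at any depth.
any_depth = plant.find(".//CITY").text

print(direct, street, city, any_depth)  # None 1 toronto toronto
```

Calling .text on the None returned by the first lookup is what raises the AttributeError in the original code.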
Changes:
a.xml (input file):
<?xml version="1.0" encoding="UTF-8" ?>
<CATALOG>
<PLANT>
<COMMON>Bloodroot</COMMON>
<BOTANICAL>Sanguinaria canadensis</BOTANICAL>
<ZONE>4</ZONE>
<LIGHT>Mostly Shady</LIGHT>
<PRICE>$2.44</PRICE>
<ADDRESS>
<STREET>1</STREET>
<CITY>toronto</CITY>
</ADDRESS>
<AVAILABILITY>031599</AVAILABILITY>
</PLANT>
<PLANT>
<COMMON>Columbine</COMMON>
<BOTANICAL>Aquilegia canadensis</BOTANICAL>
<ZONE>3</ZONE>
<LIGHT>Mostly Shady</LIGHT>
<PRICE>$9.37</PRICE>
<ADDRESS>
<STREET>2</STREET>
<CITY>montreal</CITY>
</ADDRESS>
<AVAILABILITY>030699</AVAILABILITY>
</PLANT>
</CATALOG>
main.py:
import csv
from xml.etree import ElementTree

xml = ElementTree.parse("a.xml")

# newline="" lets the csv module control line endings itself
csvfile = open("plant_catalog.csv", 'w', encoding='utf-8', newline="")
csvfile_writer = csv.writer(csvfile)
csvfile_writer.writerow(["common","botanical","zone","light","price","availability","street","city"])

for plant in xml.findall("PLANT"):
    if len(plant):  # skip PLANT elements that have no children
        common = plant.find("COMMON").text
        botanical = plant.find("BOTANICAL").text
        zone = plant.find("ZONE").text
        light = plant.find("LIGHT").text
        price = plant.find("PRICE").text
        availability = plant.find("AVAILABILITY").text
        # STREET and CITY are children of ADDRESS, not of PLANT
        for addr in plant.findall("ADDRESS"):
            if len(addr):
                street = addr.find("STREET").text
                city = addr.find("CITY").text
                csv_line = [common, botanical, zone, light, price, availability,street,city]
                csvfile_writer.writerow(csv_line)
csvfile.close()
plant_catalog.csv (output file):
common,botanical,zone,light,price,availability,street,city
Bloodroot,Sanguinaria canadensis,4,Mostly Shady,$2.44,031599,1,toronto
Columbine,Aquilegia canadensis,3,Mostly Shady,$9.37,030699,2,montreal
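As a possible variation (a sketch, not the approach used in main.py above), findtext with a relative path and a default value avoids the inner loop and tolerates a PLANT that has no ADDRESS. The COLUMNS mapping, the plants_to_csv helper, and the shortened XML_TEXT are hypothetical names and data introduced only for this illustration; the output is written to an in-memory buffer rather than a file so the snippet is self-contained.

```python
import csv
import io
from xml.etree import ElementTree

# Assumption: a shortened catalog with the same structure as a.xml;
# the second PLANT deliberately has no ADDRESS.
XML_TEXT = """<CATALOG>
  <PLANT>
    <COMMON>Bloodroot</COMMON>
    <PRICE>$2.44</PRICE>
    <ADDRESS><STREET>1</STREET><CITY>toronto</CITY></ADDRESS>
  </PLANT>
  <PLANT>
    <COMMON>Columbine</COMMON>
    <PRICE>$9.37</PRICE>
  </PLANT>
</CATALOG>"""

# Hypothetical mapping: CSV column name -> path relative to each PLANT.
COLUMNS = {
    "common": "COMMON",
    "price": "PRICE",
    "street": "ADDRESS/STREET",
    "city": "ADDRESS/CITY",
}


def plants_to_csv(xml_text):
    """Return the catalog as CSV text, one row per PLANT element."""
    root = ElementTree.fromstring(xml_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS.keys())
    for plant in root.findall("PLANT"):
        # findtext returns the default instead of None when a tag is missing.
        writer.writerow([plant.findtext(path, default="") for path in COLUMNS.values()])
    return buffer.getvalue()


result = plants_to_csv(XML_TEXT)
print(result)
```

With this layout, adding another nested column is a one-line change to the mapping, and a missing tag produces an empty cell instead of an AttributeError.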