
How to convert this XML file to CSV using Python?

I have attached the XML file I am trying to convert to CSV:

<?xml version="1.0"?>
<response>
  <data>
    <shops>
      <shop id="204019">
        <name>Bannockburn</name>
        <status>Open</status>
        <company id="25">Franchise</company>
        <shopAttributes>
          <shopAttribute attrName="shop_OPEN_DATE">2008-07-16</shopAttribute>
          <shopAttribute attrName="CLOSE_DATE"/>
          <shopAttribute attrName="shop_DISTRIBUTION_CTR_GENERAL" startDate="2019-03-19">90</shopAttribute>
          <shopAttribute attrName="shop_DISTRIBUTION_CTR_GENERAL" startDate="1900-01-01" endDate="2019-03-18"/>
        </shopAttributes>
        <addresses>
          <address type="PUBLIC">
            <addressLine1>1211 Half Day Road</addressLine1>
            <addressLine2></addressLine2>
            <city>Bannockburn</city>
            <stateProvince>IL</stateProvince>
            <postalCode>60015</postalCode>
            <country>USA</country>
            <latitude>42.199461</latitude>
            <longitude>-87.860582</longitude>
          </address>
        </addresses>
      </shop>
      <shop id="204020">
        <name>Plainfield - North Plainfield</name>
        <status>Open</status>
        <company id="25">Franchise</company>
        <shopAttributes>
          <shopAttribute attrName="shop_OPEN_DATE">2007-05-18</shopAttribute>
          <shopAttribute attrName="CLOSE_DATE"/>
          <shopAttribute attrName="shop_DISTRIBUTION_CTR_GENERAL" startDate="2019-03-19">90</shopAttribute>
          <shopAttribute attrName="shop_DISTRIBUTION_CTR_GENERAL" startDate="1900-01-01" endDate="2019-03-18"/>
        </shopAttributes>
        <addresses>
          <address type="PUBLIC">
            <addressLine1>12632 IL Route 59</addressLine1>
            <addressLine2>Suite #102</addressLine2>
            <city>Plainfield</city>
            <stateProvince>IL</stateProvince>
            <postalCode>60585</postalCode>
            <country>USA</country>
            <latitude>41.653125</latitude>
            <longitude>-88.204527</longitude>
          </address>
        </addresses>
      </shop>
    </shops>
  </data>
</response>

This is the XML file I want to convert to CSV. Can someone help me do it in Python? Below is the code I tried. I have seen some examples, but I still don't really understand how to do it.

from xml.etree import ElementTree

tree = ElementTree.parse('Store.xml')
root = tree.getroot()

for att in root:
    first = att.find('shops').text
    print('{}'.format(first))

But I was getting None here.

Not a complete solution, but in answer to why you were getting None: it's because your shops are actually one level deeper, under the data tag.

This bit of code might give you an idea of how to access the underlying attributes, which you can collect into a list or other container to build your CSV.

from xml.etree import ElementTree

tree = ElementTree.parse('Store.xml')
root = tree.getroot()
data = root.find('data')  # <response> -> <data>

for shops in data:        # each <shops> container under <data>
    for shop in shops:    # each <shop> element
        name = shop.find('name').text
        sid = shop.attrib  # dict of all attributes, e.g. {'id': '204019'}
        status = shop.find('status').text
        attrs = shop.find('shopAttributes')
        # XPath predicate: pick the shopAttribute whose attrName is shop_OPEN_DATE
        open_date = attrs.find(".//shopAttribute[@attrName='shop_OPEN_DATE']").text
        print(f"Name: {name}, ID: {sid}, Status: {status}, open: {open_date}")

The open_date lookup is an example of using an XPath predicate to select an element by the value of one of its attributes. The code prints:

Name: Bannockburn, ID: {'id': '204019'}, Status: Open, open: 2008-07-16
Name: Plainfield - North Plainfield, ID: {'id': '204020'}, Status: Open, open: 2007-05-18
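
Building on the loop above, here is a minimal sketch of how the values could be collected into rows and written out with the standard csv module. The column names in FIELDS and the helper names shops_to_rows and rows_to_csv are made up for illustration, and the inline SAMPLE_XML is a trimmed copy of the question's file so the snippet runs on its own. For the real file you would start from ElementTree.parse('Store.xml').getroot() instead.

```python
import csv
import io
from xml.etree import ElementTree

# Trimmed-down sample with the same layout as the question's Store.xml.
SAMPLE_XML = """<?xml version="1.0"?>
<response>
  <data>
    <shops>
      <shop id="204019">
        <name>Bannockburn</name>
        <status>Open</status>
        <shopAttributes>
          <shopAttribute attrName="shop_OPEN_DATE">2008-07-16</shopAttribute>
        </shopAttributes>
      </shop>
      <shop id="204020">
        <name>Plainfield - North Plainfield</name>
        <status>Open</status>
        <shopAttributes>
          <shopAttribute attrName="shop_OPEN_DATE">2007-05-18</shopAttribute>
        </shopAttributes>
      </shop>
    </shops>
  </data>
</response>"""

# Hypothetical column names chosen for this example.
FIELDS = ["id", "name", "status", "open_date"]


def shops_to_rows(root):
    """Return one dict per <shop> element found under data/shops."""
    rows = []
    for shop in root.findall("./data/shops/shop"):
        rows.append({
            "id": shop.get("id"),  # attribute value as a plain string
            "name": shop.findtext("name", default=""),
            "status": shop.findtext("status", default=""),
            # Predicate selects the shopAttribute with the wanted attrName.
            "open_date": shop.findtext(
                "shopAttributes/shopAttribute[@attrName='shop_OPEN_DATE']",
                default="",
            ),
        })
    return rows


def rows_to_csv(rows, out):
    """Write the rows, preceded by a header line, to a file-like object."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


root = ElementTree.fromstring(SAMPLE_XML)
rows = shops_to_rows(root)

buffer = io.StringIO()  # stands in for a real file in this sketch
rows_to_csv(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, pass a file opened with open('shops.csv', 'w', newline='') to rows_to_csv (the file name is only an example). Using findtext with a default avoids an AttributeError when a shop lacks one of the elements.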

The shops element has no text of its own, so there is nothing to print. You need to go down to the level you want:

for att in root.findall('./data/shops/shop'):
    first = att.find('name')
    print('{}'.format(first.text))

Gives

Bannockburn
Plainfield - North Plainfield
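
The same path-based approach can reach the nested address fields, which is probably what you want as CSV columns. This is only a sketch: the ADDRESS_TAGS list and the address_row helper are made up for illustration, it assumes you only care about the address with type="PUBLIC", and the inline SAMPLE_XML is a shortened copy of the question's data.

```python
from xml.etree import ElementTree

# Shortened sample with the same nesting as the question's file.
SAMPLE_XML = """<response>
  <data>
    <shops>
      <shop id="204019">
        <name>Bannockburn</name>
        <addresses>
          <address type="PUBLIC">
            <addressLine1>1211 Half Day Road</addressLine1>
            <addressLine2></addressLine2>
            <city>Bannockburn</city>
            <postalCode>60015</postalCode>
          </address>
        </addresses>
      </shop>
    </shops>
  </data>
</response>"""

# Hypothetical selection of address elements to turn into columns.
ADDRESS_TAGS = ["addressLine1", "addressLine2", "city", "postalCode"]


def address_row(shop):
    """Flatten the PUBLIC address of one <shop> element into a dict."""
    row = {"id": shop.get("id"), "name": shop.findtext("name", default="")}
    address = shop.find("addresses/address[@type='PUBLIC']")
    for tag in ADDRESS_TAGS:
        if address is None:
            row[tag] = ""  # shop has no PUBLIC address at all
        else:
            # findtext gives "" for an empty element, the default if missing
            row[tag] = address.findtext(tag, default="")
    return row


root = ElementTree.fromstring(SAMPLE_XML)
address_rows = [address_row(shop) for shop in root.findall("./data/shops/shop")]
print(address_rows)
```

Each resulting dict can be passed to csv.DictWriter with fieldnames set to ["id", "name"] + ADDRESS_TAGS.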

Here is a good resource for ElementTree: https://docs.python.org/3/library/xml.etree.elementtree.html
