
Parse XML to CSV with Python

I need to parse some XML to CSV. I am struggling to iterate over the attribute of each 'record' element. The code below can pull out the allocation text. How do I get each record's product-id?

    import xml.etree.ElementTree as ET
    mytree = ET.parse('Salesforce_01_30_2023.xml')
    myroot = mytree.getroot()
    print(myroot)
    
    for x in myroot.findall('record'):
        product = myroot.attrib
        inventory = x.find('allocation').text
        print(product, inventory)

XML:

<?xml version="1.0" encoding="UTF-8"?>
        <records>
            <record product-id="99124">
                <allocation>15</allocation>
                <allocation-timestamp>2023-01-30T15:03:39.598Z</allocation-timestamp>
                <perpetual>false</perpetual>
                <preorder-backorder-handling>none</preorder-backorder-handling>
                <ats>15</ats>
            </record>
            <record product-id="011443">
                <allocation>0</allocation>
                <allocation-timestamp>2023-01-30T15:03:39.598Z</allocation-timestamp>
                <perpetual>false</perpetual>
                <preorder-backorder-handling>none</preorder-backorder-handling>
                <ats>0</ats>
            </record>
        </records>

To get the product-id value, use .attrib["product-id"] on each record element rather than on the root:

import xml.etree.ElementTree as ET

mytree = ET.parse('Salesforce_01_30_2023.xml')
myroot = mytree.getroot()

for product in myroot.findall('record'):
    inventory = product.find('allocation').text
    print(product.attrib['product-id'], inventory)

Prints:

99124 15
011443 0
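Since the goal is a CSV file rather than printed output, the same loop can feed the standard csv module. The following is a minimal sketch, not part of the original answer: the function name records_to_csv and the inline sample XML_TEXT are assumptions made for illustration, and the sample is a trimmed copy of the question's data with the closing records tag added.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Trimmed sample based on the question's XML (hypothetical inline copy).
XML_TEXT = """<records>
    <record product-id="99124">
        <allocation>15</allocation>
    </record>
    <record product-id="011443">
        <allocation>0</allocation>
    </record>
</records>"""


def records_to_csv(xml_text, out):
    """Write one CSV row per <record>: its product-id and allocation."""
    root = ET.fromstring(xml_text)
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["product-id", "allocation"])
    for record in root.findall("record"):
        # findtext() returns the default instead of failing when the tag is absent
        allocation = record.findtext("allocation", default="")
        writer.writerow([record.get("product-id"), allocation])


buffer = io.StringIO()
records_to_csv(XML_TEXT, buffer)
print(buffer.getvalue())
```

To write a real file, pass a handle opened with open("out.csv", "w", newline="") instead of the StringIO buffer. Because the values stay strings, the leading zero in 011443 is preserved.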

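Option 1: Use pandas read_xml() and DataFrame.to_csv(). Note that pandas infers product-id as an integer, so the leading zero of 011443 is lost in the output below; passing dtype={"product-id": str} to read_xml() (available in pandas 1.5 and later) keeps it as text.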

import pandas as pd

df = pd.read_xml("prod_id.xml", xpath=".//record")
df.to_csv('prod.csv')
print(df.to_string())

Output:

   product-id  allocation      allocation-timestamp  perpetual preorder-backorder-handling  ats
0       99124          15  2023-01-30T15:03:39.598Z      False                        none   15
1       11443           0  2023-01-30T15:03:39.598Z      False                        none    0

CSV:

,product-id,allocation,allocation-timestamp,perpetual,preorder-backorder-handling,ats
0,99124,15,2023-01-30T15:03:39.598Z,False,none,15
1,11443,0,2023-01-30T15:03:39.598Z,False,none,0

Option 2: If you prefer xml.etree.ElementTree, XML attribute values can be read with .get():

import xml.etree.ElementTree as ET
    
tree = ET.parse('prod_id.xml')
root = tree.getroot()
    
# iter("record") yields only the <record> elements, so no tag check is needed
for elem in root.iter("record"):
    print("Product-id:", elem.get('product-id'))

Output:

Product-id: 99124
Product-id: 011443
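To carry every field of a record into the CSV with ElementTree alone, each record's attributes and child elements can be merged into one dict and written with csv.DictWriter. This is a sketch under assumptions: the helper record_to_row and the shortened inline sample are hypothetical and not from the original answers.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Shortened sample in the shape of the question's XML (hypothetical inline copy).
XML_TEXT = """<records>
    <record product-id="011443">
        <allocation>0</allocation>
        <perpetual>false</perpetual>
        <ats>0</ats>
    </record>
</records>"""


def record_to_row(record):
    """Merge a record's attributes and its child elements' text into one dict."""
    row = dict(record.attrib)
    for child in record:
        row[child.tag] = child.text
    return row


root = ET.fromstring(XML_TEXT)
rows = [record_to_row(r) for r in root.iter("record")]

# Column order follows the first appearance of each key across all rows.
fieldnames = list(dict.fromkeys(key for row in rows for key in row))

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Unlike the pandas route, nothing is type-converted here, so values such as 011443 and false are written exactly as they appear in the XML.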


 