
Python: convert XML files to CSV

I have a directory that contains several XML files. I would like to process all of them, one by one, and export each as a CSV file. For a single file, the script below works perfectly:

import xml.etree.ElementTree as ET
import csv
tree = ET.parse('D:/scripts/xml/download_xml_1.xml')
data_out = open('D:/scripts/csv/output_1.csv', 'w',newline='', errors='ignore')
csvwriter = csv.writer(data_out)
col_names = ['Fichier','No. de document','Titre']
csvwriter.writerow(col_names)
root = tree.getroot()

for elem in root.iter(tag='Document'):
        row = []
        filetype = elem.find('FileType').text
        row.append(filetype)
        documentnumber = elem.find('DocumentNumber').text
        row.append(documentnumber)
        title = elem.find('Title').text
        row.append(title)
        csvwriter.writerow(row)
data_out.close()

But I am struggling to find a way to process the files one by one. This is where I am so far:

import xml.etree.ElementTree as ET
import csv
import os
for my_files in os.listdir('D:/scripts/xml/'):
    tree = ET.parse(my_files)
    data_out = open('D:/scripts/csv/'+ my_files[:-4] +'.csv', 'w',newline='', errors='ignore')
    csvwriter = csv.writer(data_out)
    col_names = ['Fichier','No. de document','Titre']
    csvwriter.writerow(col_names)
    root = tree.getroot()
    for elem in root.iter(tag='Document'):
        row = []
        filetype = elem.find('FileType').text
        row.append(filetype)
        documentnumber = elem.find('DocumentNumber').text
        row.append(documentnumber)
        title = elem.find('Title').text
        row.append(title)
        csvwriter.writerow(row)
data_out.close()

Any help would be greatly appreciated.

Simply generalize your process into a function that receives a file name as input, then pass each file name to it in a loop. Note that os.listdir returns bare file names, so each name has to be joined with its directory before it is parsed. Also, consider using the with context manager to open the output file, so there is no need to call close.

import os
import csv
import xml.etree.ElementTree as ET

xml_path = r'D:\scripts\xml'
csv_path = r'D:\scripts\csv'

# DEFINED METHOD
def xml_to_csv(xml_file):
    csv_file = os.path.join(csv_path, f'Output_{xml_file[:-4]}.csv')
    tree = ET.parse(os.path.join(xml_path, xml_file))

    with open(csv_file, 'w', newline='', errors='ignore') as data_out:
        csvwriter = csv.writer(data_out)
        col_names = ['Fichier', 'No. de document', 'Titre']
        csvwriter.writerow(col_names)
    
        root = tree.getroot()
        for elem in root.iter(tag='Document'):
            row = [elem.find('FileType').text,
                   elem.find('DocumentNumber').text,
                   elem.find('Title').text]
            csvwriter.writerow(row)

# FILE ITERATION
for f in os.listdir(xml_path):
    xml_to_csv(f)
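
As a possible refinement of the function above, the sketch below skips non-XML files in the directory and tolerates Document elements that lack one of the three children (elem.find(...) returns None for a missing child, so .text would raise AttributeError). The names COLUMNS, TAGS and convert_directory are hypothetical helpers introduced here for illustration, and writing the CSV as UTF-8 is an assumption rather than something the original script specifies.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path

COLUMNS = ['Fichier', 'No. de document', 'Titre']   # CSV header from the question
TAGS = ['FileType', 'DocumentNumber', 'Title']      # child tags read per Document


def xml_to_csv(xml_file, csv_dir):
    """Convert one XML file to a CSV with the same base name; return the CSV path."""
    xml_file = Path(xml_file)
    csv_file = Path(csv_dir) / f'{xml_file.stem}.csv'
    # Parse first, so a malformed XML file does not leave an empty CSV behind.
    root = ET.parse(xml_file).getroot()

    # encoding='utf-8' is an assumption; adjust it to whatever the consumer expects.
    with open(csv_file, 'w', newline='', encoding='utf-8') as data_out:
        writer = csv.writer(data_out)
        writer.writerow(COLUMNS)
        for elem in root.iter('Document'):
            # findtext returns the default ('') when the child element is missing.
            writer.writerow([elem.findtext(tag, default='') for tag in TAGS])
    return csv_file


def convert_directory(xml_dir, csv_dir):
    """Convert every *.xml file in xml_dir; report and skip files that fail to parse."""
    Path(csv_dir).mkdir(parents=True, exist_ok=True)
    written = []
    for xml_file in sorted(Path(xml_dir).glob('*.xml')):
        try:
            written.append(xml_to_csv(xml_file, csv_dir))
        except ET.ParseError as exc:
            print(f'Skipping {xml_file.name}: {exc}')
    return written
```

With the paths from the question, this would be called as convert_directory(r'D:\scripts\xml', r'D:\scripts\csv'). Unlike the answer's version, the output files here keep the XML base name without an Output_ prefix.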


 