
Python script to convert CSV to XML

Please help me correct this Python script so that it produces the required output.

I have written the code below to convert CSV to XML. The input file has columns 1 to 278, and the output file needs tags A1 to A278.

Code:

#!/usr/bin/python
import sys
import os
import csv
if len(sys.argv) != 2:
    os._exit(1)
path=sys.argv[1] # get folder as a command line argument
os.chdir(path)
csvFiles = [f for f in os.listdir('.') if f.endswith('.csv') or f.endswith('.CSV')]
for csvFile in csvFiles:
    xmlFile = csvFile[:-4] + '.xml'
    csvData = csv.reader(open(csvFile))
    xmlData = open(xmlFile, 'w')
    xmlData.write('<?xml version="1.0"?>' + "\n")
    # there must be only one top-level tag
    xmlData.write('<TariffRecords>' + "\n")
    rowNum = 0
    for row in csvData:
        if rowNum == 0:
            tags = Tariff
            # replace spaces w/ underscores in tag names
            for i in range(len(tags)):
                tags[i] = tags[i].replace(' ', '_')
        else:
            xmlData.write('<Tariff>' + "\n")
            for i in range(len(tags)):
                xmlData.write('    ' + '<' + tags[i] + '>' \
                              + row[i] + '</' + tags[i] + '>' + "\n")
            xmlData.write('</Tariff>' + "\n")
        rowNum +=1
    xmlData.write('</TariffRecords>' + "\n")
    xmlData.close()

The script fails with the following error:

Traceback (most recent call last):
  File "ctox.py", line 20, in ?
    tags = Tariff
NameError: name 'Tariff' is not defined

Sample input file (this is only a sample; the actual input file will contain 278 columns). If the input file has two or three records, they all need to be appended to a single XML file.

name,Tariff Summary,Record ID No.,Operator Name,Circle (Service Area),list
Prepaid Plan Voucher,test_All calls 2p/s,TT07PMPV0188,Ta Te,Gu,
Prepaid Plan Voucher,test_All calls 3p/s,TT07PMPV0189,Ta Te,HR,

Sample output file. The two wrapper tags, TariffRecords and Tariff, will be hard-coded at the beginning and end of the XML file.

<TariffRecords>
<Tariff>
<A1>Prepaid Plan Voucher</A1>
<A2>test_All calls 2p/s</A2>
<A3>TT07PMPV0188</A3>
<A4>Ta Te</A4>
<A5>Gu</A5>
<A6></A6>
</Tariff>
<Tariff>
<A1>Prepaid Plan Voucher</A1>
<A2>test_All calls 3p/s</A2>
<A3>TT07PMPV0189</A3>
<A4>Ta Te</A4>
<A5>HR</A5>
<A6></A6>
</Tariff>
</TariffRecords>

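First, you need to replace

tags = Tariff with tags = row

Tariff is not defined anywhere in the script, which is what raises the NameError; on the first pass through the loop, row holds the header row.

Second, change the write line so that it writes A1, A2, and so on instead of the tag names taken from the header.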
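To illustrate the second change in isolation, here is a minimal sketch of building the tag from the column position rather than from the header text. The helper name `positional_tags` is made up for this example and does not appear in the script; the sample row is taken from the question's input.

```python
def positional_tags(column_count):
    # Build tag names A1..An from 1-based column positions
    return ['A%d' % (i + 1) for i in range(column_count)]

# One data row from the sample input (the last field is empty)
row = ['Prepaid Plan Voucher', 'test_All calls 2p/s', 'TT07PMPV0188', 'Ta Te', 'Gu', '']

for tag, value in zip(positional_tags(len(row)), row):
    print('    <%s>%s</%s>' % (tag, value, tag))
```

With 278 columns the same helper would yield A1 through A278, so nothing needs to be hard-coded per column.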

Complete code:

import sys
import os
import csv
if len(sys.argv) != 2:
    os._exit(1)
path=sys.argv[1] # get folder as a command line argument
os.chdir(path)
csvFiles = [f for f in os.listdir('.') if f.endswith('.csv') or f.endswith('.CSV')]
for csvFile in csvFiles:
    xmlFile = csvFile[:-4] + '.xml'
    csvData = csv.reader(open(csvFile))
    xmlData = open(xmlFile, 'w')
    xmlData.write('<?xml version="1.0"?>' + "\n")
    # there must be only one top-level tag
    xmlData.write('<TariffRecords>' + "\n")
    rowNum = 0
    for row in csvData:
        if rowNum == 0:
            tags = row
            # replace spaces w/ underscores in tag names
            for i in range(len(tags)):
                tags[i] = tags[i].replace(' ', '_')
        else:
            xmlData.write('<Tariff>' + "\n")
            for i in range(len(tags)):
                # tag name comes from the 1-based column position: A1, A2, ...
                xmlData.write('    ' + '<A%d>' % (i + 1) \
                              + row[i] + '</A%d>' % (i + 1) + "\n")
            xmlData.write('</Tariff>' + "\n")
        rowNum +=1
    xmlData.write('</TariffRecords>' + "\n")
    xmlData.close()

Output:

<?xml version="1.0"?>
<TariffRecords>
<Tariff>
    <A1>Prepaid Plan Voucher</A1>
    <A2>test_All calls 2p/s</A2>
    <A3>TT07PMPV0188</A3>
    <A4>Ta Te</A4>
    <A5>Gu</A5>
    <A6></A6>
</Tariff>
<Tariff>
    <A1>Prepaid Plan Voucher</A1>
    <A2>test_All calls 3p/s</A2>
    <A3>TT07PMPV0189</A3>
    <A4>Ta Te</A4>
    <A5>HR</A5>
    <A6></A6>
</Tariff>
</TariffRecords>
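
One caveat with building XML through string concatenation: a field containing `&`, `<` or `>` would produce malformed XML. Below is a minimal sketch, assuming Python 3, that escapes each value with the standard library's `xml.sax.saxutils.escape`. The function name `rows_to_xml` and the in-memory sample (with an invented `Ta & Te` value) are assumptions made for illustration and are not part of the script above.

```python
import csv
import io
from xml.sax.saxutils import escape

def rows_to_xml(rows):
    """Return a TariffRecords document as a string; the header row is skipped."""
    lines = ['<?xml version="1.0"?>', '<TariffRecords>']
    for row_num, row in enumerate(rows):
        if row_num == 0:
            continue  # header row: only data rows are written
        lines.append('<Tariff>')
        # Unlike the script above, this iterates over the row itself,
        # so short rows do not raise an IndexError
        for i, value in enumerate(row, start=1):
            # escape() replaces &, < and > so the output stays well-formed
            lines.append('    <A%d>%s</A%d>' % (i, escape(value), i))
        lines.append('</Tariff>')
    lines.append('</TariffRecords>')
    return '\n'.join(lines) + '\n'

# Hypothetical sample held in memory instead of a file on disk
sample = io.StringIO('name,Operator Name\nPrepaid Plan Voucher,Ta & Te\n')
print(rows_to_xml(csv.reader(sample)))
```

Returning a string rather than writing directly to a file keeps the conversion easy to check; the caller can still write the result to the .xml file in one step.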
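
Another approach, using pandas and xml.etree.ElementTree:

import pandas as pd
from xml.etree import ElementTree as xml

# Read every column as text and keep empty cells as "" instead of NaN
df = pd.read_csv("file_path", dtype=str, keep_default_na=False)
root = xml.Element("TariffRecords")
for record in df.values:
    # one <Tariff> element per CSV data row (the header row is consumed by read_csv)
    tariff = xml.SubElement(root, "Tariff")
    for index, data in enumerate(record, start=1):
        # columns become <A1>, <A2>, ... in order
        cell = xml.SubElement(tariff, "A" + str(index))
        cell.text = data
print(xml.tostring(root, encoding="unicode"))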
