
How to transform a CSV file into an XML file using ElementTree in Python 3.5

I am new to Python and have little experience with the language. I have a CSV file whose data I need to convert into an XML structure. I want to do it with Pandas and ElementTree.

I read a tutorial on how to do this, but I can't understand the structure of the code.

The CSV file looks something like this:

test_name,health_feat,result
test_1,20,1
test_2,23,1
test_3,24,0
test_4,12,1
test_5,45,0
test_6,34,1
test_7,78,1
test_8,23,1
test_9,12,1
test_10,12,1

The final XML file should look like this, but I am not sure how to handle attributes when using ElementTree:

<?xml version='1.0' encoding='UTF-8'?>
<Tests>
    <Test Testname='test_1'>
        <Health_Feat>20</Health_Feat>
        <Result>1</Result>
    </Test>
    <Test Testname='test_2'>
        <Health_Feat>23</Health_Feat>
        <Result>1</Result>
    </Test>
    <Test Testname='test_3'>
        <Health_Feat>24</Health_Feat>
        <Result>0</Result>
    </Test>
    <Test Testname='test_4'>
        <Health_Feat>12</Health_Feat>
        <Result>1</Result>
    </Test>
    <Test Testname='test_5'>
        <Health_Feat>45</Health_Feat>
        <Result>0</Result>
    </Test>
    <Test Testname='test_6'>
        <Health_Feat>34</Health_Feat>
        <Result>1</Result>
    </Test>
    <Test Testname='test_7'>
        <Health_Feat>78</Health_Feat>
        <Result>1</Result>
    </Test>
    <Test Testname='test_8'>
        <Health_Feat>23</Health_Feat>
        <Result>1</Result>
    </Test>
    <Test Testname='test_9'>
        <Health_Feat>12</Health_Feat>
        <Result>1</Result>
    </Test>
    <Test Testname='test_10'>
        <Health_Feat>12</Health_Feat>
        <Result>1</Result>
    </Test>
</Tests>

So far I have tried something like this, but I don't know how to tell the program which row to take from the CSV:

import pandas as pd
from lxml import etree as et
import uuid

df = pd.read_csv('mytests.csv', sep = ',')

root = et.Element(Tests)

for index, row in df.iterrows():
    if row['test_name'] == 'test_1':
        Test = et.SubElement(root, 'Test')
        Test.attrib['fileUID']
        health_feat = et.subElement('health_feat')
        Result = et.subElement('Result')
    else:
        Tests = et.subElement(root, 'Tests')
        
et.ElementTree(root).write('mytests.xml', pretty_print = True, xml_declaration = True, encoding = 'UTF-8', standalone = None)

Something like this:

import pandas as pd

df = pd.read_csv('your_csv.csv', sep=',')


def csv_to_xml(row):
    # Attribute names must match the CSV header exactly:
    # test_name, health_feat, result (all lowercase).
    return """<Test Testname="%s">
    <Health_Feat>%s</Health_Feat>
    <Result>%s</Result>
    </Test>""" % (row.test_name, row.health_feat, row.result)

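Then call the function for every row of your CSV in a for loop (for example, over df.itertuples()) and wrap the joined results in a single root element such as <Tests>.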
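
Since the question asks specifically about ElementTree and attributes, here is a minimal sketch of the same per-row loop built with the standard-library xml.etree.ElementTree instead of string formatting. It assumes the column names from the sample CSV above. To stay self-contained it reads an inline sample with the csv module rather than pandas; with pandas, a similar loop over df.iterrows() should work, provided the values are converted with str(), because ElementTree expects attribute values and element text to be strings. The function name build_tests_element is hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# First rows of the sample data from the question; in practice,
# open 'mytests.csv' and pass the file object to csv.DictReader.
CSV_TEXT = """test_name,health_feat,result
test_1,20,1
test_2,23,1
test_3,24,0
"""


def build_tests_element(rows):
    """Build a <Tests> element with one <Test> child per CSV row."""
    root = ET.Element('Tests')
    for row in rows:
        # Attributes can be passed as keyword arguments to SubElement
        # (test.set('Testname', ...) would do the same afterwards).
        test = ET.SubElement(root, 'Test', Testname=row['test_name'])
        # csv.DictReader yields strings, so no conversion is needed here.
        ET.SubElement(test, 'Health_Feat').text = row['health_feat']
        ET.SubElement(test, 'Result').text = row['result']
    return root


root = build_tests_element(csv.DictReader(io.StringIO(CSV_TEXT)))

# Serialize with an XML declaration into an in-memory buffer.
buffer = io.BytesIO()
ET.ElementTree(root).write(buffer, encoding='UTF-8', xml_declaration=True)
xml_bytes = buffer.getvalue()
print(xml_bytes.decode('UTF-8'))
```

To produce a file instead of an in-memory buffer, pass a filename such as 'mytests.xml' to write(). Note that pretty_print is an lxml option; the standard-library write() does not accept it, so the output of this sketch is not indented.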
