I am new to Python and don't have much experience with the language. I have a CSV file whose data I need to convert into an XML structure. I want to do it with Pandas and ElementTree.
I read a tutorial to do so, but I can't understand the structure of the code.
The CSV file looks something like this:
test_name,health_feat,result
test_1,20,1
test_2,23,1
test_3,24,0
test_4,12,1
test_5,45,0
test_6,34,1
test_7,78,1
test_8,23,1
test_9,12,1
test_10,12,1
The final XML file should look like this, but I am not sure how to handle attributes when using ElementTree:
<?xml version='1.0' encoding='UTF-8'?>
<Tests>
  <Test Testname='test_1'>
    <Health_Feat>20</Health_Feat>
    <Result>1</Result>
  </Test>
  <Test Testname='test_2'>
    <Health_Feat>23</Health_Feat>
    <Result>1</Result>
  </Test>
  <Test Testname='test_3'>
    <Health_Feat>24</Health_Feat>
    <Result>0</Result>
  </Test>
  <Test Testname='test_4'>
    <Health_Feat>12</Health_Feat>
    <Result>1</Result>
  </Test>
  <Test Testname='test_5'>
    <Health_Feat>45</Health_Feat>
    <Result>0</Result>
  </Test>
  <Test Testname='test_6'>
    <Health_Feat>34</Health_Feat>
    <Result>1</Result>
  </Test>
  <Test Testname='test_7'>
    <Health_Feat>78</Health_Feat>
    <Result>1</Result>
  </Test>
  <Test Testname='test_8'>
    <Health_Feat>23</Health_Feat>
    <Result>1</Result>
  </Test>
  <Test Testname='test_9'>
    <Health_Feat>12</Health_Feat>
    <Result>1</Result>
  </Test>
  <Test Testname='test_10'>
    <Health_Feat>12</Health_Feat>
    <Result>1</Result>
  </Test>
</Tests>
So far I have tried something like this, but I don't know how to tell the program which row to take from the CSV.
import pandas as pd
from lxml import etree as et
import uuid

df = pd.read_csv('mytests.csv', sep = ',')
root = et.Element(Tests)

for index, row in df.iterrows():
    if row['test_name'] == 'test_1':
        Test = et.SubElement(root, 'Test')
        Test.attrib['fileUID']
        health_feat = et.subElement('health_feat')
        Result = et.subElement('Result')
    else:
        Tests = et.subElement(root, 'Tests')

et.ElementTree(root).write('mytests.xml', pretty_print = True, xml_declaration = True, encoding = 'UTF-8', standalone = None)
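Something like this:
import pandas as pd

df = pd.read_csv('your_csv.csv', sep=',')

def csv_to_xml(row):
    # Attribute names must match the CSV header exactly (all lowercase).
    return """<Test Testname="%s">
<Health_Feat>%s</Health_Feat>
<Result>%s</Result>
</Test>""" % (row.test_name, row.health_feat, row.result)

Then call the function for every row of your CSV in a for loop (for example over df.itertuples()) and wrap the joined results in a <Tests> root element.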
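Since the question asks specifically about attributes with ElementTree, here is a minimal sketch of that route. It assumes the column names from the sample CSV (test_name, health_feat, result) and uses the standard library's xml.etree.ElementTree rather than the lxml import from the question; lxml.etree offers Element and SubElement with the same call shape. The function name dataframe_to_xml and the inline CSV_TEXT sample are only illustrative, not part of either post.

```python
import io
import xml.etree.ElementTree as ET

import pandas as pd

# First rows of the sample CSV from the question (inlined so the sketch runs
# on its own); with a real file, use pd.read_csv('mytests.csv') instead.
CSV_TEXT = """test_name,health_feat,result
test_1,20,1
test_2,23,1
test_3,24,0
"""


def dataframe_to_xml(df):
    """Build a <Tests> element with one <Test Testname="..."> child per row."""
    root = ET.Element("Tests")
    for row in df.itertuples(index=False):
        # Attributes are passed as a dict when the element is created.
        test = ET.SubElement(root, "Test", {"Testname": str(row.test_name)})
        # Element text must be a string, so the numeric columns are converted.
        ET.SubElement(test, "Health_Feat").text = str(row.health_feat)
        ET.SubElement(test, "Result").text = str(row.result)
    return root


df = pd.read_csv(io.StringIO(CSV_TEXT))
root = dataframe_to_xml(df)
ET.indent(root)  # pretty-printing helper, available from Python 3.9
print(ET.tostring(root, encoding="unicode"))

# To save the result with an XML declaration:
# ET.ElementTree(root).write("mytests.xml", encoding="UTF-8", xml_declaration=True)
```

No if/else on the test name is needed: the loop visits every row, so each row becomes its own Test element. Building the tree this way also escapes special characters such as & or < in the values, which the string-template approach does not do.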