XML to CSV with Python: string indices must be integers
I'm trying to convert an XML file to CSV, but when I run the Python script I get this error:
TypeError: string indices must be integers
The XML structure (the real file is bigger, but the structure is always the same):
<?xml version='1.0' encoding='UTF-8'?>
<import>
  <products>
    <product>
      <attribute>
        <code>Something</code>
        <value>xxx</value>
      </attribute>
      <attribute>
        <code>Something2</code>
        <value>xxx</value>
      </attribute>
      <attribute>
        <code>Something3</code>
        <value>xxx</value>
      </attribute>
      <attribute>
        <code>Something4</code>
        <value>xxx</value>
      </attribute>
    </product>
  </products>
</import>
The Python file:
#!/usr/bin/python
# -*- coding: utf-8 -*-
import csv, xmltodict
from collections import OrderedDict
class Test:
    def PSXML(self):
        FilePS = open('test.csv', 'w')
        csvwriter = csv.writer(FilePS)
        header = ['Something1','Something2','Something3','Something4']
        csvwriter.writerow(header)
        with open('test.xml') as fd:
            PSdata = []
            obj = xmltodict.parse(fd.read())
            obj = obj['import']['products']
            root_elements = obj['product'] if type(obj) == OrderedDict else [obj['product']]
            for element in root_elements:
                Something1 = element['attribute'][1]['value']
                PSdata.append(Something1)
                Something2 = element['attribute'][2]['value']
                PSdata.append(Something2)
                Something3 = element['attribute'][3]['value']
                PSdata.append(Something3)
                Something4 = element['attribute'][4]['value']
                PSdata.append(Something4)
                csvwriter.writerow(PSdata)
        FilePS.close()
TryIT = Test()
TryIT.PSXML()
This code already worked with another XML structure (a more logical one), but with this one it crashes with the error TypeError: string indices must be integers.
Does anyone have an idea why that is?
The problem here is that your example contains just one product. In that case obj['product'] is a single dict rather than a list of dicts, and iterating over a dict yields its keys. So, as Elis said, element in the loop is just the string 'attribute'.
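A minimal sketch of why this happens, using a plain dict as a stand-in for what xmltodict returns for a single product (the literal data below is an assumption modelled on the XML in the question, not actual xmltodict output):

```python
# Stand-in for obj['product'] when the XML contains exactly one <product>:
# a single mapping, not a list of mappings.
product = {
    'attribute': [
        {'code': 'Something', 'value': 'xxx'},
        {'code': 'Something2', 'value': 'xxx'},
    ]
}

# Iterating over a dict yields its keys, which are strings here.
elements = [element for element in product]


def first_value(element):
    """Mimic the lookup done in the question's loop body."""
    return element['attribute'][0]['value']


try:
    first_value(elements[0])
    error_message = None
except TypeError as exc:
    # Indexing the string 'attribute' with another string fails.
    error_message = str(exc)

print(elements)
print(error_message)
```

The exact wording of the message varies slightly between Python versions, but it starts with "string indices must be integers".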
You actually already tried to cover the case of just one product, but you made a mistake there: the check looks at obj (the products container, which is always an OrderedDict) instead of obj['product']. You have to put obj['product'] in a list if it is a single OrderedDict rather than a list of them:
products = obj['product']
root_elements = products if isinstance(products, list) else [products]
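The normalization step can be sketched as a small helper. The name as_list and the sample dicts are hypothetical and only stand in for parsed XML with one product and with several:

```python
def as_list(value):
    """Return value unchanged if it is already a list, else wrap it in one."""
    return value if isinstance(value, list) else [value]


# Hypothetical stand-ins for the parsed <products> element.
single = {'product': {'attribute': []}}
several = {'product': [{'attribute': []}, {'attribute': []}]}

print(len(as_list(single['product'])))
print(len(as_list(several['product'])))
```

Checking for list rather than OrderedDict also keeps working if the parser returns plain dicts. As far as I know, xmltodict.parse additionally accepts a force_list argument (for example force_list=('product',)) that makes the named elements always come back as lists; check the documentation of your installed version before relying on it.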
Apart from a couple of small mistakes, the code works well if there are several products.
You have to put the PSdata initialization inside the loop that iterates over your products. Otherwise every product appends 4 more values to the same list, so each row also repeats the values of all previous products.
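To illustrate the difference, here is a small sketch that writes to an in-memory buffer instead of a file. The product values and the write_rows helper are made up for the example:

```python
import csv
import io

# Hypothetical values for two products, two columns each.
products = [['a1', 'a2'], ['b1', 'b2']]


def write_rows(reset_per_product):
    """Write one CSV row per product and return the resulting lines."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator='\n')
    row = []
    for values in products:
        if reset_per_product:
            row = []  # start a fresh row for each product
        row.extend(values)
        writer.writerow(row)
    return buffer.getvalue().splitlines()


shared = write_rows(reset_per_product=False)
fresh = write_rows(reset_per_product=True)
print(shared)
print(fresh)
```

With the shared list, the second row still carries the first product's values in front of its own; resetting the list per product gives one clean row each.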
So you might want to check whether there is only one product in root_elements and handle that case separately.
Furthermore, don't use uppercase names for your variables; PEP 8 recommends lowercase names.
Another point: lists are zero-indexed in Python, so to get the 4 values you should use:
for element in root_elements:
    psdata = []
    something1 = element['attribute'][0]['value']
    psdata.append(something1)
    something2 = element['attribute'][1]['value']
    psdata.append(something2)
    something3 = element['attribute'][2]['value']
    psdata.append(something3)
    something4 = element['attribute'][3]['value']
    psdata.append(something4)
    csvwriter.writerow(psdata)
Or shorter, with a list comprehension:
for element in root_elements:
    csvwriter.writerow([element['attribute'][i]['value'] for i in range(4)])
So here is an updated version of your script, which follows most of PEP 8:
#!/usr/bin/python
# -*- coding: utf-8 -*-
import csv
import xmltodict


class Test:
    def psxml(self):
        with open('test.csv', 'w') as file_ps:
            csvwriter = csv.writer(file_ps)
            header = ['Something1', 'Something2', 'Something3', 'Something4']
            csvwriter.writerow(header)
            with open('test.xml') as fd:
                obj = xmltodict.parse(fd.read())
                obj = obj['import']['products']
                # One <product> parses to a single mapping, several parse to
                # a list, so normalize to a list before iterating.
                products = obj['product']
                root_elements = products if isinstance(products, list) else [products]
                for element in root_elements:
                    csvwriter.writerow([element['attribute'][i]['value'] for i in range(4)])


try_it = Test()
try_it.psxml()
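If you would rather avoid the one-versus-many distinction altogether, a possible alternative (not what the script above does) is the standard library's xml.etree.ElementTree, whose findall always returns a list. This is only a sketch in Python 3: the inline XML, the xml_to_rows name and the in-memory buffer are assumptions for the example, and unlike the script above it takes every attribute of a product instead of the first four.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Shortened sample modelled on the XML in the question.
XML = """<import>
  <products>
    <product>
      <attribute><code>Something</code><value>v1</value></attribute>
      <attribute><code>Something2</code><value>v2</value></attribute>
    </product>
  </products>
</import>"""


def xml_to_rows(xml_text):
    """Return one list of attribute values per <product> element."""
    root = ET.fromstring(xml_text)  # root is the <import> element
    rows = []
    # findall returns a list even when there is only one <product>.
    for product in root.findall('./products/product'):
        rows.append([attr.findtext('value', default='')
                     for attr in product.findall('attribute')])
    return rows


rows = xml_to_rows(XML)
buffer = io.StringIO()
csv.writer(buffer, lineterminator='\n').writerows(rows)
print(buffer.getvalue())
```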
Try:
for element in root_elements:
    print element, type(element)
This will print:
attribute <type 'unicode'>
You may expect a dict, but it is a string.