Remove xml character in python
I have the following code, which takes information from an XML file and saves some data in a CSV file.
import xml.etree.ElementTree as ET
import csv

tree = ET.parse('file.xml')
root = tree.getroot()
title = []
category = []
url = []
prod = []

def find_title():
    for t in root.findall('solution/head'):
        title.append(t.find('title').text)
    for c in root.findall('solution/body'):
        category.append(c.find('category').text)
    for u in root.findall('solution/body'):
        url.append(u.find('video').text)
    for p in root.findall('solution/body'):
        prod.append(p.find('product').text)

find_title()
headers = ['Title', 'Category', 'Video URL', 'Product']

def save_csv():
    with open('titles.csv', 'w') as f:
        f_csv = csv.writer(f, lineterminator='\r')
        f_csv.writerow(headers)
        f.write(''.join('{},{},{},{}\n'.format(title, category, url, prod) for title, category, url, prod in zip(title, category, url, prod)))

save_csv()
I have found an issue with text that contains ',' because the comma splits the saved output, e.g.:
<title>Add, Change, or Remove Transitions between Slides</title>
is getting saved in the list as [Add, Change, or Remove Transitions between Slides], which makes sense since this is a CSV file. However, I would like to keep the whole output together.
So is there any way to remove the ',' from the title tag, or can I add more code to override the ','?
Thanks in advance.
It's not clear why you're writing the row data with a file.write() call rather than using the csv writer's writerow method (which you are using for the header row). Using that method will take care of quoting and special-character issues for data containing quotes and commas.
Change:
f.write(''.join('{},{},{},{}\n'.format(title, category, url, prod) for title, category, url, prod in zip(title, category, url, prod)))
to:
for row in zip(title, category, url, prod):
    f_csv.writerow(row)
and your CSV should work as expected, assuming your CSV reader handles the quoted fields.
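To see the quoting in action, here is a minimal self-contained sketch. It assumes an XML layout matching the paths in the question (solution/head/title, solution/body/category, and so on); the sample document, the example.com URL, and the category and product values are made up for illustration. It writes to an in-memory buffer instead of titles.csv so the result can be inspected directly.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample input, shaped like the paths used in the question.
SAMPLE_XML = """<root>
  <solution>
    <head><title>Add, Change, or Remove Transitions between Slides</title></head>
    <body>
      <category>Slides</category>
      <video>http://example.com/video1</video>
      <product>PowerPoint</product>
    </body>
  </solution>
</root>"""

root = ET.fromstring(SAMPLE_XML)

# Collect one row per <solution> element.
rows = []
for solution in root.findall('solution'):
    rows.append([
        solution.findtext('head/title'),
        solution.findtext('body/category'),
        solution.findtext('body/video'),
        solution.findtext('body/product'),
    ])

# Let the csv writer handle every row, so fields containing commas get quoted.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['Title', 'Category', 'Video URL', 'Product'])
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)

# Reading the text back with csv.reader restores the title as a single field.
parsed = list(csv.reader(io.StringIO(csv_text)))
print(parsed[1][0])
```

With the default settings the writer wraps the title in double quotes, and csv.reader returns it as one field again. When writing to a real file, the csv module documentation recommends opening it with newline='' so the writer controls the line endings itself.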