Appending all xml files in a folder to a single DataFrame using Python (Pandas)
I have a set of XML files in a folder that I am trying to convert to CSV and then append into one DataFrame. The code below converts an XML file to CSV. The problem, however, is that only the first file gets converted and not the remaining files. Could anyone point out where I am going wrong in the code below?
for file in allFiles:
    print(file)
    def iter_docs(file):
        for docall in file:
            doc_dict = {}
            for doc in docall:
                tag = [elem.tag for elem in doc]
                txt = [elem.text for elem in doc]
                if len(tag) > 0:
                    doc_dict.update(dict(zip(tag, txt)))
                else:
                    doc_dict[doc.tag] = doc.text
            yield doc_dict
    etree = ET.parse(file_)
    df_0 = pd.DataFrame(list(iter_docs(etree.getroot())))
    df_0.to_csv("file.csv", index=False)
Collect the data from every XML file into a single DataFrame df_0 and then save it to a CSV file:
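import xml.etree.ElementTree as ET
import pandas as pd

def iter_docs(root):
    # Yield one flat dict (one row) per child of the root element
    for docall in root:
        doc_dict = {}
        for doc in docall:
            tag = [elem.tag for elem in doc]
            txt = [elem.text for elem in doc]
            if len(tag) > 0:
                doc_dict.update(dict(zip(tag, txt)))
            else:
                doc_dict[doc.tag] = doc.text
        yield doc_dict

frames = []  # One DataFrame per XML file
for file in allFiles:  # allFiles is the list of XML paths from the question
    print(file)
    etree = ET.parse(file)  # Use the loop variable, not file_
    frames.append(pd.DataFrame(list(iter_docs(etree.getroot()))))

# DataFrame.append was removed in pandas 2.0, so combine with pd.concat
df_0 = pd.concat(frames, ignore_index=True)
df_0.to_csv("file.csv", index=False)  # Write once, after the loop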
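For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same idea. It makes a few assumptions that are not in the question: the file list is built with glob over a folder, the helper name xml_folder_to_frame is made up for illustration, and the two tiny sample XML files are hypothetical stand-ins for the real data. It also uses pd.concat to combine the per-file frames, since DataFrame.append is no longer available in pandas 2.0 and later.

```python
import glob
import os
import tempfile
import xml.etree.ElementTree as ET

import pandas as pd


def iter_docs(root):
    """Yield one flat dict (one row) per child of the root element."""
    for docall in root:
        doc_dict = {}
        for doc in docall:
            children = list(doc)
            if children:
                # Nested element: flatten its children into the row.
                doc_dict.update({elem.tag: elem.text for elem in children})
            else:
                doc_dict[doc.tag] = doc.text
        yield doc_dict


def xml_folder_to_frame(folder):
    """Hypothetical helper: parse every *.xml in `folder` into one DataFrame."""
    frames = []
    for path in sorted(glob.glob(os.path.join(folder, "*.xml"))):
        root = ET.parse(path).getroot()
        frames.append(pd.DataFrame(list(iter_docs(root))))
    if not frames:
        return pd.DataFrame()
    # Stack the per-file frames and renumber the rows.
    return pd.concat(frames, ignore_index=True)


# Hypothetical sample data, only to exercise the function.
SAMPLE = "<docs><doc><id>{0}</id><meta><name>n{0}</name></meta></doc></docs>"

with tempfile.TemporaryDirectory() as folder:
    for i in (1, 2):
        with open(os.path.join(folder, f"file{i}.xml"), "w") as fh:
            fh.write(SAMPLE.format(i))
    combined = xml_folder_to_frame(folder)
    # Write the CSV once, after all files have been combined.
    combined.to_csv(os.path.join(folder, "file.csv"), index=False)
```

Collecting the frames in a list and concatenating once at the end avoids re-copying the growing DataFrame on every iteration, which matters when the folder holds many files.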