Append all XML files in a folder to a single DataFrame using Python (Pandas)

Question

I have a set of XML files in a folder that I am trying to convert to CSV and then append to a single DataFrame. The code below converts an XML file to CSV. However, only the first file gets converted, not the remaining ones. Could anyone point out where I am going wrong in the code below?

for file in allFiles:
    print(file)
    def iter_docs(file):
        for docall in file:
            doc_dict = {}
            for doc in docall:
                tag = [elem.tag for elem in doc]
                txt = [elem.text for elem in doc]
                if len(tag) > 0:
                    doc_dict.update(dict(zip(tag, txt)))
                else:
                    doc_dict[doc.tag] = doc.text
            yield doc_dict
    etree = ET.parse(file_)
    df_0 = pd.DataFrame(list(iter_docs(etree.getroot())))
    df_0.to_csv("file.csv", index=False)

Answer 1

Your loop rewrites file.csv on every pass, so the data from the individual files is never combined. Collect the data from all the XML files into a single DataFrame, df_0, and save it to CSV once, after the loop:

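import xml.etree.ElementTree as ET
import pandas as pd

def iter_docs(root):
    # Yield one dict per record (each child of the root element)
    for docall in root:
        doc_dict = {}
        for doc in docall:
            tag = [elem.tag for elem in doc]
            txt = [elem.text for elem in doc]
            if len(tag) > 0:
                doc_dict.update(dict(zip(tag, txt)))    # Nested element: one column per child
            else:
                doc_dict[doc.tag] = doc.text    # Leaf element: tag becomes the column name
        yield doc_dict

frames = []    # One DataFrame per XML file
for file in allFiles:    # allFiles: the list of XML file paths from the question
    print(file)
    etree = ET.parse(file)    # Parse the current file from the loop
    frames.append(pd.DataFrame(list(iter_docs(etree.getroot()))))

# DataFrame.append was removed in pandas 2.0, so pd.concat is used to combine the frames
df_0 = pd.concat(frames, ignore_index=True)
df_0.to_csv("file.csv", index=False)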
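
For reference, the same idea can be wrapped in a reusable function. This is only a sketch under a few assumptions: the function name `xml_folder_to_frame` and the `*.xml` glob pattern are made up for illustration and do not come from the original post, and each file's root element is assumed to hold one child element per record, as the `iter_docs` logic above implies.

```python
import glob
import os
import xml.etree.ElementTree as ET

import pandas as pd


def iter_docs(root):
    """Yield one dict per record (each direct child of the root element)."""
    for record in root:
        row = {}
        for field in record:
            children = list(field)
            if children:
                # Nested element: flatten its children into columns.
                row.update({child.tag: child.text for child in children})
            else:
                # Leaf element: its tag becomes the column name.
                row[field.tag] = field.text
        yield row


def xml_folder_to_frame(folder):
    """Parse every *.xml file in `folder` (hypothetical helper) into one DataFrame."""
    frames = []
    for path in sorted(glob.glob(os.path.join(folder, "*.xml"))):
        root = ET.parse(path).getroot()
        frames.append(pd.DataFrame(list(iter_docs(root))))
    if not frames:
        # pd.concat raises on an empty list, so return an empty frame instead.
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)
```

With a folder of XML files, `xml_folder_to_frame("xml_files").to_csv("file.csv", index=False)` would write the combined result once (the folder name here is a placeholder). Building a list of frames and concatenating at the end also avoids copying the growing DataFrame on every iteration.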

Answer 1 posted 2019-03-04 10:12:46
