
Open and read multiple XML files from a folder

The folder below holds 100+ XML files. I have to open and read all of them.

F:\\Process\\Process_files\\xmls

So far, I have written the code below to open a single XML file from the folder. What do I need to change to open and read all the XML files in the folder?

from bs4 import BeautifulSoup
import lxml
import pandas as pd

infile = open("F:\\Process\\Process_files\\xmls\\ABC123.xml","r")
contents = infile.read()
soup = BeautifulSoup(contents,'html.parser')

Use the glob and os modules to iterate over every file in a given path that has a given file extension:

import glob
import os

from bs4 import BeautifulSoup

path = "F:/Process/Process_files/xmls/"

for filename in glob.glob(os.path.join(path, "*.xml")):
    with open(filename) as open_file:
        content = open_file.read()

    soup = BeautifulSoup(content, "html.parser")

Tip: Use the with statement so the file is closed automatically when the block ends.

Source: Open Every File In A Folder
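
If the contents of all files are needed at once rather than one at a time, the glob loop can be wrapped in a small helper. This is only a minimal sketch: the function name read_xml_files is hypothetical, and reading the files as UTF-8 is an assumption that may not hold for every file in the folder.

```python
import glob
import os


def read_xml_files(folder):
    """Return a dict mapping each .xml file name in `folder` to its text content."""
    contents_by_name = {}
    # sorted() makes the processing order deterministic; glob gives no order guarantee.
    for filename in sorted(glob.glob(os.path.join(folder, "*.xml"))):
        # encoding="utf-8" is an assumption; adjust it if the files use another encoding.
        with open(filename, encoding="utf-8") as open_file:
            contents_by_name[os.path.basename(filename)] = open_file.read()
    return contents_by_name
```

Each value in the returned dict can then be passed to BeautifulSoup in the same way as content in the loop above.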

So you need to iterate over the files in the folder? You can try something like this:

# Assumes os and BeautifulSoup are imported and path is the folder holding the XML files.
for file in os.listdir(path):
    filepath = os.path.join(path, file)
    with open(filepath) as fp:
        contents = fp.read()
        soup = BeautifulSoup(contents, 'html.parser')
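
One thing to keep in mind with os.listdir is that it returns every entry in the folder, including subdirectories and files with other extensions. A possible way to narrow the list down before opening anything is sketched below; the helper name list_xml_files is hypothetical, and matching the extension case-insensitively is a design choice, not something the question requires.

```python
import os


def list_xml_files(folder):
    """Return the sorted full paths of the regular files in `folder` ending in .xml."""
    paths = []
    for name in os.listdir(folder):
        full_path = os.path.join(folder, name)
        # Skip subdirectories and anything without an .xml extension (case-insensitive).
        if os.path.isfile(full_path) and name.lower().endswith(".xml"):
            paths.append(full_path)
    return sorted(paths)
```

The returned paths can be opened with the same with open(...) pattern as in the loop above.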
