Unable to parse multiple files in a directory using Python

Question

I have 2 files in a directory. One file contains the start and end times of the morning ETL jobs, and the other contains the same data for the evening jobs. I am trying to write a Python program that reads the files and their contents and produces an Excel output containing the file name, date, start time, and end time.

My code is below:

import glob
import re

path = r"path_name"
regex = '(.*?) - (.*?) - Starting entry (.*?)'
regex_1 = '(.*?) - (.*?) - Clear TMP table'
regex_2 = '(.*?) - (.*?) - Finished job'
for filename in glob.glob("*.log"):
    with open(filename, "r") as file:
        file_list = []
        table_list = []
        start_list = []
        end_list = []
        for line in file:
            line = line.replace('[','')
            line = line.replace(']','')
            line = line.replace('(','')
            line = line.replace(')','')
            for match in re.finditer(regex, line, re.S):
                match_text = match.group()
                print(match_text)
                searchfile = re.search(' - (.+?) - ', match_text)
                if searchfile:
                    filename = searchfile.group(1)
                    file_list.append(filename)
                    print(filename)
            for match in re.finditer(regex_1, line, re.S):
                match_text_1 = match.group()
                print(match_text_1)
                searchtable = re.search(' - (.+?) - ', match_text_1)
                if searchtable:
                    tablename = searchtable.group(1)
                    table_list.append(tablename)
                    print(tablename)
                    starttime = match_text_1[0:19]
                    start_list.append(starttime)
                    print(starttime)
            for match in re.finditer(regex_2, line, re.S):
                match_text_2 = match.group()
                print(match_text_2)
                endtime = match_text_2[0:19]
                end_list.append(endtime)
                print(endtime)

The issue is that only one file appears to be read and written, and I don't understand why. When I print the length of file_list, it contains 400 rows, but there should be 800 since I am parsing 2 files. Can someone please help me with this?

Answer 1

Initialize file_list outside the loop and then use append to populate it.

i.e.:

file_list = []
for filename in glob.glob('*.log'):
    if some_condition:
        file_list.append(filename)

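In your case, file_list is re-initialized on every iteration of the loop over files, so it only ever holds the entries from the last file processed, and hence only half the data is present. The same applies to table_list, start_list and end_list.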
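
Applied to the code in the question, the idea might look roughly like the sketch below. It is only an illustration under assumptions: the function name parse_logs, its directory parameter and the extra source_list are hypothetical, and the log line layout is inferred from the regexes in the question rather than from the real log files.

```python
import glob
import os
import re

# Patterns taken from the question; the exact log layout is an assumption.
START_RE = re.compile(r"(.*?) - (.*?) - Clear TMP table")
END_RE = re.compile(r"(.*?) - (.*?) - Finished job")
# Removes [, ], ( and ) in one pass instead of four replace() calls.
STRIP_CHARS = str.maketrans("", "", "[]()")


def parse_logs(directory="."):
    """Collect table names and start/end timestamps from every *.log file."""
    # The lists are created once, before the loop over files,
    # so entries from earlier files are kept.
    source_list, table_list, start_list, end_list = [], [], [], []
    for log_path in sorted(glob.glob(os.path.join(directory, "*.log"))):
        with open(log_path, "r") as log_file:
            for line in log_file:
                line = line.translate(STRIP_CHARS)
                start = START_RE.search(line)
                if start:
                    source_list.append(os.path.basename(log_path))
                    table_list.append(start.group(2))
                    # As in the question: the first 19 characters hold the timestamp.
                    start_list.append(start.group()[:19])
                end = END_RE.search(line)
                if end:
                    end_list.append(end.group()[:19])
    return source_list, table_list, start_list, end_list
```

Calling parse_logs(path) once should then return lists covering both files, and source_list makes it easy to check which log each row came from.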

Answer 1
1 2019-09-16 16:38:06