
Python: listing the last 10 modified files and reading each line of all 10 files

I need some help listing the files in a directory and reading each of them with Python. I know how to do this with shell commands, but is there a Pythonic way to do it?

I want to:

1.) List all files in a directory.

2.) Get the 10 most recently modified files (preferably using a wildcard).

3.) Read every line of all 10 files.

With shell commands, I can do:

Linux_System# ls -ltr | tail -n 10 
-rw-rw-rw- 1 root root  999934 Jul 26 01:06 data_log.569
-rw-rw-rw- 1 root root  999960 Jul 26 02:05 data_log.570
-rw-rw-rw- 1 root root  999968 Jul 26 03:13 data_log.571
-rw-rw-rw- 1 root root  999741 Jul 26 04:20 data_log.572
-rw-rw-rw- 1 root root  999928 Jul 26 05:31 data_log.573
-rw-rw-rw- 1 root root  999942 Jul 26 06:45 data_log.574
-rw-rw-rw- 1 root root  999916 Jul 26 07:46 data_log.575
-rw-rw-rw- 1 root root  999862 Jul 26 08:59 data_log.576
-rw-rw-rw- 1 root root  999685 Jul 26 10:15 data_log.577
-rw-rw-rw- 1 root root  999633 Jul 26 11:26 data_log.578

Linux_System# cat data_log.{569..578}

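With glob I can list the files and open a specific file, but I'm not sure how to list only the 10 most recently modified files and feed that wildcard file list to the open function.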

import os, fnmatch, glob

files = glob.glob("data_event_log.*")
files.sort(key=os.path.getmtime)
print("\n".join(files))

data_event_log.569
data_event_log.570
data_event_log.571
data_event_log.572
data_event_log.573
data_event_log.574
data_event_log.575
data_event_log.576
data_event_log.577
data_event_log.578

import re

with open("data_event_log.560", "r") as f:
    output_list = []
    for line in f.readlines():
        if line.startswith('Time'):
            lineRegex = re.compile(r'\d{4}-\d{2}-\d{2}')
            a = (lineRegex.findall(line))

It looks like you've almost done everything already:

import glob, os.path, re

files = glob.glob("data_event_log.*")
files.sort(key=os.path.getmtime)  # oldest first, newest last
latest = files[-10:]  # last 10 entries
print("\n".join(latest))
lineRegex = re.compile(r'\d{4}-\d{2}-\d{2}')  # compile once, outside the loops
for fn in latest:
    with open(fn) as f:
        for line in f:  # iterate lazily instead of calling readlines()
            if line.startswith('Time'):
                a = lineRegex.findall(line)
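
In the loop above, `a` is rebound on every matching line, so only the result for the last matching line survives. If the dates are needed afterwards, one option is to accumulate them. A minimal sketch, where the helper name `dates_from_lines` and the sample lines are made up for illustration (the real log format is not shown in the question):

```python
import re

line_regex = re.compile(r"\d{4}-\d{2}-\d{2}")


def dates_from_lines(lines):
    """Collect every YYYY-MM-DD match from lines that start with 'Time'."""
    dates = []
    for line in lines:
        if line.startswith("Time"):
            # findall returns a list of all matches on the line (possibly empty).
            dates.extend(line_regex.findall(line))
    return dates


# Invented sample lines; an open file object can be passed the same way.
sample = [
    "Time: 2020-07-26 11:26:03 rotated from 2020-07-25\n",
    "Status: ok 2020-01-01\n",
    "Time: no date here\n",
]
result = dates_from_lines(sample)
print(result)  # ['2020-07-26', '2020-07-25']
```

Because the function only needs an iterable of strings, calling `dates_from_lines(f)` inside the `with open(fn) as f:` block would work without reading the whole file into memory.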

Edit:

A better and simpler solution, especially if you have many files, is

import glob, heapq, os.path, re

files = glob.iglob("data_event_log.*")  # iterator, no full list is built
latest = heapq.nlargest(10, files, key=os.path.getmtime)  # last 10 entries, newest first
print("\n".join(latest))
lineRegex = re.compile(r'\d{4}-\d{2}-\d{2}')
for fn in latest:
    with open(fn) as f:
        for line in f:
            if line.startswith('Time'):
                a = lineRegex.findall(line)
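
One difference between the two versions is worth keeping in mind: the sorted slice runs from oldest to newest, while `heapq.nlargest` returns the newest file first. The sketch below checks this on throwaway files. The temporary directory, the helper `make_files`, and the timestamps are all assumptions made for the demonstration; modification times are set explicitly with `os.utime` so the order is predictable.

```python
import glob
import heapq
import os
import tempfile


def make_files(directory, count):
    """Create `count` empty files whose mtimes increase with their suffix."""
    for i in range(count):
        path = os.path.join(directory, f"data_event_log.{i:03d}")
        with open(path, "w"):
            pass
        # Hypothetical timestamps, one hour apart, oldest first.
        stamp = 1_000_000 + 3600 * i
        os.utime(path, (stamp, stamp))


with tempfile.TemporaryDirectory() as tmp:
    make_files(tmp, 15)
    pattern = os.path.join(tmp, "data_event_log.*")

    by_sort = sorted(glob.glob(pattern), key=os.path.getmtime)[-10:]
    by_heap = heapq.nlargest(10, glob.iglob(pattern), key=os.path.getmtime)

    sorted_names = [os.path.basename(p) for p in by_sort]
    heap_names = [os.path.basename(p) for p in by_heap]

# Same ten files, opposite order.
print(sorted_names[0], sorted_names[-1])
print(heap_names[0], heap_names[-1])
```

If the files should be processed in chronological order, reversing the `heapq.nlargest` result (for example with `reversed(latest)`) gives the same sequence as the sorted slice.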

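What you're looking for is a fixed-size sorted buffer. collections.deque does this, although without the sorting. So here is a buffer that does what you need, and main shows you how to use it: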

import bisect
import glob
import os


class Buffer:
    def __init__(self, maxlen, minmax=1, key=None):
        if key is None: key = lambda x: x
        self.key = key
        self.maxlen = maxlen
        self.buffer = []
        self.keys = []
        self.minmax = minmax  # 1 to track max values, -1 to track min values

    def __iter__(self):
        # Return a fresh iterator each time so the buffer can be looped over more than once.
        return iter(self.buffer)

    def insert(self, x):
        key = self.key(x)
        idx = bisect.bisect_left(self.keys, key)
        self.keys.insert(idx, key)
        self.buffer.insert(idx, x)
        if len(self.buffer) > self.maxlen:
            if self.minmax>0:
                self.buffer = self.buffer[-1 * self.maxlen :]
                self.keys = self.keys[-1 * self.maxlen :]
            elif self.minmax<0:
                self.buffer = self.buffer[: self.maxlen]
                self.keys = self.keys[: self.maxlen]


def main():
    dirpath = "/path/to/directory"
    modtime = lambda fpath: os.stat(fpath).st_mtime
    buffer = Buffer(10, 1, modtime)  # keep the 10 paths with the largest mtime
    for fpath in glob.glob(os.path.join(dirpath, "*data_event_log.*")):
        buffer.insert(fpath)

    for fpath in buffer:
        # open the file path and print whatever
        with open(fpath) as f:
            for line in f:
                print(line, end="")


if __name__ == "__main__":
    main()
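
To illustrate the remark about collections.deque: a deque created with `maxlen` keeps the most recently appended items, not the largest ones, so it only gives the right answer when items arrive already ordered. A small sketch with made-up numbers standing in for modification times, shown next to the standard-library `heapq.nlargest` for comparison:

```python
import heapq
from collections import deque

# Made-up values standing in for modification times, in arrival order.
mtimes = [50, 90, 10, 70, 30]

# deque(maxlen=3) discards from the left as new items arrive,
# so it retains the last three appended, whatever their values.
recent = deque(mtimes, maxlen=3)

# heapq.nlargest picks the three largest values regardless of arrival order.
largest = heapq.nlargest(3, mtimes)

print(list(recent))  # [10, 70, 30]
print(largest)       # [90, 70, 50]
```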

Pythonic answer:

Use sorted() with a lambda function, then use list slicing to get the oldest 10, the newest 10, or whatever you need.

from glob import glob
from os import stat

files = glob("*")
sorted_list = sorted(files, key=lambda x: stat(x).st_mtime)

truncated_list = sorted_list[-10:]
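
The snippet above stops once the files are selected. For the third step in the question, reading every line of the selected files, one option is the standard-library `fileinput` module, which walks through the lines of several files in sequence, much like `cat`. A sketch, assuming a hypothetical helper named `read_lines` and throwaway files created only for the demonstration:

```python
import fileinput
import os
import tempfile


def read_lines(paths):
    """Yield (filename, line) for every line of every file in `paths`."""
    paths = list(paths)
    if not paths:
        # FileInput falls back to reading stdin when given no files.
        return
    with fileinput.FileInput(files=paths) as stream:
        for line in stream:
            yield stream.filename(), line.rstrip("\n")


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    # Invented file names and contents, for illustration only.
    for name, text in [("a.log", "Time 2020-07-26\nother\n"),
                       ("b.log", "Time 2020-07-27\n")]:
        path = os.path.join(tmp, name)
        with open(path, "w") as f:
            f.write(text)
        paths.append(path)

    collected = [(os.path.basename(fn), line) for fn, line in read_lines(paths)]

print(collected)
```

With the snippet above, `read_lines(truncated_list)` would yield the lines of the ten newest files in oldest-to-newest order.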


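Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you repost, please credit this site's URL or the original address. For any questions, contact yoyou2525@163.com.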

 