Continuous Python tail -f on a log file

Question

I implemented a Python version of tail -f with the following code snippet. It works fine while my program runs continuously in the background, started with python myprogram.py &:

def follow(thefile):
    thefile.seek(0,2)
    while True:
        line = thefile.readline()
        if not line:
            time.sleep(0.1)
            continue
        yield line

The file passed to the function above is a log file, and it is passed in from main:

# follow.py

# Follow a file like tail -f.

import smtplib
import time
import re
import logging

# Here are the email package modules we'll need
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from job import Job


def follow(thefile):
    thefile.seek(0,2)
    while True:
        line = thefile.readline()
        if not line:
            time.sleep(0.1)
            continue
        yield line

def sendMail(job,occurtime):
    COMMASPACE = ', '
    outer =  MIMEMultipart()
    # msg = MIMEMultipart('alternative')
    outer['Subject'] = 'ETL Failed for job:' + job
    outer['From'] = 'eltmonitor@fms.com'
    me=  'eltmonitor@ncellfms.com'
    family = ["bibesh.pokhrel@huawei.com"]
    outer['To'] = COMMASPACE.join(family)

    inner = MIMEMultipart('alternative')
    html = """\
        <html>
          <head></head>
          <body>
            <p>Dears,<br>
               Please take necessary action to troubleshoot the ETL Error for job:""" + job + " at " + occurtime + """
            </p>
          </body>
        </html>
        """

    # Wrap the HTML body in a text/html MIME part.
    part2 = MIMEText(html, 'html')

    # Attach the part to the message container.
    # According to RFC 2046, the last part of a multipart/alternative message,
    # in this case the HTML message, is best and preferred.
    inner.attach(part2)
    outer.attach(inner)

    # Connect to the SMTP server and send the email.
    # Parameters are from=me, to=family, and the outer object as a string for the message body.
    try:
        s = smtplib.SMTP('localhost')
        s.sendmail(me,family,outer.as_string())
        s.quit()
    except smtplib.SMTPException:
        logging.info('Unable to send email')

if __name__ == '__main__':
    while True:
        logging.basicConfig(filename='/opt/etlmonitor/monitor.log',format='%(asctime)s %(levelname)s %(message)s',level=logging.DEBUG, filemode='w')
# Define two ETL Job object to store the state of email sent as boolean flag
        fm =Job()
        ncell =Job()
        try:
            with open("/opt/report/logs/GraphLog.log","r") as logfile:
            # Continually read the log files line by line

                loglines = follow(logfile)

            # Do something with the line
                for line in loglines:
            # Extract the last word in the line of the log file
            # We are particularly looking for the word SUCCESS or FAILED
            # Warning!! the leading whitespace character is also matched
                    etlmsg= re.search(r".*(\s(\w+)$)",line)
                    if etlmsg:
            # Remove leading whitespace
                        foundmsg = etlmsg.group(1).lstrip()
            # Process on the basis of last word
            # If it is SUCCESS , set the job mailsent flag to False so that no email is sent
            # If it is FAILED and mailsent flag of job is False, send a email and set mailsent flag to True
            # If it is FAILED and mailsent flag of job is True, do nothing as email was already sent
                        if foundmsg=='SUCCESS':
                            jobname= re.search(": Graph '(.+?)\'",line)
                            if jobname:
                                foundjob= jobname.group(1)
                                if foundjob =='Mirror.kjb':
                                    logging.info('Ncell Mirror job detected SUCCESS')
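                                    ncell.p = False  # assignment, not comparison: reset the mailsent flag
                                elif foundjob =='FM_job.kjb':
                                    fm.p = False  # assignment, not comparison: reset the mailsent flag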
                                    logging.info('Ncell Report job detected SUCCESS')
                                else:
                                    logging.info('No job name defined for success message')

                        elif foundmsg =='FAILED':
                            jobname= re.search(": Graph '(.+?)\'",line)
                            timevalue=re.search(r"(.+?),",line)
                            if jobname and timevalue:
                                foundjob= jobname.group(1)
                                foundtime = timevalue.group(1)
                                if foundjob =='Mirror.kjb':
                                    if ncell.p == True:
                                        logging.info('Notification Email has been already sent for job: ' + foundjob)
                                    elif ncell.p == False :
                                        ncell.p = True
                                        sendMail(foundjob,foundtime)
                                    else:
                                        logging.info("state not defined")
                                elif foundjob =="FM_job.kjb":
                                    if fm.p == True:
                                        logging.info('Notification Email has been already sent for job: ' + foundjob)
                                    elif fm.p == False:
                                        fm.p = True
                                        sendMail(foundjob,foundtime)
                                    else:
                                        logging.info('Unknown state of job')
                                else:
                                    logging.info('New job name found')

        except IOError:
            logging.info('Log file could not be found or opened')

What I actually do with each line is read its last word with a regular expression and perform a task based on that word.

The problem is that the log file (GraphLog.log) is rolled based on file size. When this happens, my program stops as well. How can I keep reading GraphLog.log, without my program terminating (and without any error), even after the log file has been rolled by file size or date?

Any help is much appreciated.

Answer 1

When the file is rotated ("rolled", as you put it), the file you're reading from is renamed or deleted and another one is created in its place. Your reads still go to the original file. (If the file was deleted, its contents remain in place until you close it.) Consequently, what you need to do is periodically (e.g. in your follow loop) check the return value of os.stat(filename).st_ino. If that has changed, you need to close the current file, reopen it, and start reading from the beginning.
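The inode check described above could be sketched roughly as follows. This is only an illustration under some assumptions: the name follow_rotating and its poll_interval and from_start parameters are made up for this example, it assumes a POSIX system where st_ino is meaningful, and it assumes rotation works by renaming or deleting the old file and creating a new one.

```python
import os
import time


def follow_rotating(path, poll_interval=0.1, from_start=False):
    """Yield lines appended to `path`, reopening it after a rotation.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    f = open(path, "r")
    try:
        if not from_start:
            f.seek(0, os.SEEK_END)  # like tail -f: skip existing content
        inode = os.fstat(f.fileno()).st_ino
        while True:
            line = f.readline()
            if line:
                yield line
                continue
            # No new data. Check whether the path now names a different file.
            try:
                if os.stat(path).st_ino != inode:
                    new_f = open(path, "r")
                    f.close()
                    f = new_f
                    inode = os.fstat(f.fileno()).st_ino
                    continue  # read the new file from its beginning
            except FileNotFoundError:
                # Mid-rotation: the old file is gone and the new one
                # has not been created yet. Keep polling.
                pass
            time.sleep(poll_interval)
    finally:
        f.close()
```

In the question's main block, loglines = follow_rotating("/opt/report/logs/GraphLog.log") would take the place of opening the file and calling follow(logfile). Note that a rotation scheme which truncates the file in place keeps the same inode, so this sketch would not detect it.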

Note that there are ways to do this more efficiently, without periodic polling, through the OS's event mechanism. See e.g. the watchdog API.
