
Python: continuously tail -f a log file that gets rotated

I implemented a Python version of tail -f with the following code snippet. It works fine while my program runs continuously in the background, started with python myprogram.py &

def follow(thefile):
    thefile.seek(0,2)
    while True:
        line = thefile.readline()
        if not line:
            time.sleep(0.1)
            continue
        yield line

The file passed to the function above is a log file, which is opened in the main block:

# follow.py
# Follow a file like tail -f.

import smtplib
import time
import re
import logging

# Here are the email package modules we'll need
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from job import Job


def follow(thefile):
    thefile.seek(0,2)
    while True:
        line = thefile.readline()
        if not line:
            time.sleep(0.1)
            continue
        yield line

def sendMail(job,occurtime):
    COMMASPACE = ', '
    outer =  MIMEMultipart()
    # msg = MIMEMultipart('alternative')
    outer['Subject'] = 'ETL Failed for job:' + job
    outer['From'] = 'eltmonitor@fms.com'
    me=  'eltmonitor@ncellfms.com'
    family = ["bibesh.pokhrel@huawei.com"]
    outer['To'] = COMMASPACE.join(family)

    inner = MIMEMultipart('alternative')
    html = """\
        <html>
          <head></head>
          <body>
            <p>Dears,<br>
               Please take necessary action to troubleshoot the ETL Error for job:""" + job + " at " + occurtime + """
            </p>
          </body>
        </html>
        """

# Record the MIME types of both parts - text/plain and text/html.
    part2 = MIMEText(html, 'html')

# Attach parts into message container.
# According to RFC 2046, the last part of a multipart message, in this case
# the HTML message, is best and preferred.
    inner.attach(part2)
    outer.attach(inner)

# Connect to SMTP server and send the email
# Parameter are from=me, to=family and outer object as string for message body
    try:
        s = smtplib.SMTP('localhost')
        s.sendmail(me,family,outer.as_string())
        s.quit()
    except smtplib.SMTPException:
        logging.info('Unable to send email')

if __name__ == '__main__':
    while True:
        logging.basicConfig(filename='/opt/etlmonitor/monitor.log',format='%(asctime)s %(levelname)s %(message)s',level=logging.DEBUG, filemode='w')
# Define two ETL Job object to store the state of email sent as boolean flag
        fm =Job()
        ncell =Job()
        try:
            with open("/opt/report/logs/GraphLog.log","r") as logfile:
            # Continually read the log files line by line

                loglines = follow(logfile)

            # Do something with the line
                for line in loglines:
            # Extract the last word in the line of log file
            # We are particularly looking for SUCCESS or FAILED word
            # Warning!! leading whitespace character is also matched
                    etlmsg= re.search(r".*(\s(\w+)$)",line)
                    if etlmsg:
            # Remove leading whitespace
                        foundmsg = etlmsg.group(1).lstrip()
            # Process on the basis of last word
            # If it is SUCCESS , set the job mailsent flag to False so that no email is sent
            # If it is FAILED and mailsent flag of job is False, send a email and set mailsent flag to True
            # If it is FAILED and mailsent flag of job is True, do nothing as email was already sent
                        if foundmsg=='SUCCESS':
                            jobname= re.search(": Graph '(.+?)\'",line)
                            if jobname:
                                foundjob= jobname.group(1)
                                if foundjob =='Mirror.kjb':
                                    logging.info('Ncell Mirror job detected SUCCESS')
                                    ncell.p = False
                                elif foundjob =='FM_job.kjb':
                                    fm.p = False
                                    logging.info('Ncell Report job detected SUCCESS')
                                else:
                                    logging.info('No job name defined for success message')

                        elif foundmsg =='FAILED':
                            jobname= re.search(": Graph '(.+?)\'",line)
                            timevalue=re.search(r"(.+?),",line)
                            if jobname and timevalue:
                                foundjob= jobname.group(1)
                                foundtime = timevalue.group(1)
                                if foundjob =='Mirror.kjb':
                                    if ncell.p == True:
                                        logging.info('Notification Email has been already sent for job: ' + foundjob)
                                    elif ncell.p == False :
                                        ncell.p = True
                                        sendMail(foundjob,foundtime)
                                    else:
                                        logging.info("state not defined")
                                elif foundjob =="FM_job.kjb":
                                    if fm.p == True:
                                        logging.info('Notification Email has been already sent for job: ' + foundjob)
                                    elif fm.p == False:
                                        fm.p = True
                                        sendMail(foundjob,foundtime)
                                    else:
                                        logging.info('Unknown state of job')
                                else:
                                    logging.info('New job name found')

        except IOError:
            logging.info('Log file could not be found or opened')

What I am actually doing with each line is extracting its last word with a regular expression and performing a task based on that word.

The problem is that the log file (GraphLog.log) is rolled over based on file size. When this happens, my program also stops. How can I keep reading GraphLog.log, without my program terminating (and without an error), even after the log file has been rolled over by file size or date?

Any help is much appreciated.

When the file is rotated ("rolled", as you put it), the file you're reading from is renamed or deleted and a new one is created in its place. Your reads still go to the original file. (If the file was deleted, its contents remain in place until you close it.) Consequently, you need to check the return value of os.stat(filename).st_ino periodically (e.g. in your follow loop). If it has changed, close the current file, reopen it by name, and start reading from the beginning.
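A minimal sketch of that inode check, assuming a POSIX-style filesystem where a rotated log gets a new inode. The function name follow_rotating and the parameters poll_interval and from_start are made up for illustration and are not part of the question's code. It does not handle rotation by truncation in place (where the inode stays the same).

```python
import os
import time


def follow_rotating(path, poll_interval=0.1, from_start=False):
    """Yield lines appended to `path`, reopening it after a rotation.

    Hypothetical helper: like the question's follow(), but it takes a
    file name instead of an open file so that it can reopen the file.
    """
    current = open(path, "r")
    if not from_start:
        # Behave like tail -f: skip what is already in the file.
        current.seek(0, os.SEEK_END)
    inode = os.fstat(current.fileno()).st_ino
    try:
        while True:
            line = current.readline()
            if line:
                yield line
                continue
            # No new data. Check whether `path` now names a different file.
            try:
                if os.stat(path).st_ino == inode:
                    time.sleep(poll_interval)
                    continue
                new_file = open(path, "r")
            except FileNotFoundError:
                # Rotation in progress: the new file does not exist yet.
                time.sleep(poll_interval)
                continue
            # The old file has been read to its end; switch to the new
            # one and read it from the beginning.
            current.close()
            current = new_file
            inode = os.fstat(current.fileno()).st_ino
    finally:
        current.close()
```

In the question's main block, this would replace the with open(...) / follow(logfile) pair with something like for line in follow_rotating("/opt/report/logs/GraphLog.log"):. Because the rotation check only runs once readline() returns nothing, any lines still unread in the old file are delivered before the switch.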

Note that there are ways to do this more efficiently, without periodic polling, through the OS's file-event mechanism. See e.g. the watchdog library.
