
How to read and truncate the snmptrapd log file without restarting the daemon

I have written a Python script that performs a Nagios check. The script is simple: it parses a log and matches some information, which it uses to construct the Nagios check output. The log is an snmptrapd log, which records the traps sent by other servers in /var/log/snmptrapd; I then parse it with the script. To see only the latest traps, I erase the log from Python each time after reading it. To preserve the information, I have a cron job that copies the content of the log into another log at an interval slightly shorter than the Nagios check interval. What I don't understand is why the log grows so much (the messages log, which I'd guess holds 1000 times more information, is smaller). The log contains a lot of special characters like ^@, and I think they come from the way I'm manipulating the file from Python, but with only about three weeks of Python experience I can't figure out the problem.

The script code is the following:

import sys, os, re

validstring = "OK"
filename = "/var/log/snmptrapd.log"

if os.stat(filename)[6] == 0:
        print validstring
        sys.exit()

else:
        f = open(filename,"r")
        sharestring = ""
        line1 = []
        patte0 = re.compile("[0-9]+-[0-9]+-[0-9]+")
        patte2 = re.compile("NG: [a-zA-Z\s=0-9]+.*")
        for line in f:
                line1 = line.split(" ")
                if re.search(patte0,line1[0]):
                        sharestring = sharestring + line1[1] + " "
                        continue
                result2 = re.search(patte2,line)
                if result2:
                        result22 = result2.group()
                        result22 = result22.replace("NG:","")
                        sharestring = sharestring + result22 + " "
        f.close()
        f1 = open(filename,"w")
        f1.close()
        print sharestring
        sys.exit(2)


The log looks like:

2012-07-11 04:17:16 Some IP(via UDP: [this is an ip]:port) TRAP, SNMP v1, community somestring
    SNMPv2-SMI::enterprises.OID Some info which is not necessary
    SNMPv2-MIB::sysDescrOID = STRING: info which i'm matching

I'm pretty sure it has something to do with the way I erase the file, but I can't figure it out. If you have any ideas, I would be really interested. Thank you.

For reference, the log has 93 lines (according to Vim) but occupies 161K, which can't be right because the lines are quite short.

OK, it has nothing to do with the way I read and erase the file. Something in the snmptrapd daemon does this when I erase its log file. I have modified my code so that it sends SIGSTOP to snmptrapd right before opening the file, makes its modifications, and sends SIGCONT when done, but I still see the same behavior. The changed parts of the new code look like this:

else:
    command = "pidof snmptrapd"
    p=subprocess.Popen(shlex.split(command),stdout=subprocess.PIPE)
    pidstring = p.stdout.readline()
    patte1 = re.compile("[0-9]+")
    pidnr = re.search(patte1,pidstring)
    pid = pidnr.group()
    os.kill(int(pid), SIGSTOP)
    time.sleep(0.5)
    f = open(filename,"r+")
    sharestring = ""

and

                  sharestring = sharestring + result22 + " "
    f.truncate(0)
    f.close()
    time.sleep(0.5)
    os.kill(int(pid), SIGCONT)
    print sharestring

I'm thinking of stopping the daemon, erasing the file, recreating it with the proper permissions, and then starting the daemon again.

I don't think you can, but here are some things to try

Truncating a File

f1 = open(filename, 'w')
f1.close()

is a hacky way of deleting a file's contents as a side effect of opening it, and it will probably cause undesired behavior, depending on the underlying OS, if other applications have that file open.
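One likely explanation for the ^@ characters (NUL bytes) and the inflated size, assuming snmptrapd keeps its log open without append mode, is that the daemon's file offset survives the truncation. Its next write lands at the old offset and the gap before it reads back as zeros. The sketch below reproduces this with two handles in one process; `demo_truncate_gap` and the sample lines are made up for illustration and say nothing about how snmptrapd is actually implemented.

```python
import os
import tempfile


def demo_truncate_gap(path):
    """Return the file's bytes after a non-append writer keeps writing post-truncation."""
    # Stands in for the daemon: opened once, not in append mode.
    writer = open(path, "wb")
    writer.write(b"old trap line\n")   # 14 bytes, so the offset is now 14
    writer.flush()

    # Stands in for the check script: opening with "w" empties the file.
    with open(path, "wb"):
        pass

    # The writer's offset is unchanged, so this write starts at byte 14.
    writer.write(b"new trap line\n")
    writer.flush()
    writer.close()

    with open(path, "rb") as reader:
        return reader.read()


if __name__ == "__main__":
    fd, tmp_path = tempfile.mkstemp()
    os.close(fd)
    try:
        print(repr(demo_truncate_gap(tmp_path)))
    finally:
        os.remove(tmp_path)
```

On a typical Linux filesystem the result is 14 NUL bytes followed by the new line, which is what Vim displays as ^@. A writer opened in append mode would not show this, because every append goes to the current end of the file.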

Using the File Object method truncate()

truncate([size])

Truncate the file's size. If the optional size argument is present, the file is truncated to (at most) that size. The size defaults to the current position. The current file position is not changed. Note that if a specified size exceeds the file's current size, the result is platform-dependent: possibilities include that the file may remain unchanged, increase to the specified size as if zero-filled, or increase to the specified size with undefined new content. Availability: Windows, many Unix variants.
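As a minimal sketch of the documented behavior, the helper below reads a file and empties it through the same handle. The name `read_and_truncate` is hypothetical. Because the size defaults to the current position, the code seeks to 0 first. Note that this only tidies up the reader's side: a daemon that holds the file open without append mode will still write at its old offset afterwards.

```python
def read_and_truncate(path):
    """Read all lines, then empty the file through the same handle."""
    with open(path, "r+") as log:
        lines = log.readlines()
        log.seek(0)        # truncate() defaults to the current position
        log.truncate()     # file size is now 0
    return lines
```

Lines written by another process between `readlines()` and `truncate()` would be lost, so this is not safe against a concurrently running writer.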

Probably the only deterministic way to do this is to

stop the snmptrapd process at the start of the script, delete the file with the proper os module function, remove, then recreate the file and restart the snmptrapd daemon at the end of the script.

os.remove(path)

Remove (delete) the file path. If path is a directory, OSError is raised; see rmdir() below to remove a directory. This is identical to the unlink() function documented below. On Windows, attempting to remove a file that is in use causes an exception to be raised; on Unix, the directory entry is removed but the storage allocated to the file is not made available until the original file is no longer in use.
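The stop, remove, recreate, restart sequence could be sketched roughly as follows. The function name `rotate_log` and the default mode are assumptions, and the stop and start commands are passed in by the caller because they depend on how snmptrapd is managed on the host (init script, service manager, and so on).

```python
import os
import subprocess


def rotate_log(path, stop_cmd, start_cmd, mode=0o640):
    """Stop the writer, collect the log's lines, replace the file, restart the writer."""
    subprocess.check_call(stop_cmd)        # caller-supplied command, e.g. an init script
    try:
        with open(path, "r") as log:
            lines = log.readlines()
        os.remove(path)
        # Recreate an empty file; the effective permissions are mode minus the umask.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, mode)
        os.close(fd)
    finally:
        subprocess.check_call(start_cmd)   # always try to bring the daemon back
    return lines
```

Traps that arrive while the daemon is stopped are not logged, and the recreated file is owned by whoever runs the script, so ownership may need adjusting with `os.chown` if the daemon runs as a different user.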

Shared resource concern

You might still have problems with two processes fighting to write to a single file without some kind of locking mechanism, with non-deterministic things happening to the file. I bet you can send a signal to your daemon process (SIGHUP is the conventional choice for this; SIGINT would normally terminate it) and get it to re-open the file or something similar; check your documentation.

Manipulating shared resources, especially file resources, without exclusive locking is going to be trouble, especially with filesystem caching and application caching of data.
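If the daemon does support re-opening its log on a signal, the script can avoid truncating a file that someone else is writing to: move the log aside, signal the daemon, and read the moved copy. The sketch below assumes SIGHUP triggers a re-open, which should be confirmed in the snmptrapd documentation; `rotate_by_rename` and the `.1` suffix are hypothetical names.

```python
import os
import signal


def rotate_by_rename(path, pid, sig=signal.SIGHUP):
    """Move the log aside, signal the writer, and return the moved file's lines."""
    rotated = path + ".1"        # hypothetical name for the moved file
    os.rename(path, rotated)     # the daemon's open handle follows the rename
    os.kill(pid, sig)            # assumption: the daemon re-opens its log on this signal
    with open(rotated, "r") as log:
        return log.readlines()
```

Until the daemon reacts to the signal it keeps writing to the moved file, so a trap that arrives in that window may be appended after the script has already read it.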

