
python working with files as they are written

So I'm trying to create a little script to deal with some logs. I'm just learning Python, but I know about loops and such in other languages. It seems that I don't quite understand how loops work in Python.

I have a raw log from which I'm trying to isolate just the external IP addresses. An example line:

05/09/2011 17:00:18 192.168.111.26 192.168.111.255 Broadcast packet dropped udp/netbios-ns 0 0 X0 0 0 N/A

And here's the code I have so far:

import os,glob,fileinput,re

def parseips():
    f = open("126logs.txt",'rb')
    r = open("rawips.txt",'r+',os.O_NONBLOCK)

    for line in f:
        rf = open("rawips.txt",'r+',os.O_NONBLOCK)
        ip = line.split()[3]
        res=re.search('192.168.',ip)
        if not res:
            rf.flush()
            for line2 in rf:
                if ip not in line2:
                    r.write(ip+'\n')
                    print 'else write'
                else:
                    print "no"
    f.close()
    r.close()
    rf.close()  

parseips()

I have it parsing out the external IPs just fine. But, thinking like a ninja, I thought: how cool would it be to handle dupes? The idea was that I could check the current IP against the file the IPs are being written to, and if there is a match, not write it. But this produces many times more dupes than before :) I could probably use something else, but I'm liking Python and it makes me look busy.

Thanks for any insider info.

DISCLAIMER: Since you are new to Python, I am going to try to show off a little, so you can look up some interesting "Python things".

I'm going to print all the external IPs to the console:

def parseips():
    with open("126logs.txt", 'r') as f:
        for line in f:
            # Field 3 of each log line is the destination address.
            ip = line.split()[3]
            # External means anything outside 192.168.x.x.
            if not ip.startswith('192.168.'):
                print(ip)

You might also want to look into:

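with open("126logs.txt", 'r') as f:
    # Split each line once, then keep field 3 when it is not a 192.168.x.x address.
    IPs = [fields[3] for fields in (line.split() for line in f)
           if not fields[3].startswith('192.168.')]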
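
A list built this way can still contain the same address many times. A minimal sketch of one way to drop the repeats while keeping first-seen order, assuming Python 3.7 or later (where dicts preserve insertion order); the `dedupe` helper and the sample addresses are made up for illustration:

```python
def dedupe(items):
    """Return items with repeats removed, keeping first-seen order."""
    # dict keys are unique, and dicts preserve insertion order in Python 3.7+.
    return list(dict.fromkeys(items))


# Hypothetical sample values, not taken from the real log.
IPs = ["203.0.113.5", "198.51.100.7", "203.0.113.5"]
unique_ips = dedupe(IPs)
print(unique_ips)
```

On older Pythons the order is not guaranteed, so there a set plus an explicit loop is the safer choice.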

Hope this helps. Enjoy Python!

Something along the lines of this might do the trick:

import os

def parseips():
    prefix = '192.168.'
    # Preload the IPs already recorded in the output file, if it exists.
    if os.path.exists('rawips.txt'):
        with open('rawips.txt', 'rt') as f:
            # strip() drops the trailing newline so membership tests match.
            seen_ips = set(line.strip() for line in f)
    else:
        seen_ips = set()

    # A single with statement can open both files; 'at' appends to the output.
    with open('126logs.txt', 'rt') as infile, open('rawips.txt', 'at') as outfile:
        for line in infile:
            ip = line.split()[3]
            # Write only external (non-192.168.x.x) addresses not seen before.
            if not ip.startswith(prefix) and ip not in seen_ips:
                seen_ips.add(ip)
                outfile.write(ip + '\n')

parseips()

Rather than looping through the file you're writing, you might try just using a set. It might consume more memory, but your code will be much nicer, so it's probably worth it unless you run into an actual memory constraint.
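
As a rough sketch of that idea, the function below takes any iterable of log lines and yields each external IP the first time it appears. The name `unique_external_ips`, the `prefix` and `column` parameters, and the sample lines are assumptions for illustration; the column index follows the question's `line.split()[3]`.

```python
def unique_external_ips(lines, prefix="192.168.", column=3):
    """Yield each non-internal IP from `lines` once, in first-seen order."""
    seen = set()
    for line in lines:
        fields = line.split()
        if len(fields) <= column:
            continue  # skip blank or short lines
        ip = fields[column]
        if ip.startswith(prefix) or ip in seen:
            continue  # internal address, or already reported
        seen.add(ip)
        yield ip


# Hypothetical log lines in the same layout as the question's example.
sample = [
    "05/09/2011 17:00:18 192.168.111.26 192.168.111.255 Broadcast packet dropped",
    "05/09/2011 17:00:19 192.168.111.26 203.0.113.5 Web traffic",
    "05/09/2011 17:00:20 192.168.111.30 203.0.113.5 Web traffic",
    "05/09/2011 17:00:21 192.168.111.30 198.51.100.7 Web traffic",
]
result = list(unique_external_ips(sample))
print(result)
```

Since an open file is itself an iterable of lines, the same function should work when handed the log file object directly, and the set lookup keeps each membership check cheap compared with rereading the output file.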

Assuming you're just trying to avoid duplicate external IPs, consider creating an additional data structure in order to keep track of which IPs have already been written. Since they're in string format, a dictionary would be good for this.

externalIPDict = {}
# code to detect external IPs goes here; when you get one:
if externalIPString in externalIPDict:
    pass  # do nothing, you found a dupe
else:
    # key on the IP string itself (not on the dict) to mark it as seen
    externalIPDict[externalIPString] = 1
    # your code to add the external IP to your file goes here
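
One thing a dictionary offers over a set is a value slot per IP. A small variation on the snippet above, offered only as a sketch: store a running count instead of a constant 1, so you also learn how often each address showed up. The function name `count_ips` and the sample addresses are hypothetical.

```python
def count_ips(ips):
    """Map each IP to the number of times it appears in `ips`."""
    counts = {}
    for ip in ips:
        # dict.get returns 0 the first time an IP is seen.
        counts[ip] = counts.get(ip, 0) + 1
    return counts


# Hypothetical sample values, not taken from the real log.
external_ip_counts = count_ips(["203.0.113.5", "198.51.100.7", "203.0.113.5"])
print(external_ip_counts)
```

The dictionary's keys are then the de-duplicated addresses you would write to the file.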
