
Reading in data from a log file in Python

I am trying to parse log entries that follow a recurring pattern and write each entry into its own file using Python. All log entries have the general format:

ProcessID= abc
. . . . .
Size=76 bytes
EOE
------------------------------------------------------------------------
StartTime=abc
. . . . .
Size=76 bytes
EOE
------------------------------------------------------------------------
DifferentParameter=abc
. . . . .
Size=76 bytes
EOE
------------------------------------------------------------------------

Each entry has a different number of parameters. Essentially, I need to parse only 2 of the parameters and map them together, but not every entry has both. My first goal is therefore to split the entries into separate files (unless someone knows a better way to split the entries), and then process each entry further using regex or something similar.

So far I have the following bit of code to try to parse 10 log entries, but I'm not entirely sure how to handle the case where it finds the EOE line and then has to move on to the next line.

rf = open('data.txt', 'r')
lines = rf.readLine()
rf.close()

i = 0
while i != 10:
    for line in lines:
        while(line.find('EOE') == -1):
            with open('data'+(i)+'.txt', 'w') as wf:
                wf.write(line)
                file.seek(1,1)
        i+=1
rf.close()

In my opinion, the log file itself poses some problems. You are trying to split on EOE, but doing that produces some files that begin with the "-----" line and others that do not (in particular, the "ProcessID" section will not have the "----" at the beginning). So why not split on the "----" line instead? The second problem is empty lines: here too, some files will start with empty lines and others will not. This needs to be taken into account.
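To make the two problems concrete, here is a small sketch that splits a made-up sample both ways. The sample text, its parameter values, the shortened dashed line, and the blank line after the first separator are all assumptions for illustration, not taken from the real log.

```python
# A tiny stand-in for the log; values and the blank line are assumed.
sample = (
    "ProcessID=abc\n"
    "Size=76 bytes\n"
    "EOE\n"
    "--------\n"
    "\n"
    "StartTime=abc\n"
    "Size=76 bytes\n"
    "EOE\n"
    "--------\n"
)

# Splitting on the EOE marker: every chunk after the first begins with
# the dashed separator that followed the previous entry.
by_eoe = sample.split("EOE\n")

# Splitting on the dashed line: the chunks contain no dashes, but the
# second one still begins with the blank line after the separator.
by_dashes = sample.split("--------\n")

print(repr(by_eoe[0]))
print(repr(by_eoe[1]))
print(repr(by_dashes[1]))
```

The first chunk from the EOE split starts with "ProcessID", while the second starts with dashes, which is the inconsistency described above. The dashed split removes that, but leaves the blank line to be filtered out separately.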

I tried to solve all these problems and produce files that share the same format, contain no empty lines, and start with the line that contains the "=".

I called the input file "log_stack.txt".

Here is my humble solution:

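with open("log_stack.txt") as f:
    read_data = f.readlines()

entry = ""    # text of the entry currently being collected
counter = 1   # number used in the next output file name

for line in read_data:
    if "---" not in line:
        # Ordinary line: keep it unless it is blank.
        if line.strip():
            entry += line
    elif entry:
        # Separator line: write the finished entry to its own file.
        with open("file_{}.txt".format(counter), "w") as out:
            out.write(entry)
        entry = ""
        counter += 1

# Write the last entry if the log does not end with a separator line.
if entry:
    with open("file_{}.txt".format(counter), "w") as out:
        out.write(entry)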
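If the end goal is only to map two parameters together, the intermediate files may not be needed. The following is a minimal sketch of doing the split and the lookup in memory. It assumes each parameter sits on its own line in Name=value form and that entries are separated by a line of three or more dashes. The helper names (split_entries, parse_entry, map_parameters), the choice of ProcessID and StartTime as the two parameters, and the sample values are all hypothetical.

```python
import re

def split_entries(text):
    """Split log text on dashed separator lines and drop blank lines."""
    chunks = re.split(r"^-{3,}[ \t]*$", text, flags=re.MULTILINE)
    entries = []
    for chunk in chunks:
        lines = [ln for ln in chunk.splitlines() if ln.strip()]
        if lines:
            entries.append(lines)
    return entries

def parse_entry(lines):
    """Turn 'Name=value' lines into a dict; lines without '=' are skipped."""
    fields = {}
    for ln in lines:
        if "=" in ln:
            name, value = ln.split("=", 1)
            fields[name.strip()] = value.strip()
    return fields

def map_parameters(text, key_name, value_name):
    """Map key_name to value_name for entries that contain both."""
    mapping = {}
    for lines in split_entries(text):
        fields = parse_entry(lines)
        if key_name in fields and value_name in fields:
            mapping[fields[key_name]] = fields[value_name]
    return mapping

# Made-up sample: the middle entry has no ProcessID and is skipped.
sample = (
    "ProcessID= 17\nStartTime=09:00\nSize=76 bytes\nEOE\n"
    "--------\n"
    "StartTime=09:05\nSize=76 bytes\nEOE\n"
    "--------\n"
    "ProcessID= 42\nStartTime=09:10\nSize=80 bytes\nEOE\n"
    "--------\n"
)
result = map_parameters(sample, "ProcessID", "StartTime")
print(result)
```

Entries that lack either parameter are simply left out of the mapping, which handles the case where not every entry has both. For a real log, the text could be read with open("log_stack.txt").read() and passed to map_parameters in place of the sample.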
