
Text file parsing with Python: extracting lines that match a unique pattern

I am trying to parse a series of messages from a text file and save them as txt files using Python (2.7.3, but any other Python version is fine).

I have a txt file, this.txt, that looks like this:

[#11:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
INFO isn't NULL
[#12:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]
[#13:3][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
PERFECT isn't NULL
[#4:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
Time is here [Tick:135055] , Time:  17, index: 608, CastedType:20002, area :0
[#15:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
[#16:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]
[#17:3][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
[#8:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
time is here [Tick:135055] , Time:  17, index: 608, CastedType:20002, area :0
[#16:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
[#14:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]
[#18:3][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
[#6:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
Time is here [Tick:135055] , Time:  17, index: 608, CastedType:20002, area :0

These are all the line formats the txt file contains. Each format repeats throughout the file and has its own fixed pattern, as shown above; the keywords [INFO] and [PERFECT] never change within a given message pattern. Treat each line as a new message, so a new message starts on every line.

What I am trying to implement in Python is a function that reads the txt file line by line and dumps every line of this particular type:

[#12:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]

to another txt file. If I then open that txt file, I should see only lines of this type:

[#12:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]

After extracting these messages from the input txt file, I need to read the newly generated file line by line, take the load index value from each message, and dump those values into another txt file that contains only the load index values.

So for my example above I should get the following:

Given txt file (this.txt, the input):

[#11:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
INFO isn't NULL
[#12:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]
[#13:3][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
PERFECT isn't NULL
[#4:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
Time is here [Tick:135055] , Time:  17, index: 608, CastedType:20002, area :0
[#15:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
[#16:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]
[#17:3][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
[#8:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
time is here [Tick:135055] , Time:  17, index: 608, CastedType:20002, area :0
[#16:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
[#14:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]
[#18:3][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
[#6:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
Time is here [Tick:135055] , Time:  17, index: 608, CastedType:20002, area :0

Results/output of the function:

  1. A generated txt file containing every line that matches the pattern explained above, i.e. every message/line that contains [PERFECT]:

    [#12:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]
    [#16:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]
    [#14:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]

  2. A second generated txt file for the load index values. In my case the value sits inside the [ ] that follows the words load index (load index[value]), so the function should write the load index values as a column into that second file:

    1
    1
    1

How can I parse a text file in Python that contains these patterns and message lines?

In simple words: I want to go line by line (message by message) over the given txt file and write every message that has the keyword [PERFECT], brackets included, into a new txt file, so that the new file contains only [PERFECT] messages. Then I want to loop over each message in that new file and get the load index[value] value that appears in it; in my case that is 1 1 1, since load index[1] appears in all three messages. The load index values should be dumped as a column into another new txt file.

Thanks a lot for any help!

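import re

def get_statuses(s, t):
    """Return one dict per message line in s whose status equals t (e.g. "PERFECT")."""
    statuses = []
    for line in s.splitlines():
        # Message lines start with "[#"; skip free-text lines such as "INFO isn't NULL".
        if not line.startswith("[#"):
            continue
        # "[#12:25][PERFECT][0x0015a] text" -> bracketed meta block and message text.
        meta, content = line.split(" ", 1)
        time, status, code = meta.split("][")
        if status != t:
            continue
        # Strip the leading "[#" from the time and the trailing "]" from the code.
        time, code = time[2:], code[:-1]
        # Guard against lines without an index[...] field instead of raising.
        match = re.search(r'index\[(\d+)\]', content)
        statuses.append({
            'time': time, 'code': code, 'content': content,
            'index': match.group(1) if match else None,
        })
    return statuses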

Called with the contents of the sample file as s and "PERFECT" as t, it returns:

[{'time': '12:25',
  'code': '0x0015a',
  'content': 'process returned as NULL load index[1] , length[20] , type[0]',
  'index': '1'},
 {'time': '16:25',
  'code': '0x0015a',
  'content': 'process returned as NULL load index[1] , length[20] , type[0]',
  'index': '1'},
 {'time': '14:25',
  'code': '0x0015a',
  'content': 'process returned as NULL load index[1] , length[20] , type[0]',
  'index': '1'}]
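The question asks for two plain-text files rather than a list of dicts, so here is a minimal sketch of that variant. It assumes every message line has the shape [#time][STATUS][code] text and that the value of interest always appears as load index[N]. The names filter_messages and dump, and the output file names in the usage note, are made up for illustration and do not come from the question.

```python
import re

# Assumed line shape: "[#time][STATUS][code] text".
# Anything else (e.g. "INFO isn't NULL") does not match and is skipped.
MESSAGE_RE = re.compile(r"^\[#([^\]]+)\]\[([^\]]+)\]\[([^\]]+)\]\s*(.*)$")
LOAD_INDEX_RE = re.compile(r"load index\[(\d+)\]")


def filter_messages(lines, status="PERFECT"):
    """Return (matching_lines, load_index_values) for the given status."""
    matched, indexes = [], []
    for raw in lines:
        line = raw.rstrip("\n")
        m = MESSAGE_RE.match(line)
        if m is None or m.group(2) != status:
            continue
        matched.append(line)
        # Only record an index if the message text really contains one.
        idx = LOAD_INDEX_RE.search(m.group(4))
        if idx is not None:
            indexes.append(idx.group(1))
    return matched, indexes


def dump(in_path, messages_path, index_path, status="PERFECT"):
    """Write the matching lines and the load index column to two files."""
    with open(in_path) as src:
        matched, indexes = filter_messages(src, status)
    with open(messages_path, "w") as out:
        out.writelines(line + "\n" for line in matched)
    with open(index_path, "w") as out:
        out.writelines(value + "\n" for value in indexes)
```

A call such as dump("this.txt", "perfect.txt", "load_index.txt") would then produce both files in one pass over the input, so the intermediate file does not have to be read back in.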

You can pass the function's output to csv.DictWriter().
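As a rough sketch of that idea, the helper below writes a list of such dicts as CSV. The helper name statuses_to_csv and the column order are assumptions made for illustration, and the single row is copied from the output shown above so the block runs on its own.

```python
import csv
import io

# Column order is an arbitrary choice; the keys match the dicts shown above.
FIELDNAMES = ["time", "code", "content", "index"]


def statuses_to_csv(statuses, out):
    """Write the status dicts to the open text file object `out` as CSV."""
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(statuses)


# One sample row taken from the output above.
rows = [{
    "time": "12:25",
    "code": "0x0015a",
    "content": "process returned as NULL load index[1] , length[20] , type[0]",
    "index": "1",
}]

buffer = io.StringIO()
statuses_to_csv(rows, buffer)
print(buffer.getvalue())
```

To write to disk on Python 3, open the target with open(path, "w", newline="") and pass that file object instead of the StringIO buffer. The content field contains commas, so the csv module quotes it automatically.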
