Extract occurrences of text between brackets from a text file in Python
Log file:
INFO:werkzeug:127.0.0.1 - - [20/Sep/2018 19:40:00] "GET /socket.io/?polling HTTP/1.1" 200 -
INFO:engineio: Received packet MESSAGE, ["key",{"data":{"tag1":12,"tag2":13,"tag3": 14"...}}]
I'm interested in extracting only the text within the brackets that contain the keyword "key", not every occurrence that matches the regex pattern below.
Here is what I have tried so far:
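import re
# Read the whole log and collect the text that follows every opening bracket
with open('logfile.log', 'r') as text_file:
    matches = re.findall(r'\[([^\]]+)', text_file.read())
# Write one match per line
with open('output.txt', 'w') as out:
    out.write('\n'.join(matches))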
This outputs all of the occurrences that match the regex. The desired content of output.txt would look like this:
"key",{"data":{"tag1":12,"tag2":13,"tag3": 14"...}}
Text within square brackets that must not itself contain [ or ] can be matched with the [^][] negated character class, which matches any single character other than ] and [.
That is, you can match a whole bracketed chunk with \[[^][]*]. If the chunk must also contain some specific text, put that text after the first [^][]* and then add a second [^][]* before the closing ].
You may use
re.findall(r'\[([^][]*"key"[^][]*)]', text_file.read())
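To make the pieces of that pattern easier to read, here is a sketch of the same regex written with re.VERBOSE so each part can carry a comment. The names KEY_IN_BRACKETS and sample are only illustrative; the sample text is the two log lines from the question.

```python
import re

# Same pattern as above, spelled out with re.VERBOSE so each piece can be annotated.
# KEY_IN_BRACKETS is an illustrative name, not something from the original post.
KEY_IN_BRACKETS = re.compile(r'''
    \[          # a literal opening square bracket
    (           # capture group: the text we want back
      [^][]*    # zero or more characters other than square brackets
      "key"     # the keyword, including its double quotes
      [^][]*    # zero or more further non-bracket characters
    )           # end of the capture group
    ]           # a literal closing square bracket
''', re.VERBOSE)

sample = (
    'INFO:werkzeug:127.0.0.1 - - [20/Sep/2018 19:40:00] "GET /socket.io/?polling HTTP/1.1" 200 -\n'
    'INFO:engineio: Received packet MESSAGE, ["key",{"data":{"tag1":12,"tag2":13,"tag3": 14"...}}]'
)

matches = KEY_IN_BRACKETS.findall(sample)
print(matches)
```

The timestamp chunk is skipped because it does not contain "key". Since [^][] excludes both bracket characters, a payload that itself contains nested square brackets would not be matched by this pattern.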
See the Python demo:
import re
s = '''INFO:werkzeug:127.0.0.1 - - [20/Sep/2018 19:40:00] "GET /socket.io/?polling HTTP/1.1" 200 -
INFO:engineio: Received packet MESSAGE, ["key",{"data":{"tag1":12,"tag2":13,"tag3": 14"...}}]'''
print(re.findall(r'\[([^][]*"key"[^][]*)]', s))
Output:
['"key",{"data":{"tag1":12,"tag2":13,"tag3": 14"...}}']
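To tie this back to the file-based code in the question, the following is a minimal sketch that applies the same pattern while reading the log and writes the matches out. The function name extract_key_brackets and the constant KEY_PATTERN are hypothetical, and the sketch assumes each bracketed entry sits on a single line, so it scans line by line instead of calling read() on the whole file.

```python
import re

# Pattern taken from the answer above; KEY_PATTERN is an illustrative name.
KEY_PATTERN = re.compile(r'\[([^][]*"key"[^][]*)]')


def extract_key_brackets(in_path, out_path):
    """Collect every bracketed chunk containing "key" from in_path
    and write the chunks to out_path, one per line.

    Assumes a bracketed entry never spans more than one line.
    """
    matches = []
    with open(in_path, 'r') as text_file:
        for line in text_file:
            matches.extend(KEY_PATTERN.findall(line))
    with open(out_path, 'w') as out:
        out.write('\n'.join(matches))
    return matches
```

With the file names from the question, this would be called as extract_key_brackets('logfile.log', 'output.txt'). If entries can span several lines, running the pattern over text_file.read() as in the original attempt would be the safer choice.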