Python - Extract strings from a log file and write them into another file
I have a log file like the following:
sw2 switch_has sw2_p3.
sw1 transmits sw2_p2
/* BUG: axiom too complex: SubClassOf(ObjectOneOf([NamedIndividual(#t_air_sens2)]),DataHasValue(DataProperty(#qos_type),^^(latency,http://www.xcx.org/1900/02/22-rdf-syntax-ns#PlainLiteral))) */
/* BUG: axiom too complex: SubClassOf(ObjectOneOf([NamedIndividual(#t_air_sens2)]),DataHasValue(DataProperty(#topic_type),^^(periodic,http://www.xcx.org/1901/11/22-rdf-syntax-ns#PlainLiteral))) */
...
I'm interested in extracting specific words from the /* BUG... lines and writing them to a separate file, like this:
t_air_sens2 qos_type latency
t_air_sens2 topic_type periodic
...
I can do this in the shell with the help of awk and a regular expression, as follows:
awk -F'#|\\^\\^\\(' '{for (i=2; i<NF; i++) printf "%s%s", gensub(/[^[:alnum:]_].*/,"",1,$i), (i<(NF-1) ? OFS : ORS) }' output.txt > ./LogErrors/Properties.txt
How can I extract them using Python? (Should I use regular expressions again, or..?)
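You can certainly use regular expressions. I would read the file line by line, grab the lines that start with '/* BUG:', and then parse those lines as needed.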
import re

target = '/* BUG:'
bugs = []
with open('logfile.txt', 'r') as infile, open('output.txt', 'w') as outfile:
    # loop through the log file
    for line in infile:
        if line.startswith(target):
            # keep the raw line (newline stripped) in case it is needed later
            bugs.append(line.strip())
            # or just do the regex parsing here:
            # parentheses create capture groups; literal '(', ')', ']' and '^' are escaped with '\'
            # the '#' sits outside each group so it is not written to the output
            match = re.search(r'NamedIndividual\(#(\w+)\)\]\),DataHasValue\(DataProperty\(#(\w+)\),\^\^\((\w+),', line)
            # if the line matches, write the captured groups separated by spaces
            if match:
                outfile.write(' '.join(match.groups()) + '\n')
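As a variation on the answer above, a more generic pattern can pick up every word that follows "(#" or "^^(" on a BUG line, which is roughly what the awk one-liner in the question does by splitting on those separators. This is only a sketch: the helper name extract_terms and the sample list are illustrative and not part of the original answer, and the pattern assumes the wanted names always appear directly after "(#" or "^^(".

```python
import re

# Capture the word that follows either "(#" or "^^(".
# Requiring "(" before "#" skips the "#PlainLiteral" fragment inside the URL.
TOKEN = re.compile(r'(?:\(#|\^\^\()(\w+)')

def extract_terms(line):
    """Return the captured words for a BUG line, or [] for any other line (hypothetical helper)."""
    if not line.startswith('/* BUG:'):
        return []
    return TOKEN.findall(line)

# Sample lines copied from the log in the question.
sample = [
    'sw1 transmits sw2_p2',
    '/* BUG: axiom too complex: SubClassOf(ObjectOneOf([NamedIndividual(#t_air_sens2)]),'
    'DataHasValue(DataProperty(#qos_type),^^(latency,'
    'http://www.xcx.org/1900/02/22-rdf-syntax-ns#PlainLiteral))) */',
]

for entry in sample:
    terms = extract_terms(entry)
    if terms:
        print(' '.join(terms))
```

Because the pattern does not depend on the exact SubClassOf(...) layout, it should keep working if the axiom structure around the names changes, at the cost of being less strict about what counts as a match.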
Python has a very powerful regular expression module built in: the re module.
Note: raw strings (r'xxxx') treat backslashes literally, so regex escapes such as \( can be written without doubling the backslash.
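To make the note on raw strings concrete, here is a small sketch comparing the same pattern written as a normal string and as a raw string. The variable names and the test input are illustrative only.

```python
import re

# In a normal string each backslash has to be doubled to reach the regex engine;
# in a raw string it is written once.
plain = '\\^\\^\\((\\w+),'
raw = r'\^\^\((\w+),'

# Both literals produce exactly the same pattern text.
same = plain == raw

# The pattern captures the word between "^^(" and the following comma.
value = re.search(raw, 'DataProperty(#qos_type),^^(latency,').group(1)
```

The raw form is usually easier to read and to compare against the text being matched.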
I tried the following approach to get specific lines from the log file.
target = ['BUG']  # list of words to look for ('BUG' matches the log in the question)
with open('demo.log', 'r') as infile, open('output.txt', 'w') as outfile:
    for line in infile:
        # write the line once if it contains any of the target words
        if any(phrase in line for phrase in target):
            outfile.write(line)
This writes every line that contains one of the words in target to output.txt.
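The same filtering step can also be written as a small generator that accepts any iterable of lines, which makes it easy to try without touching files. This is a sketch: lines_containing is a hypothetical name, and the sample lines are shortened from the log in the question.

```python
def lines_containing(lines, phrases):
    """Yield each line that contains at least one of the phrases."""
    for line in lines:
        if any(phrase in line for phrase in phrases):
            yield line

# Shortened sample input; a file object opened for reading works the same way.
log = [
    'sw2 switch_has sw2_p3.\n',
    '/* BUG: axiom too complex: ... */\n',
]
matches = list(lines_containing(log, ['BUG']))
```

Passing an open file object as lines and writing each yielded line to the output file gives the same result as the loop above.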