Python - 从日志文件中提取字符串并将它们写入另一个文件

Question

我有一个像下面这样的日志文件：

sw2 switch_has sw2_p3.
sw1 transmits sw2_p2
/* BUG: axiom too complex: SubClassOf(ObjectOneOf([NamedIndividual(#t_air_sens2)]),DataHasValue(DataProperty(#qos_type),^^(latency,http://www.xcx.org/1900/02/22-rdf-syntax-ns#PlainLiteral))) */
/* BUG: axiom too complex: SubClassOf(ObjectOneOf([NamedIndividual(#t_air_sens2)]),DataHasValue(DataProperty(#topic_type),^^(periodic,http://www.xcx.org/1901/11/22-rdf-syntax-ns#PlainLiteral))) */
...

我感兴趣的是从/* BUG...行中提取特定单词并将它们写入单独的文件中，如下所示：

t_air_sens2 qos_type latency
t_air_sens2 topic_type periodic
...

我可以在 shell 中的awk和正则表达式的帮助下做到这一点，如下所示：

awk -F'#|\\^\\^\\(' '{for (i=2; i<NF; i++) printf "%s%s", gensub(/[^[:alnum:]_].*/,"",1,$i), (i<(NF-1) ? OFS : ORS) }' output.txt > ./LogErrors/Properties.txt

如何使用 Python 提取它们？ （我应该再次使用正则表达式，还是..？）

Answer 1

您当然可以使用正则表达式。 我会逐行阅读，抓取以'/* BUG:'开头的行，然后根据需要解析这些行。

import re

target = r'/* BUG:'
bugs = []
with open('logfile.txt', 'r') as infile, open('output.txt', 'w') as outfile:
    # loop through logfile
    for line in infile:
        if line.startswith(target):
            # add line to bug list and strip newlines
            bugs.append(line.strip())
            # or just do regex parsing here
            # create match pattern groups with parentheses, escape literal parentheses with '\'
            match = re.search(r'NamedIndividual\(([\w#]+)\)]\),DataHasValue\(DataProperty\(([\w#]+)\),\^\^\(([\w#]+),', line)
            # if matches are found
            if match:
                # loop through match groups, write to output
                for group in match.groups():
                    outfile.write('{} '.format(group))
                outfile.write('\n')

Python 内置了一个非常强大的正则表达式模块： re 模块

您可以搜索给定的模式，然后根据需要打印出匹配的组。

注意：原始字符串( r'xxxx' ) 允许您使用未转义的字符。

Answer 2

我尝试了以下方式并获取日志文件的特定行。

target =["BUGS"] # array with specific words

with open('demo.log', 'r') as infile, open('output.txt', 'w') as outfile:

    for line in infile:

        for phrase in target:

            if phrase in line:

                outfile.write('{} '.format(line))

这将输出包含目标中单词的行，并将输出写入 output.txt 文件中。

Python - 从日志文件中提取字符串并将它们写入另一个文件

问题描述

2 个解决方案

解决方案1
1 已采纳 2016-10-02 11:00:23

解决方案2
0 2021-06-15 04:29:19

Python - 从日志文件中提取字符串并将它们写入另一个文件

问题描述

2 个解决方案

解决方案1 1 已采纳 2016-10-02 11:00:23

解决方案2 0 2021-06-15 04:29:19

解决方案1
1 已采纳 2016-10-02 11:00:23

解决方案2
0 2021-06-15 04:29:19