Reading multiple substrings from lines in a file

Question

Basically, I am using a Python script to generate a report from an Apache error_log file. A sample of what I'm working with:

[Wed Apr 13 18:33:42.521106 2016] [core:notice] [pid 11690] SELinux policy enabled; httpd running as context system_u:system_r:httpd_t:s0
[Wed Apr 13 18:33:42.543989 2016] [suexec:notice] [pid 11690] AH01232: suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)

The end result I'm trying to get would look something like this:

core:notice - SELinux policy enabled; httpd running as context system_u:system_r:httpd_t:s0
suexec:notice - AH01232: suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)

That is, the error type followed by the trailing text. I then need to write this formatted text to a new file.

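I've been trying to do this with regular expressions, but it has been years since I last used Python, and I have never used regular expressions before. The most I've managed so far is isolating the first (date) part; I can't figure out how to get the subsequent bracketed substrings and the trailing text. Any and all help would be much appreciated!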

Answer 1

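Since your data consists of exactly four fields, and every field except the last is neatly enclosed in square brackets, you can exploit that layout to do the task without regex, like this: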

# Sample lines from the Apache error_log
texts = ['[Wed Apr 13 18:33:42.521106 2016] [core:notice] [pid 11690] SELinux policy enabled; httpd running as context system_u:system_r:httpd_t:s0',
         '[Wed Apr 13 18:33:42.543989 2016] [suexec:notice] [pid 11690] AH01232: suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)']
for text in texts:
    # Drop every '[' and split on ']' to get the four fields
    words = text.replace('[', '').split(']')
    # words[1] is the error type, words[3] is the trailing message
    newWords = words[1] + ' -' + words[3]
    print(newWords)

Result (each line keeps the leading space left over from the split):

 core:notice - SELinux policy enabled; httpd running as context system_u:system_r:httpd_t:s0
 suexec:notice - AH01232: suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)

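The idea is to first replace every opening square bracket with an empty string, then split the text using the closing square bracket as the separator (which removes it as well):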

words = text.replace('[','').split(']')
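To make that step concrete, here is a small sketch that shows the list produced for the first sample line. The variable name `text` is reused from the snippet above; nothing else is assumed.

```python
text = ('[Wed Apr 13 18:33:42.521106 2016] [core:notice] [pid 11690] '
        'SELinux policy enabled; httpd running as context system_u:system_r:httpd_t:s0')

# Remove every '[' and cut the line at each ']'
words = text.replace('[', '').split(']')

# Show each field with its index; repr() makes the leading spaces visible
for index, field in enumerate(words):
    print(index, repr(field))
# 0 'Wed Apr 13 18:33:42.521106 2016'
# 1 ' core:notice'
# 2 ' pid 11690'
# 3 ' SELinux policy enabled; httpd running as context system_u:system_r:httpd_t:s0'
```

Fields 1 to 3 start with the space that followed each closing bracket, which is why the combined string later begins with a space.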

Then you just need to combine the fields you want into the new string:

newWords = words[1] + ' -' + words[3]
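If the leading space matters, or if a message might itself contain square brackets, a slightly more defensive variant could look like the sketch below. The helper name `format_line` is hypothetical, and the sketch assumes every line follows the same `[date] [type] [pid] message` layout as the samples.

```python
def format_line(line):
    """Return 'type - message' for one log line, or None if it does not fit."""
    # Split on the first three ']' only, so brackets inside the message survive
    parts = line.split(']', 3)
    if len(parts) < 4:
        return None  # fewer than three bracketed fields: not the expected layout
    # parts[1] looks like ' [core:notice': trim the space, then the '['
    error_type = parts[1].strip().lstrip('[')
    message = parts[3].strip()
    return error_type + ' - ' + message


sample = ('[Wed Apr 13 18:33:42.543989 2016] [suexec:notice] [pid 11690] '
          'AH01232: suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)')
print(format_line(sample))
# suexec:notice - AH01232: suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
```

Limiting the split to three cuts is a design choice: everything after the third `]` is treated as the message, whatever characters it contains.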

And you're done.
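Since the question mentions regular expressions and writing the result to a new file, here is a minimal sketch of both, offered as one possible approach rather than the method used above. The pattern `LINE_RE` and the function name `write_report` are hypothetical, and the pattern assumes the same three bracketed fields followed by free text.

```python
import re

# Assumed layout: [timestamp] [type] [pid ...] message
# Group 1 captures the error type, group 2 the trailing message.
LINE_RE = re.compile(r'^\[[^\]]*\]\s*\[([^\]]*)\]\s*\[[^\]]*\]\s*(.*)$')


def write_report(src_path, dst_path):
    """Write 'type - message' to dst_path for each matching line of src_path.

    Returns the number of lines written; lines that do not match are skipped.
    """
    count = 0
    with open(src_path) as src, open(dst_path, 'w') as dst:
        for line in src:
            match = LINE_RE.match(line.rstrip('\n'))
            if match is None:
                continue
            dst.write('{} - {}\n'.format(match.group(1), match.group(2)))
            count += 1
    return count
```

Calling `write_report('error_log', 'report.txt')` (both file names are placeholders) would read the log line by line and produce one formatted line per matching entry.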

Answer 1
2, accepted, 2016-04-14 03:45:57
