Using Python with regex to match and mail

I am trying to write a script that reads a log file and matches all lines containing specific strings, i.e. today's date, a constant word, and the first number after the word. I am fairly new to Python and have several questions:

  1. How can I match different groups using regular expressions without having one variable for each regex match (var1 = date + word + 7-something, var2 = date + word + 9-something, and so on)?
  2. How can I print the matches by group, suffixed by another word, such as:

Group 1

Geneva:

2017-11-14: Word: 7712742 (1346134)

Group 2

Helsinki:

2017-11-02: Word: 9124741 (912478)

I would then like to write all of these matches and their suffixes to an e-mail. The way I have tried to tackle this so far is:

import time
import re
import glob

logpath = glob.glob('C:\\path\\to\\file*.log')[0]
readfile = open(logpath, "r")
daysdate = time.strftime("%Y-%m-%d")
regex = re.compile(daysdate + ".*[Word:] \d.+")
for line in readfile:
    req_id = regex.findall(line)
    for word in req_id:
        #this print shows all regex matches from the log file
        print(req_id,)
...
    mail.Body += "%s\n" % word 
    mail.Send()

Now, this prints all matches for today's date and sends a mail to the desired user(s), but I have yet to find a way to do it as described above, short of creating and writing to several files which are then read back. That feels very un-Pythonic (and like bad scripting practice overall).

Would one go about this using several loops, such as, for each match where the line is date + word + 7-something, print("Geneva:\\n", req_id7), or are there other, better ways to solve this? The output of such an example gives:

Geneva: ['2017-11-15 04.03.18: Word: 78271187 (783342)'] 
Geneva: ['2017-11-15 04.03.19: Word: 75612345 (755491)'] 
Geneva: ['2017-11-15 04.03.22: Word: 70145678 (798640)']

Where I would like the output to be:

Geneva: 
['2017-11-15 04.03.18: Word: 78271187 (783342)']
['2017-11-15 04.03.19: Word: 75612345 (755491)']
['2017-11-15 04.03.22: Word: 70145678 (798640)']

I might be wrong, but this could be one way to hack the problem:

import datetime
import re

date = datetime.datetime.now().strftime('%Y-%m-%d')

# Match today's date followed by a number starting with 7 or 9
regex = re.compile(date + r': Word: (7.*|9.*)')

and then run the normal find command.
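
To illustrate the idea of one pattern instead of one variable per region, here is a minimal sketch. It assumes the region can be decided from the first digit of the number; the REGIONS mapping and the function name group_by_region are hypothetical and would need adjusting to the real log format.

```python
import re
from collections import defaultdict

# Assumed mapping from the first digit of the number to a region name.
REGIONS = {"7": "Geneva", "9": "Geneva", "5": "Helsinki", "6": "Helsinki"}


def group_by_region(lines, date):
    """Collect lines for the given date, keyed by region name."""
    # One pattern for every region: the date, anything up to "Word: ",
    # then the number that follows.
    pattern = re.compile(re.escape(date) + r".*?Word: (\d+)")
    groups = defaultdict(list)
    for line in lines:
        match = pattern.search(line)
        if match is None:
            continue
        # The first digit of the captured number selects the region.
        region = REGIONS.get(match.group(1)[0])
        if region is not None:
            groups[region].append(line.strip())
    return dict(groups)
```

Passing the open log file as lines and time.strftime("%Y-%m-%d") as date should give a dictionary with one list of matching lines per region.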

You can try:

print(req_id.replace("Geneva: ",""),)

The solution I found to this problem was to create lists and then loop through each regex, appending matches to their respective list, i.e.:

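    gen = []  # Geneva matches
    hel = []  # Helsinki matches

    for line in readfile:
        # Numbers starting with 7 or 9 belong to Geneva
        for match in re.finditer(daysdate + r'.*Word: (7.{7}|9.{7})', line):
            gen.append(match.group(1))

        # Numbers starting with 5 or 6 belong to Helsinki
        for match in re.finditer(daysdate + r'.*Word: (5.{7}|6.{7})', line):
            hel.append(match.group(1))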

then using join to write out each region followed by its matches, as such:

mail.Body = "Helsinki:\n%s\n" % ",\n".join(map(str,hel)) + \
"Geneva:\n%s\n" % ",\n".join(map(str,gen))
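
If more regions are added later, the same join idea could be wrapped in a small helper. This is only a sketch: build_body is a hypothetical name, and it assumes the matches are held in a dictionary keyed by region rather than in separate lists.

```python
def build_body(matches_by_region):
    """Return one text block per region: a header line, then its matches."""
    sections = []
    for region, matches in matches_by_region.items():
        if not matches:
            continue  # skip regions with nothing to report
        # Header once, then each match on its own line.
        sections.append("%s:\n%s" % (region, "\n".join(map(str, matches))))
    if not sections:
        return ""
    return "\n\n".join(sections) + "\n"
```

The returned string can then be assigned to mail.Body, for example mail.Body = build_body({"Helsinki": hel, "Geneva": gen}).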
