
Regex strictly match two lines with different endings

I'm trying to match the following text of a logfile:

2019-05-22 03:40:01 INFO  ReporteClaro:194 - Termino de procesar archivo

2019-05-22 03:40:01 INFO  ReporteClaro:208 - Termino de procesar Transaction Report

Both lines contain the same words except for the endings ("archivo" and "Transaction Report").

I've tried this:

[\d]+-[\d]+-[\d]+ [\d]+:[\d]+:[\d]+ INFO  ReporteClaro:[\d]+ - Termino de procesar (archivo|Transaction Report)

But that is an either/or match because of the | operator: it will match the first line or the second line, whereas I strictly need the regex to match both of them. I thought of something like this, but obviously it will not run:

[\d]+-[\d]+-[\d]+ [\d]+:[\d]+:[\d]+ INFO  ReporteClaro:[\d]+ - Termino de procesar (archivo&Transaction Report)

PS: I've also tried another solution using \n, but is there any way to achieve the same result without repeating the whole pattern?

[\d]+-[\d]+-[\d]+ [\d]+:[\d]+:[\d]+ INFO  ReporteClaro:[\d]+ - Termino de procesar archivo\n

[\d]+-[\d]+-[\d]+ [\d]+:[\d]+:[\d]+ INFO  ReporteClaro:[\d]+ - Termino de procesar Transaction Report

If "archivo" and "Transaction Report" are the only values you expect after "Termino de procesar" (i.e. there is nothing like "Termino de procesar Something Else"), you could simply do the following:

r"^.+Termino de procesar.+$"gm


Effectively, this matches each whole line, from beginning to end, but only if the line contains the phrase "Termino de procesar" with at least one character before and after it.
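As a rough sketch of how that could look in Python (assuming Python's re module; the g flag corresponds roughly to findall and the m flag to re.MULTILINE). The middle log line and the names LOG and PHRASE_LINE are made up for illustration only:

```python
import re

# Sample text: the first and last lines come from the question,
# the middle line is a hypothetical entry that should NOT match.
LOG = """\
2019-05-22 03:40:01 INFO  ReporteClaro:194 - Termino de procesar archivo
2019-05-22 03:40:02 INFO  ReporteClaro:200 - Iniciando Transaction Report
2019-05-22 03:40:01 INFO  ReporteClaro:208 - Termino de procesar Transaction Report
"""

# re.MULTILINE lets ^ and $ anchor at every line, not just the whole string.
PHRASE_LINE = re.compile(r"^.+Termino de procesar.+$", re.MULTILINE)

# findall returns every non-overlapping match, like the "g" flag elsewhere.
matches = PHRASE_LINE.findall(LOG)
for line in matches:
    print(line)
```

Since . does not match a newline by default, each match stays within a single line.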

If there are other log entries that contain "Termino de procesar" followed by something you don't want, you can use the following:

r"^.+Termino de procesar archivo.*$|^.+Termino de procesar Transaction Report.*$"gm

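For what it's worth, the two alternatives only differ in their ending, so the alternation can be pushed inside a non-capturing group. Below is a small Python sketch comparing both spellings. The "Something Else" line and the names LONG_FORM and SHORT_FORM are illustrative assumptions, not part of the original log:

```python
import re

# First and last lines are from the question; the middle one is a
# hypothetical unwanted entry that shares the same phrase.
LOG = """\
2019-05-22 03:40:01 INFO  ReporteClaro:194 - Termino de procesar archivo
2019-05-22 03:40:01 INFO  ReporteClaro:201 - Termino de procesar Something Else
2019-05-22 03:40:01 INFO  ReporteClaro:208 - Termino de procesar Transaction Report
"""

# The pattern exactly as written above, with the whole line repeated.
LONG_FORM = re.compile(
    r"^.+Termino de procesar archivo.*$|^.+Termino de procesar Transaction Report.*$",
    re.MULTILINE,
)

# Same idea, but only the differing endings are alternated.
SHORT_FORM = re.compile(
    r"^.+Termino de procesar (?:archivo|Transaction Report).*$",
    re.MULTILINE,
)

long_hits = LONG_FORM.findall(LOG)
short_hits = SHORT_FORM.findall(LOG)
print(long_hits == short_hits)
```

On this sample both patterns pick the same two lines and skip the "Something Else" entry.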

I find that simplicity is usually the best solution. There is no need to explicitly match the datetime or "ReporteClaro"; simply use the catch-all .+ before the phrase to grab them. The regex is easier to understand that way, imo.

Edit: You need the g and m modifiers unless you're reading the file line by line.
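If you do read line by line, no modifiers are needed, and it also becomes easy to check that both endings really showed up (the "strictly both" part of the question). This is only a sketch in Python; the helper endings_seen and the names LINE and REQUIRED are hypothetical:

```python
import re

# One line at a time, so no MULTILINE flag; \s* tolerates a trailing newline.
LINE = re.compile(r"^.+Termino de procesar (archivo|Transaction Report)\s*$")

# Both endings must be present for the check to pass.
REQUIRED = {"archivo", "Transaction Report"}


def endings_seen(lines):
    """Return the set of wanted endings found among the given log lines."""
    seen = set()
    for line in lines:
        m = LINE.match(line)
        if m:
            seen.add(m.group(1))
    return seen


sample = [
    "2019-05-22 03:40:01 INFO  ReporteClaro:194 - Termino de procesar archivo\n",
    "2019-05-22 03:40:01 INFO  ReporteClaro:208 - Termino de procesar Transaction Report\n",
]

both_present = endings_seen(sample) == REQUIRED
print(both_present)
```

In practice `sample` could just as well be an open file object, since iterating over a file yields its lines.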

This will match them as one block, together with everything in between.

(?s)[\d]+-[\d]+-[\d]+[ ][\d]+:[\d]+:[\d]+[ ]INFO[ ]+ReporteClaro:[\d]+[ ]-[ ]Termino[ ]de[ ]procesar[ ](?:archivo|Transaction[ ]Report)(?:.*?[\d]+-[\d]+-[\d]+[ ][\d]+:[\d]+:[\d]+[ ]INFO[ ]+ReporteClaro:[\d]+[ ]-[ ]Termino[ ]de[ ]procesar[ ](?:archivo|Transaction[ ]Report))*

Readable version

 (?s)

 [\d]+ - [\d]+ - [\d]+ [ ] [\d]+ : [\d]+ : [\d]+ [ ] INFO [ ]+ ReporteClaro:
 [\d]+ [ ] - [ ] Termino [ ] de [ ] procesar [ ] 
 (?: archivo | Transaction [ ] Report )

 (?:
      .*? [\d]+ - [\d]+ - [\d]+ [ ] [\d]+ : [\d]+ : [\d]+ [ ] INFO [ ]+ ReporteClaro:
      [\d]+ [ ] - [ ] Termino [ ] de [ ] procesar [ ] 
      (?: archivo | Transaction [ ] Report )
 )*
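To avoid typing the entry pattern twice, it could be built once and reused. Here is a hedged Python sketch of that idea. The names ENTRY and BLOCK are made up, and [ ]+ after INFO is an assumption based on the two spaces shown in the sample log:

```python
import re

# One log entry, written once. [ ]+ after INFO allows for the padding
# that appears as two spaces in the question's log lines.
ENTRY = (
    r"\d+-\d+-\d+[ ]\d+:\d+:\d+[ ]INFO[ ]+ReporteClaro:\d+[ ]-[ ]"
    r"Termino[ ]de[ ]procesar[ ](?:archivo|Transaction[ ]Report)"
)

# (?s) lets . cross newlines: first entry, then any later entries,
# each preceded by as little intervening text as possible.
BLOCK = re.compile(rf"(?s){ENTRY}(?:.*?{ENTRY})*")

LOG = (
    "2019-05-22 03:40:01 INFO  ReporteClaro:194 - Termino de procesar archivo\n"
    "\n"
    "2019-05-22 03:40:01 INFO  ReporteClaro:208 - Termino de procesar Transaction Report\n"
)

m = BLOCK.search(LOG)
block = m.group(0) if m else None
print(block)
```

Note that because the trailing group uses *, a single entry on its own still matches; replacing * with + would require at least two entries in the block.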
