
Python regular expressions over several rows (multiline), depending on row 1

I have a log (txt file) with the following structure.

At 2020-07-15 14:05:18 - Markers detected in this frame : 3 | 6 | 
ID :6 out of compartment G2A44

or

At 2020-07-15 14:05:47 - Markers detected in this frame : 3 | 0 | 9 | 
ID :9 out of compartment G2A13
ID :9 out of compartment G2A45

See regex.

I need the following information:

  1. 2020-07-15 (group1)
  2. 14:05:47 (group2)
  3. ID:9 (group4)
  4. G2A13...

When there is only one line below At 2020-07-15 14:05:47 - Markers detected in this frame : 3 | 0 | 9 | then everything is captured by the expression expr = 'At ([0-9]{4}-[0-9]{2}-[0-9]{2}) ([0-9]{2}:[0-9]{2}:[0-9]{2}) - Markers detected in this frame : ([0-9]{1,} .{1,})\s(ID..[0-9])\sout of compartment ([\w]{4,})'.

But how can I capture a second or third line with the same group matching in the regex?

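import re

# Raw string so that \s and \w reach the regex engine unchanged
expr = r'At ([0-9]{4}-[0-9]{2}-[0-9]{2}) ([0-9]{2}:[0-9]{2}:[0-9]{2}) - Markers detected in this frame : ([0-9]{1,} .{1,})\s(ID..[0-9])\sout of compartment ([\w]{4,})'
f = 'XX.txt'
with open(f, 'r') as file:  # the file is closed automatically
    text = file.read()
# Only the first "ID" line after each "At ..." header is captured
m = re.findall(expr, text, re.MULTILINE)
print(m)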

You're asking for a parser. You need a state machine.

Test the target line based on the header expression, and store some values. If it doesn't pass that test, then test the line based on the next expression, and do something with the new matches and the stored values.

Do not expect to get all the lines at once. This is a two-phase job.
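
A minimal sketch of that two-phase approach, assuming the log looks exactly like the samples in the question. The names HEADER, DETAIL, parse_log and sample are made up for illustration, and the patterns are simplified versions of the one in the question, not the only way to write them.

```python
import re

# Header line: captures date, time, and the raw marker list.
HEADER = re.compile(
    r"At (\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) - "
    r"Markers detected in this frame\s*:\s*(.*)"
)
# Detail line: captures the marker ID and the compartment name.
DETAIL = re.compile(r"ID\s*:\s*(\d+) out of compartment (\w+)")


def parse_log(lines):
    """Return one (date, time, marker_id, compartment) tuple per detail line."""
    records = []
    current = None  # state: date and time of the most recent header
    for line in lines:
        header = HEADER.match(line)
        if header:
            # Phase 1: remember the header values for the lines that follow.
            current = header.group(1, 2)
            continue
        detail = DETAIL.match(line)
        if detail and current is not None:
            # Phase 2: combine the stored header values with this line.
            records.append((*current, *detail.groups()))
    return records


# Sample data copied from the question (hypothetical stand-in for XX.txt).
sample = """\
At 2020-07-15 14:05:18 - Markers detected in this frame : 3 | 6 |
ID :6 out of compartment G2A44
At 2020-07-15 14:05:47 - Markers detected in this frame : 3 | 0 | 9 |
ID :9 out of compartment G2A13
ID :9 out of compartment G2A45
"""

for record in parse_log(sample.splitlines()):
    print(record)
```

Since parse_log only iterates over lines, an open file object can be passed to it directly instead of sample.splitlines(). Each "ID" line becomes its own record carrying the date and time of the header above it, so it does not matter how many "ID" lines follow a header.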
