I'm trying to write a Python script to assist with parsing log file to search for timestamps based on a unique ID . File is very long, and tricks I've tried would select everything above key-word line. Ideally I'd like to have a keyword (ID) and a matching regex appended to it for maximum clarity; this I will try to achieve using Python. But could I ask if somebody could help me improve on regex expression for the following code. Regex attempt, that select everything above the _id :
((.*\n){2}).*8355371640847
And the code in question:
...
...
..
..
_ommited everythig: *ignore everything beyond*
createTime: 2020-06-03T16:01:35.812Z --only this line to be selected
employee:
_id: 835537164084782 -- ID that is used as a reference to return 'createTime' two lines above
code: null
...
...
...
try this (([^\n] \n[^\n] \n)).*8355371640847
Good Morning, I do not understand for that I definitely tried it multiple times. But the code:
((.*\n){2}).*8355371640847
actualy does the job; it does select only the line which is two lines above the search string. Yesterday same string selected everything , but it maybe have had to do something how I copy/pasted the DB dump.
Thank you.
Hope you are trying to get this
a = """ _ommited everythig: *ignore everything beyond*
createTime: 2020-06-03T16:01:35.812Z --only this line to be selected
employee:
_id: 835537164084782 -- ID that is used as a reference to return 'createTime' two lines above
code: null """
x = re.compile('([^\n]*\n[^\n]*\n)[^\n]*8355371640847')
print (x.findall(a))
x = re.compile('([^\n]*\n)[^\n]*\n[^\n]*8355371640847')
print (x.findall(a))
output is: [' createTime: 2020-06-03T16:01:35.812Z --only this line to be selected\n employee:\n'] [' createTime: 2020-06-03T16:01:35.812Z --only this line to be selected\n']
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.