
How to retrieve a block of string data using a Python regular expression

I have a Python string whose contents are shown below:

Using '/tmp' as temporary  location
GNU gdb (GDB) 8.3.0.20190826-git
Copyright (C) 2019 Free Software Foundation, Inc.
Type "show copying" and "show warranty" for details.

(gdb) #0  snp
#3 0x081fc9bc in main (argc=<optimized out>, argv=0xffffde44) at ../../../../../../.
(gdb) quit

jdebug version: 5.0.1
[File is compressed. This may take a moment...]

The only part I want to retrieve is everything from the first (gdb) line through (gdb) quit.

That is, the final output I am looking for is:

(gdb) #0  snp
#3 0x081fc9bc in main (argc=<optimized out>, argv=0xffffde44) at ../../../../../../.
(gdb) quit

Python code that is not working:

import re

with open('st.txt', 'r') as file:
    data = file.read()
print(re.search(r'(gdb).*(gdb) quit', data))

Any idea how I can extract this string with a correct regular expression?

The answer below makes sure that both (gdb) strings appear at the beginning of a line and that quit appears at the end of a line. The pattern is non-greedy: starting from the first (gdb) line, it stops at the first (gdb) quit line it reaches rather than the last one.

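Your initial regex did not escape the parentheses around gdb, so they were processed as a regex capture group rather than as literal characters in the text. It also did not pass re.DOTALL, so . could not match a newline and the match could not span several lines.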

import re

in_str = """Using '/tmp' as temporary  location
GNU gdb (GDB) 8.3.0.20190826-git
Copyright (C) 2019 Free Software Foundation, Inc.
Type "show copying" and "show warranty" for details.

(gdb) #0  snp
#3 0x081fc9bc in main (argc=<optimized out>, argv=0xffffde44) at ../../../../../../.
(gdb) quit

jdebug version: 5.0.1
[File is compressed. This may take a moment...]"""

m = re.search(r'^\(gdb\).*?^\(gdb\) quit$', in_str, re.DOTALL | re.MULTILINE)
if m:
    print(m.group(0))
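
To make the two problems with the original pattern concrete, here is a minimal sketch that tries three variants against a shortened stand-in for the gdb output. The `sample` string and the variable names are illustrative assumptions, not part of the original question.

```python
import re

# Shortened, hypothetical stand-in for the gdb output in the question.
sample = (
    "banner line\n"
    "\n"
    "(gdb) #0  snp\n"
    "#3 0x081fc9bc in main ()\n"
    "(gdb) quit\n"
    "\n"
    "jdebug version: 5.0.1\n"
)

# Unescaped parentheses are capture groups, so this pattern looks for the
# literal text "gdb quit", which never occurs (the text has "gdb) quit").
unescaped = re.search(r'(gdb).*(gdb) quit', sample, re.DOTALL)

# Escaped parentheses but no re.DOTALL: "." stops at a newline, so both
# markers would have to sit on the same line.
no_dotall = re.search(r'\(gdb\).*\(gdb\) quit', sample)

# Escaped parentheses plus re.DOTALL: the match can span several lines.
fixed = re.search(r'\(gdb\).*?\(gdb\) quit', sample, re.DOTALL)

print(unescaped)       # None
print(no_dotall)       # None
print(fixed.group(0))  # the block from "(gdb) #0" through "(gdb) quit"
```

Note that re.search returns a match object (or None), so the matched text has to be read with .group(0) rather than by printing the result directly.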

Here is a solution without regex:

text = """Using '/tmp' as temporary  location
GNU gdb (GDB) 8.3.0.20190826-git
Copyright (C) 2019 Free Software Foundation, Inc.
Type "show copying" and "show warranty" for details.

(gdb) #0  snp
#3 0x081fc9bc in main (argc=<optimized out>, argv=0xffffde44) at ../../../../../../.
(gdb) quit

jdebug version: 5.0.1
[File is compressed. This may take a moment...]"""

s, e = '(gdb)', '(gdb) quit'

text[text.index(s) : text.rindex(e) + len(e)]

(gdb) #0  snp
#3 0x081fc9bc in main (argc=<optimized out>, argv=0xffffde44) at ../../../../../../.
(gdb) quit
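
Two details of the slicing approach may be worth keeping in mind: str.index and str.rindex raise ValueError when a marker is missing, and rindex picks the last (gdb) quit, whereas the non-greedy regex stops at the first one. The sketch below wraps the same idea in a hypothetical helper, `slice_between` (the name and the None-on-failure behaviour are assumptions for illustration), using str.find and str.rfind so that missing markers can be handled without an exception.

```python
def slice_between(text, start, end):
    """Return text from the first `start` through the last `end`, or None."""
    i = text.find(start)   # -1 if the start marker is absent
    if i == -1:
        return None
    j = text.rfind(end)    # -1 if the end marker is absent
    if j < i:
        return None
    return text[i:j + len(end)]


# Hypothetical input with two gdb sessions, to show the "last end marker" behaviour.
two_sessions = "(gdb) bt\n(gdb) quit\nnoise\n(gdb) bt\n(gdb) quit\n"

print(slice_between(two_sessions, '(gdb)', '(gdb) quit'))  # spans both sessions
print(slice_between("no markers here", '(gdb)', '(gdb) quit'))  # None
```

If only the first session is wanted, searching for the end marker with text.find(end, i) instead of rfind would mirror the non-greedy regex more closely.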

Timing info:

text[text.index(s) : text.rindex(e) + len(e)]

636 ns ± 27.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

re.search(r'^\(gdb\).*?^\(gdb\) quit$', text, re.DOTALL | re.MULTILINE)

6.91 µs ± 360 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
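
The figures above look like IPython %timeit output. For anyone who wants to repeat the comparison outside IPython, here is a rough sketch using the standard timeit module. The shortened `text`, the function names, and the loop count `n` are assumptions, and the regex is precompiled here, so absolute numbers will differ from those quoted above and between machines.

```python
import re
import timeit

# Shortened, hypothetical stand-in for the gdb output.
text = (
    "banner line\n\n"
    "(gdb) #0  snp\n"
    "#3 0x081fc9bc in main ()\n"
    "(gdb) quit\n\n"
    "jdebug version: 5.0.1\n"
)
s, e = '(gdb)', '(gdb) quit'
pattern = re.compile(r'^\(gdb\).*?^\(gdb\) quit$', re.DOTALL | re.MULTILINE)


def by_slice():
    # Plain string search: first start marker through last end marker.
    return text[text.index(s):text.rindex(e) + len(e)]


def by_regex():
    # Regex search with the precompiled pattern.
    return pattern.search(text).group(0)


# Both approaches should extract the same block on this input.
assert by_slice() == by_regex()

n = 10_000
slice_t = timeit.timeit(by_slice, number=n)
regex_t = timeit.timeit(by_regex, number=n)
print(f"slice: {slice_t / n * 1e9:.0f} ns per call")
print(f"regex: {regex_t / n * 1e9:.0f} ns per call")
```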
