
Python: how to match a regular expression pattern

I want to parse a log and find the line shown below using a regex pattern, e.g.

r"(  *)"C-R-A-R(  *).( *)"

This doesn't work. How should I write the regex pattern? The key is to find C-R-A-R followed by some numbers (which should be captured as a substring) separated by spaces. Note that the values are separated by several spaces, not just one.

[0]:      C-R-A-R              4                 1              85.4        86.1        90.8        76.1        92.3          0.000       0.000" 

If we consider this test data:

text = """C-R-A-R              4                 1              85.4        86.1        90.8        76.1        92.3          0.000       0.000
B-D-D-D 0                    0  1 1 2
"""

You want to extract the first line, but not the second, because the second doesn't start with C-R-A-R (have I understood correctly?).

Try this regular expression:

import re

pattern = re.compile(r'( *)(C-R-A-R)(?P<digits>[ \d\.]+)')
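To make the pieces of the pattern easier to read, here is a sketch of the same expression written with re.VERBOSE, which allows comments inside the pattern. The name pattern_verbose and the shortened sample line are only for illustration. Note that in verbose mode a bare space is ignored, so the leading-space group is written as [ ]* here.

```python
import re

# Same idea as the pattern above, annotated piece by piece.
# pattern_verbose is an illustrative name, not part of the original answer.
pattern_verbose = re.compile(r"""
    ([ ]*)                 # optional leading spaces (a bare space is ignored in VERBOSE mode)
    (C-R-A-R)              # the literal marker to look for
    (?P<digits>[ \d.]+)    # named group: the run of digits, dots and spaces after the marker
""", re.VERBOSE)

# Shortened sample line, made up for this example.
sample = "C-R-A-R      4     1    85.4"
m = pattern_verbose.search(sample)
if m is not None:
    print(m.group('digits').strip())
```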

Apply the pattern on each line:

matches = [pattern.search(line) for line in text.split('\n')]

Keep only lines that have matched:

matched_lines = [m for m in matches if m is not None]

You get:

print(matched_lines)
>>> [<re.Match object; span=(0, 133), match='C-R-A-R              4                 1         >]

You can then extract the numeric part of the string for further processing, if needed, using the group name digits (defined with the syntax (?P<digits>...)):

digits = matched_lines[0].group('digits').strip()

print(digits)
>>> 4                 1              85.4        86.1        90.8        76.1        92.3          0.000       0.000
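If the goal is to work with the individual numbers, one possible next step is to split the captured substring on whitespace and convert each token to a float. This is a minimal sketch: the shortened log line and the name values are made up for illustration, and it assumes every token is a well-formed number (a stray "." on its own would raise ValueError).

```python
import re

pattern = re.compile(r'( *)(C-R-A-R)(?P<digits>[ \d\.]+)')

# Shortened, made-up log line; the "[0]:" prefix mirrors the one in the question.
line = "[0]:      C-R-A-R      4     1    85.4   0.000"

m = pattern.search(line)  # search() finds the marker even after the "[0]:" prefix
values = []
if m is not None:
    # str.split() with no argument collapses any run of spaces
    values = [float(tok) for tok in m.group('digits').split()]

print(values)  # [4.0, 1.0, 85.4, 0.0]
```

Because split() is called without an argument, it does not matter how many spaces separate the columns.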


