Extract a string between two sets of patterns in Python

Question

I am trying to extract a substring between two sets of patterns using re.search().

On the left, there can be either 0x or 0X, and on the right there can be either U, a space, or \n. The result should not contain the boundary patterns. For example, 0x1234U should result in 1234.

I tried the following search pattern: (0x|0X)(.*)(U| |\n), but the result includes the left and right patterns.

What would be the correct search pattern?

Answer 1

You could use a combination of lookbehind and lookahead with a non-greedy match pattern in between:

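import re

# Lookbehind for 0x or 0X, lazy match, lookahead for U or whitespace.
# \s already matches \n, so the newline needs no separate entry in the class.
pattern = r"(?<=0[xX])(.*?)(?=[U\s])"

re.findall(pattern, "---0x1234U...0X456a ")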

['1234', '456a']
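Since the question asks about re.search(), here is a minimal sketch comparing the original attempt with the lookaround pattern. The sample string and the variable names are assumptions made for illustration only.

```python
import re

text = "value = 0x1234U"  # hypothetical sample input

# Original attempt: the boundaries are part of the overall match (group 0),
# although the middle group (group 2) already holds the wanted substring.
attempt = re.search(r"(0x|0X)(.*)(U| |\n)", text)
print(attempt.group(0))  # 0x1234U
print(attempt.group(2))  # 1234

# Lookaround version: the boundaries are asserted but not consumed,
# so the overall match is the substring itself.
lookaround = re.search(r"(?<=0[xX])(.*?)(?=[U\s])", text)
print(lookaround.group(0))  # 1234
```

With lookarounds the whole match can be used directly, which is convenient when the result is read with .group() or .group(0).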

Answer 2

You could also use a single capture group and read it with .group(1):

0[xX](.*?)[U\s]

The pattern matches:

0[xX] Match either 0x or 0X
(.*?) Capture in group 1 any character except a newline, as few times as possible
[U\s] Match either U or a whitespace character (which also matches a newline)
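The effect of the lazy quantifier is easiest to see on input with more than one value. The following is a small sketch; the sample string and variable names are assumptions for illustration.

```python
import re

s = "0x12U 0X34U"  # hypothetical input with two hex values

# Lazy: stops at the first U or whitespace after the prefix.
lazy = re.search(r"0[xX](.*?)[U\s]", s).group(1)
print(lazy)  # 12

# Greedy: runs to the last possible boundary and swallows the second value.
greedy = re.search(r"0[xX](.*)[U\s]", s).group(1)
print(greedy)  # 12U 0X34

# With a single capture group, re.findall returns the group for each match.
all_values = re.findall(r"0[xX](.*?)[U\s]", s)
print(all_values)  # ['12', '34']
```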


import re

s = r"0x1234U"
pattern = r"0[xX](.*?)[U\s]"

m = re.search(pattern, s)
if m:
    print(m.group(1))

Output

1234

2 answers

solution1
1 2021-02-21 19:50:02

solution2
1 ACCEPTED 2021-02-22 13:15:51
