I want to construct a regular expression for this task with Python 3.7.5. The input texts look like the following:
alkdj flajf
123 adlf ja;ld fj 999
423 234 2359 kalfji lkja;lkd999
My goal is to retrieve all the numbers in leading positions, each followed by a space character, and get lists like the following:
[]
[123]
[423, 234, 2359]
Any advice is appreciated!
import re
data = '''
alkdj flajf
123 adlf ja;ld fj 999
423 234 2359 kalfji lkja;lkd999
'''
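# Capture a run of digits and spaces that is followed by a space and a word character.
# Note: the pattern is not anchored, so a digit run later in a line could also match.
pattern = re.compile(r'([0-9 ]+) \w.*?')
print(pattern.findall(data))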
Outputs:
['123', '423 234 2359']
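The result above keeps each run as one string and skips lines without leading numbers. To get one list of integers per line, as in the question, one possible sketch with the built-in re module follows. The helper name leading_numbers is made up for illustration, and it assumes every leading number is followed by exactly one space:

```python
import re

def leading_numbers(text):
    """Return one list of ints per line of `text` (hypothetical helper)."""
    # One or more "digits + single space" groups.
    run = re.compile(r'(?:\d+ )+')
    result = []
    for line in text.strip().splitlines():
        m = run.match(line)  # match() only tries at the start of the line
        result.append([int(n) for n in m.group().split()] if m else [])
    return result

data = '''
alkdj flajf
123 adlf ja;ld fj 999
423 234 2359 kalfji lkja;lkd999
'''
print(leading_numbers(data))
# [[], [123], [423, 234, 2359]]
```

Because match() is anchored at the start of each line, trailing numbers such as 999 are ignored, and a line without leading numbers yields an empty list.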
If you want to capture numbers separately, we could use the fancy \G continue operator:
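import regex as re  # third-party "regex" module (pip install regex), not the built-in re

# Branch 1: digits at the start of a line.
# Branch 2: a space plus digits, starting exactly where the previous match ended (\G).
rgx = r"(?|^(\d+)|\G \K(\d+))"
test_str = ("alkdj flajf\n"
            "123 adlf ja;ld fj 999\n"
            "423 234 2359 kalfji lkja;lkd999")
matches = re.finditer(rgx, test_str, re.MULTILINE)
for match in matches:
    print(match.group(1))  # each leading number, one per line of output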
Demo (the demo uses the PCRE flavor; \G is not supported by Python's built-in re module, which is why I import the alternative regex module)
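I also use a branch reset group (?|...), so that both alternatives capture into group 1, and the \K discard operator, which drops the text matched so far from the reported match, to make things work.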
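If installing the regex module is not an option, the "continue where the last match ended" idea can be approximated with the built-in re module, since Pattern.match(string, pos) only succeeds exactly at pos. This is a rough sketch rather than an equivalent of the pattern above; the function name is hypothetical, and it assumes the input has already been split into lines:

```python
import re

NUMBER = re.compile(r'(\d+) ')  # one number followed by a single space

def contiguous_leading_numbers(line):
    """Collect numbers from the start of `line` while matches stay adjacent."""
    numbers = []
    pos = 0
    while True:
        # match(line, pos) must succeed exactly at `pos`, much like \G.
        m = NUMBER.match(line, pos)
        if m is None:
            break
        numbers.append(int(m.group(1)))
        pos = m.end()  # continue where the previous match stopped
    return numbers

lines = ["alkdj flajf",
         "123 adlf ja;ld fj 999",
         "423 234 2359 kalfji lkja;lkd999"]
for line in lines:
    print(contiguous_leading_numbers(line))
```

The loop stops at the first position where a number plus space does not follow directly, so numbers later in the line are never picked up.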