I am trying to write a pattern with re that finds a sequence of digits followed by certain keywords.
string =" 12390 total income stated in red followed by 567 total income stated in blue."
pattern = re.match("\s*\d{1,2}\s* total income",string)
I tried this pattern, but it does not work. In the end I want to get these results: "12390 total income" and "567 total income".
You need to use re.findall and change \d{1,2} to \d+ (one or more digit characters), since \d{1,2} matches a minimum of 1 and a maximum of 2 digits only.
result = re.findall(r"\d+ total income",string)
Note that match only tries to match at the beginning of the string, whereas findall scans the whole string and returns every non-overlapping match.
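To make the difference concrete, here is a small sketch that runs both calls on the string from the question. The variable names first and result are only illustrative.

```python
import re

string = " 12390 total income stated in red followed by 567 total income stated in blue."

# re.match is anchored at the start of the string; \d{1,2} can consume
# at most two digits of "12390", so the rest of the pattern cannot match.
first = re.match(r"\s*\d{1,2}\s* total income", string)

# re.findall scans the whole string and returns all non-overlapping matches.
result = re.findall(r"\d+ total income", string)

print(first)   # None
print(result)  # ['12390 total income', '567 total income']
```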
If the number of spaces between the number and "total income" can vary (say 0, 1 or 2), allow for it with \s* inside a non-capturing group.
Say string is
string = '12390total income stated in red followed by 567 total income stated in blue.'
Then try as below
myresult = re.findall(r"\d+(?:\s*?total income)",string)
Extracts
['12390total income', '567 total income']
Then use replace to remove any extra spaces from each match.
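As an alternative to cleaning up each match afterwards, one possible approach (not the method from the answer above, just a sketch) is to capture only the digits and rebuild the text with a single space. The sample string with three spaces and the names numbers and normalized are assumptions made for illustration.

```python
import re

# Hypothetical input: no space after the first number, three after the second.
string = "12390total income stated in red followed by 567   total income stated in blue."

# With one capturing group, re.findall returns only the captured digits.
numbers = re.findall(r"(\d+)\s*total income", string)

# Rebuild each result with exactly one space before the key words.
normalized = [f"{n} total income" for n in numbers]

print(normalized)  # ['12390 total income', '567 total income']
```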