I'm trying to catch all numbers from a string using Python regex. By numbers I mean integers and floats (using , or . as the decimal separator). I managed to get it done using this regex: ([0-9]+[\,|\.][0-9]+|[0-9]+)
But I have a problem: I need it to match big numbers with spaces in them, such as 20 000 or 5 000 000. These numbers can be very big, with a lot of spaces; I don't know how many. But there will always be exactly 1 space between digit groups, no more. For example, 20 30 is 2 different numbers.
I guess I will need some sort of recursive pattern (?R), but I don't know how to use it.
Can someone help? :)
You can use a pattern like
(?<!\d)(?<!\d[.,])\d{1,3}(?:\s\d{3})*(?:[,.]\d+)?
See the regex demo.
Details
(?<!\d)(?<!\d[.,]) - no digit, or digit plus a comma or period, is allowed immediately to the left of the current location
\d{1,3} - one, two or three digits
(?:\s\d{3})* - zero or more sequences of a whitespace character and three digits
(?:[,.]\d+)? - an optional occurrence of a , or . followed by one or more digits.
In Python, you can use re.findall:
import re
text = "5 000, 6 123 456,345 and 6 123 456.345... I mean 20 000 or 5 000 000. For example: 20 30"
print( re.findall(r'(?<!\d)(?<!\d[.,])\d{1,3}(?:\s\d{3})*(?:[,.]\d+)?', text) )
## => ['5 000', '6 123 456,345', '6 123 456.345', '20 000', '5 000 000', '20', '30']
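The strings returned by re.findall still contain the grouping spaces and, possibly, a comma as the decimal separator. A minimal sketch of turning them into Python numbers, assuming that , and . always mark the decimal part and never separate thousands; the helper name to_number is made up for this illustration:

```python
import re

# Same pattern as above, compiled once for reuse
NUMBER_RE = re.compile(r'(?<!\d)(?<!\d[.,])\d{1,3}(?:\s\d{3})*(?:[,.]\d+)?')

def to_number(match_text):
    """Hypothetical helper: convert e.g. '6 123 456,345' to an int or float."""
    # Drop the whitespace used as a digit-group separator
    compact = re.sub(r'\s', '', match_text)
    # Assumption: both ',' and '.' denote the decimal point
    compact = compact.replace(',', '.')
    return float(compact) if '.' in compact else int(compact)

text = "5 000, 6 123 456,345 and 20 30"
numbers = [to_number(m) for m in NUMBER_RE.findall(text)]
print(numbers)
```

If your input can also use , or . as a thousands separator, this conversion would need a different rule.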
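import re
# Assumes separate numbers are delimited by two spaces;
# a single space only groups digits inside one number
number = '20 300  4 100  400  50'
res = re.findall(r'(\d*\s*)', number)
# Rejoin the pieces, then split on the two-space delimiter
res = ''.join(res).split('  ')
print(list(map(lambda x: int(x.replace(' ', '')), res)))
Output:
[20300, 4100, 400, 50]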