Hi, I would like to extract just the floats from a string:
str = "Test string 1.234 0.155.1 5.67799350,-2.654657"
The outcome should be:
[1.234, 5.67799350, -2.654657]
I was using [-+]?\d*\.\d+|\d+
but it also matches pieces of 0.155.1, which I don't want.
import re
floats = re.findall(r"[-+]?\d*\.\d+|\d+", str)
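For reference, here is a minimal reproduction of the problem, run on the sample string from above (the variable names `text` and `matches` are just for illustration). It shows that the pattern splits 0.155.1 into two separate matches instead of skipping it:

```python
import re

text = "Test string 1.234 0.155.1 5.67799350,-2.654657"

# The original pattern: optional sign, optional digits, dot, digits, OR plain digits.
matches = re.findall(r"[-+]?\d*\.\d+|\d+", text)

print(matches)
# ['1.234', '0.155', '.1', '5.67799350', '-2.654657']
```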
Thanks for reading.
I believe you found your pattern here? Either way, a negative lookbehind and lookahead may work for you and give a more robust pattern:
(?<!\.)[-+]?\b\d+\.\d+(?!\.)\b
See the Online Demo
Pattern breakdown:
(?<!\.) - Negative lookbehind for a literal dot.
[-+]? - Optional plus or minus sign.
\b - Word-boundary.
\d+\.\d+ - One or more digits, a literal dot and again one or more digits.
(?!\.) - Negative lookahead for a literal dot.
\b - Word-boundary.
Python sample code:
import re
str = 'Test string 1.234 0.155.1 5.67799350,-2.654657'
lst = [float(i) for i in re.findall(r'(?<!\.)[-+]?\b\d+\.\d+(?!\.)\b', str)]
print(lst)
Result >>
[1.234, 5.6779935, -2.654657]
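The same pattern can also be written with re.VERBOSE so that each piece of the breakdown sits next to the part it describes. This is only a sketch: the names `FLOAT_RE`, `extract_floats` and `sample` are made up for the example and are not part of the original answer.

```python
import re

# Same pattern as above, spread over several lines with re.VERBOSE
# so that every component carries its own comment.
FLOAT_RE = re.compile(
    r"""
    (?<!\.)      # negative lookbehind: no literal dot right before the match
    [-+]?        # optional plus or minus sign
    \b           # word boundary
    \d+\.\d+     # one or more digits, a literal dot, one or more digits
    (?!\.)       # negative lookahead: no literal dot right after the number
    \b           # word boundary
    """,
    re.VERBOSE,
)


def extract_floats(text):
    """Return every match of FLOAT_RE in text, converted to float."""
    return [float(m) for m in FLOAT_RE.findall(text)]


sample = "Test string 1.234 0.155.1 5.67799350,-2.654657"
print(extract_floats(sample))
# [1.234, 5.6779935, -2.654657]
```

Compiling the pattern once is mostly a readability choice here; it also avoids repeating the raw string if the pattern is used in several places.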
Use
[-+]?\b(?<!\d\.)\d+\.\d+\b(?!\.\d)
See proof
Alternative to match floats without an integer part (.59) and when glued to word characters (_4.567):
[-+]?(?<!\d\.)(?<!\d)\d*\.\d+(?!\.?\d)
See another proof
The first pattern matches an optional plus or minus sign, one or more digits, a dot, and one or more digits, wrapped in word boundaries and not sitting between a digit-dot and a dot-digit.
Python:
import re
text = 'Test string 1.234 0.155.1 5.67799350,-2.654657'
print([float(i) for i in re.findall(r"[-+]?\b(?<!\d\.)\d+\.\d+\b(?!\.\d)", text)])
Result:
[1.234, 5.6779935, -2.654657]
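To see how the two patterns differ, here is a small side-by-side sketch. The names `STRICT` and `LOOSE` and the `sample` string are invented for this comparison; only the two regular expressions come from the answer above.

```python
import re

# Pattern with word boundaries: requires an integer part and a standalone number.
STRICT = re.compile(r"[-+]?\b(?<!\d\.)\d+\.\d+\b(?!\.\d)")

# Alternative pattern: the integer part is optional and word boundaries are not required.
LOOSE = re.compile(r"[-+]?(?<!\d\.)(?<!\d)\d*\.\d+(?!\.?\d)")

# Hypothetical input covering the cases mentioned above.
sample = "x .59 _4.567 0.155.1 -2.5"

print(STRICT.findall(sample))  # ['-2.5']
print(LOOSE.findall(sample))   # ['.59', '4.567', '-2.5']
```

Both patterns skip 0.155.1; the alternative additionally picks up .59 and the 4.567 that is glued to an underscore.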
Try (?<![.\d])[-+]?(?:\d+(?:\.\d*)?|\.\d+)(?![.\d])