
Extract Floats from a String - Regex - Python

Hi, I would like to be able to extract just the floats from a string:

str = "Test string 1.234 0.155.1 5.67799350,-2.654657"

Outcome should be

[1.234, 5.67799350, -2.654657]

I was using [-+]?\d*\.\d+|\d+ but it also picks up parts of 0.155.1, which I don't want.

import re
floats = re.findall(r"[-+]?\d*\.\d+|\d+", str)
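To make the problem concrete, here is a small check of what the pattern from the question returns on the sample string. The names `text`, `pattern` and `matches` are only illustrative. Since nothing in the pattern looks at the characters around a match, `0.155.1` is not skipped but split into `0.155` and `.1`.

```python
import re

text = "Test string 1.234 0.155.1 5.67799350,-2.654657"

# The pattern from the question: an optional sign, optional digits,
# a dot and digits, or else a bare run of digits.
pattern = r"[-+]?\d*\.\d+|\d+"

matches = re.findall(pattern, text)
print(matches)
# ['1.234', '0.155', '.1', '5.67799350', '-2.654657']
```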

Thanks for reading.

I believe you found your code here? Either way, a negative lookbehind and a negative lookahead may work for you and give a more robust pattern:

(?<!\.)[-+]?\b\d+\.\d+(?!\.)\b

See the Online Demo


Pattern breakdown:

  • (?<!\.) - Negative lookbehind for a literal dot.
  • [-+]? - Optional plus or minus sign.
  • \b - Word boundary.
  • \d+\.\d+ - One or more digits, a literal dot and again one or more digits.
  • (?!\.) - Negative lookahead for a literal dot.
  • \b - Word boundary.
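The breakdown above can also be written directly into the pattern with `re.VERBOSE`, which ignores whitespace inside the pattern and allows `#` comments. This is only a sketch of the same regex in annotated form; the name `FLOAT_RE` is an illustrative choice, not something from the answer.

```python
import re

FLOAT_RE = re.compile(
    r"""
    (?<!\.)     # not preceded by a literal dot
    [-+]?       # optional plus or minus sign
    \b          # word boundary before the integer part
    \d+\.\d+    # digits, a literal dot, digits
    (?!\.)      # not followed by a literal dot
    \b          # word boundary after the fractional part
    """,
    re.VERBOSE,
)

text = "Test string 1.234 0.155.1 5.67799350,-2.654657"
found = FLOAT_RE.findall(text)
print(found)  # ['1.234', '5.67799350', '-2.654657']
```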



Python sample code:

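import re

# Avoid naming the variable `str`, which would shadow the built-in type.
text = 'Test string 1.234 0.155.1 5.67799350,-2.654657'
# Find every standalone float and convert the matched strings to float.
lst = [float(i) for i in re.findall(r'(?<!\.)[-+]?\b\d+\.\d+(?!\.)\b', text)]
print(lst)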

Result:

[1.234, 5.6779935, -2.654657]

Use

[-+]?\b(?<!\d\.)\d+\.\d+\b(?!\.\d)

See proof

Alternative that also matches floats without an integer part (.59) and floats glued to word characters (_4.567):

[-+]?(?<!\d\.)(?<!\d)\d*\.\d+(?!\.?\d)

See another proof

The first pattern matches an optional plus or minus sign, then one or more digits, a dot and one or more digits, wrapped in word boundaries. It rejects a candidate that is preceded by a digit and a dot or followed by a dot and a digit.
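To illustrate the difference between the two patterns in this answer, the sketch below runs both on a made-up string that contains a float without an integer part and a float glued to an underscore. The names `strict`, `loose` and `sample`, and the sample string itself, are assumptions for the example only.

```python
import re

# First pattern: requires an integer part and word boundaries.
strict = re.compile(r"[-+]?\b(?<!\d\.)\d+\.\d+\b(?!\.\d)")
# Alternative: integer part optional, no word boundaries required.
loose = re.compile(r"[-+]?(?<!\d\.)(?<!\d)\d*\.\d+(?!\.?\d)")

sample = "x .59 _4.567 0.155.1 -2.5"

strict_hits = strict.findall(sample)
loose_hits = loose.findall(sample)

print(strict_hits)  # ['-2.5']
print(loose_hits)   # ['.59', '4.567', '-2.5']
```

Both patterns skip `0.155.1`; only the alternative picks up `.59` and the `4.567` that follows an underscore.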

Python:

import re 
text = 'Test string 1.234 0.155.1 5.67799350,-2.654657'
print([float(i) for i in re.findall(r"[-+]?\b(?<!\d\.)\d+\.\d+\b(?!\.\d)", text)])

Result:

[1.234, 5.6779935, -2.654657]

Try (?<![.\d])[-+]?(?:\d+(?:\.\d*)?|\.\d+)(?![.\d])

demo
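A minimal sketch of how this suggestion behaves, assuming the pattern is meant to read `(?<![.\d])[-+]?(?:\d+(?:\.\d*)?|\.\d+)(?![.\d])`. That reading is a reconstruction and is not confirmed by the answer itself; `number_re` is an illustrative name.

```python
import re

# Assumed reading of the pattern suggested above (a reconstruction).
number_re = re.compile(r"(?<![.\d])[-+]?(?:\d+(?:\.\d*)?|\.\d+)(?![.\d])")

text = "Test string 1.234 0.155.1 5.67799350,-2.654657"
found = number_re.findall(text)
print(found)  # ['1.234', '5.67799350', '-2.654657']

# Under this reading the pattern also accepts plain integers.
integers = number_re.findall("7 apples")
print(integers)  # ['7']
```

If integers must be excluded, one of the stricter patterns from the other answers is probably a better fit.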
