I'm able to get the numbers out of this string:
string_p= 'seven 5 blah 6 decimal 6.5 thousands 8,999 with dollar signs $9,000 and $9,500,001.45 end ... lastly.... 8.4% now end
with this code:
import re
def extractVal2(s,n):
if n > 0:
return re.findall(r'[0-9$,.%]+\d*', s)[n-1]
else:
return re.findall(r'[0-9$,.%]+\d*', s)[n]
for i in range(1,7):
print extractVal2(string_n,i)
but I can't do negative numbers with it. The negative numbers are the ones in the parenthesis.
string_n= 'seven (5) blah (6) decimal (6.5) thousands (8,999) with dollar signs $(9,000) and $(9,500,001.45) end lastly.... (8.4)% now end'
I have tried to first replace the ()
with a negative sign like so
string_n= re.sub(r"\\((\\d*,?\\d*)\\)", r"-\\1", string_n)
then these to get the negative number
r'[0-9$,.%-]+\d*', s)[n]
r'[0-9$,.%]+-\d*', s)[n]
r'[-0-9$,.%]+-\d*', s)[n]
And even using a different approach:
words = string_n.split(" ")
for i in words:
try:
print -int(i.translate(None,"(),"))
except:
pass
You could change your regex to this:
import re
def extractVal2(s,n):
try:
pattern = r'\$?\(?[0-9][0-9,.]*\)?%?'
if n > 0:
return re.findall(pattern, s)[n-1].replace("(","-").replace(")","")
else:
return re.findall(pattern, s)[n].replace("(","-").replace(")","")
except IndexError as e:
return None
string_n= ',seven (5) blah (6) decimal (6.5) thousands (8,999) with dollar ' + \
'signs $(9,000) and $(9,500,001.45) end lastly.... (8.4)%'
for i in range(1,9):
print extractVal2(string_n,i)
It would parse the 9,500,001.45
as well - and captures a leading (
after $
and before numbers and replaces it with a -
sign. Its a hack though - it does not "see" if your (
is without an )
and would also capture "illegal" numbers like 2,200.200,22
.
Output:
-5
-6
-6.5
-8,999
$-9,000
$-9,500,001.45
-8.4%
None
You also should maybe think about catching the IndexError
if your re.findall(..)
does not capture anything (or too few ones) - and you are indexing behind the list returned.
The regex allows:
leading literal $ (not interpreded as ^...$ end of string)
optional literal (
[0-9] one digit
[0-9,.%]* any number (maybe 0 times) of the included characters in any order
to the extend that it would mach smth like 000,9.34,2
optional literal )
optional literal %
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.