How to use regex to reference specific parts?

I have a Python string containing information that I want to pull out using regex.

Example:

"The weather is 75 degrees with a humidity of 13%"

I want to pull out just the "75" and the "13". Here is what I've tried so far in Python.

import re

str = "The weather is 75 degrees with a humidity of 13%"
m = re.search(r"The weather is \d+ degrees with a humidity of \d+%", str)
matched = m.group()

However, this matches the entire string instead of just the parts I want. How do I pull out just the numbers? I've looked into backreferences, but they seem to apply only within the regex pattern itself.

m = re.search(r"The weather is (\d+) degrees with a humidity of (\d+)%", str)
matched = m.groups()

You need to wrap the parts you want in parentheses, which makes them capturing groups:

>>> import re
>>> s1 = "The weather is 75 degrees with a humidity of 13%"
>>> m = re.search(r"The weather is (\d+) degrees with a humidity of (\d+)%", s1)
>>> m.groups()
('75', '13')
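
Two things the snippet above leaves out: re.search returns None when the pattern does not match, and the groups come back as strings. A minimal sketch that handles both, assuming you want integers (extract_readings and PATTERN are hypothetical names made up for this example):

```python
import re

# Hypothetical names for illustration; the pattern is the one used above.
PATTERN = re.compile(r"The weather is (\d+) degrees with a humidity of (\d+)%")

def extract_readings(text):
    """Return (temperature, humidity) as ints, or None if the text does not match."""
    m = PATTERN.search(text)
    if m is None:
        # re.search returns None on failure, so calling m.group() would raise.
        return None
    # group(0) is the whole match; group(1) and group(2) are the parenthesized parts.
    return int(m.group(1)), int(m.group(2))

readings = extract_readings("The weather is 75 degrees with a humidity of 13%")
missing = extract_readings("No weather report today")
print(readings)  # (75, 13)
print(missing)   # None
```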

Or use findall to get the numbers out of any string:

>>> re.findall(r"\d+", s1)
['75', '13']
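
A caveat worth keeping in mind with findall: it returns every run of digits in the string, not only the two you care about, and its return value changes shape once the pattern contains capturing groups. A short sketch of both behaviours (the s2 string is made up for illustration):

```python
import re

s1 = "The weather is 75 degrees with a humidity of 13%"

# Convert every run of digits to an int.
numbers = [int(n) for n in re.findall(r"\d+", s1)]

# findall picks up *every* number, including ones you may not want.
s2 = "At 9am the weather is 75 degrees with a humidity of 13%"
all_numbers = re.findall(r"\d+", s2)

# With two capturing groups, findall returns a list of tuples of the groups
# rather than the full matched text.
pairs = re.findall(r"(\d+) degrees with a humidity of (\d+)%", s2)

print(numbers)      # [75, 13]
print(all_numbers)  # ['9', '75', '13']
print(pairs)        # [('75', '13')]
```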

Maybe you wanted to use named groups?

>>> m = re.search(r"The weather is (?P<temp>\d+) degrees with a humidity of (?P<humidity>\d+)%", s1)
>>> m.group('temp')
'75'
>>> m.group('humidity')
'13'
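
If you want all named groups at once, the match object's groupdict() method returns them as a dictionary. A small sketch, assuming you also want the values converted to integers:

```python
import re

s1 = "The weather is 75 degrees with a humidity of 13%"
m = re.search(
    r"The weather is (?P<temp>\d+) degrees with a humidity of (?P<humidity>\d+)%",
    s1,
)

# groupdict() maps each group name to the text it matched.
raw = m.groupdict()

# The matched values are strings; convert them if you need numbers.
values = {name: int(text) for name, text in raw.items()}

# Named groups can still be read by position as well.
first = m.group(1)

print(raw)     # {'temp': '75', 'humidity': '13'}
print(values)  # {'temp': 75, 'humidity': 13}
print(first)   # 75
```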

When you want to extract typed data from text, such as numbers, parse (a third-party package, installable with pip install parse) is an extremely useful library. In many ways it is the inverse of string formatting: it takes a format-style pattern and can do type conversions for you.

At its simplest, it allows you to avoid worrying about regular expression groups and so forth.

>>> from parse import parse
>>> s = "The weather is 75 degrees with a humidity of 13%"
>>> parse("The weather is {} degrees with a humidity of {}%", s)
<Result ('75', '13') {}>

The Result object is pretty easy to work with:

>>> r = _
>>> r[0]
'75'

We can do better than this by specifying field names and/or type conversions. Here is all we need to do to have the results as integers:

>>> parse("The weather is {:d} degrees with a humidity of {:d}%", s)
<Result (75, 13) {}>

If we want to use non-index keys, then add field names:

>>> parse("The weather is {temp:d} degrees with a humidity of {humidity:d}%", s)
<Result () {'temp': 75, 'humidity': 13}>
>>> r = _
>>> r['temp']
75
