Regular expression from regexr.com doesn't work in Python

I have the following function:

def get_temperature(s):
        parts = re.findall(r'([+-]?\d+(\.\d+)*)\s?°([CcFf])', s)
        for a in range(len(parts)):
                s = s.replace(parts[a], "qTEMPq")
        return s

The function takes a string s as input and returns a string.

So if the input is a string like "It is +25°C outside.", the output should be "It is qTEMPq outside."

The regular expression I am using to extract temperatures (Celsius or Fahrenheit) from the string should find substrings such as 40°F, +30°C, -35 °C, etc. It works perfectly on regexr.com, but not in my code.

What might be the problem, and how can I solve it?

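If your regex contains more than one capturing group (...), findall returns a list of tuples (one tuple per match, one element per group) instead of a list of strings. s.replace then fails because it expects a string, not a tuple.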
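To see this concretely, it helps to print what findall returns for the pattern from the question. The snippet below is only an illustrative sketch; the variable names are made up, and the sample sentence is the one from the question.

```python
import re

# Pattern from the question: three capturing groups (number, fraction, unit).
PATTERN = r'([+-]?\d+(\.\d+)*)\s?°([CcFf])'

text = "It is +25°C outside."
parts = re.findall(PATTERN, text)
print(parts)  # [('+25', '', 'C')]

# str.replace expects a string, so passing one of these tuples raises TypeError.
error = None
try:
    text.replace(parts[0], "qTEMPq")
except TypeError as exc:
    error = exc
print("replace failed:", error)
```

The empty string in the middle of the tuple is the inner group (\.\d+), which did not take part in the match.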

If you want a list of strings, make the groups non-capturing with (?:...), as in:

import re
def get_temperature(s):
        parts = re.findall(r'(?:[+-]?\d+(?:\.\d+)*)\s?°(?:[CcFf])', s)
        for a in range(len(parts)):
                s = s.replace(parts[a], "qTEMPq")
        return s
get_temperature('40.5°F')
# 'qTEMPq'
get_temperature('100°F is nearly 37°C')
# 'qTEMPq is nearly qTEMPq'
get_temperature("It is +25°C outside.")
# 'It is qTEMPq outside.'
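One caveat with the replace loop, as far as I can tell: str.replace substitutes every occurrence of the substring, so a shorter match can clobber part of a longer one. The sketch below compares the loop with re.sub, which substitutes only at the matched positions. The function names replace_loop and replace_sub are hypothetical, chosen for this illustration.

```python
import re

PATTERN = r'(?:[+-]?\d+(?:\.\d+)*)\s?°(?:[CcFf])'

def replace_loop(s):
    # Same approach as above: find all matches, then replace each as a plain substring.
    for part in re.findall(PATTERN, s):
        s = s.replace(part, "qTEMPq")
    return s

def replace_sub(s):
    # Let the regex engine substitute each match in place.
    return re.sub(PATTERN, "qTEMPq", s)

text = "5°C today, 25°C tomorrow"
print(replace_loop(text))  # qTEMPq today, 2qTEMPq tomorrow
print(replace_sub(text))   # qTEMPq today, qTEMPq tomorrow
```

Here '5°C' is replaced first, which also eats the tail of '25°C' and leaves a stray '2' behind.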

If you want to access the parts of the temperature instead, keep capturing groups around the value and the unit only, so that each match becomes a (value, unit) tuple:

def get_temperature(s):
        parts = re.findall(r'([+-]?\d+(?:\.\d+)*)\s?°([CcFf])', s)
        return parts

get_temperature("It is +25°C outside.")
#[('+25', 'C')]
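If both the whole match and its parts are needed, re.finditer might be a convenient alternative: it yields match objects, so group(0) is the full temperature while the capturing groups hold the value and the unit. This is a minimal sketch; the name iter_temperatures is hypothetical and not part of the answer.

```python
import re

TEMP_RE = re.compile(r'([+-]?\d+(?:\.\d+)*)\s?°([CcFf])')

def iter_temperatures(s):
    """Yield (whole_match, value, unit) for each temperature found in s."""
    for m in TEMP_RE.finditer(s):
        # group(0) is the entire match; groups 1 and 2 are value and unit.
        yield m.group(0), m.group(1), m.group(2).upper()

result = list(iter_temperatures("100°F is nearly 37 °c"))
print(result)  # [('100°F', '100', 'F'), ('37 °c', '37', 'C')]
```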

Or, if you just want each whole temperature as a string:

def get_temperature(s):
        parts = re.findall(r'(?:[+-]?\d+(?:\.\d+)*)\s?°(?:[CcFf])', s)
        return parts
get_temperature('100°F is nearly 37°C')
# ['100°F', '37°C']

import re
def get_temperature(s):
    return re.sub(r'[+-]?\d+\.*\d*\s?°[CcFf]', 'qTEMPq', s)

Is this what you're looking for?

I solved the problem by using "\xb0" instead of "°". It was an encoding issue. So instead of the expression '[+-]?\d+\.*\d*\s?°[CcFf]', I used '[+-]?\d+\.*\d*\s?\xb0[CcFf]'.
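A way to check whether encoding is the culprit is to write the degree sign as an escape, so the pattern no longer depends on how the source file is saved. The sketch below assumes Python 3, where str is Unicode; the interpreter version used in the question is not stated, so treat this as an illustration rather than a reproduction of the original setup.

```python
import re

# In Python 3 the escape '\xb0' and the literal degree sign are the same character.
same_char = '\xb0' == '°'
print(same_char)  # True

# Inside a raw string, the re module itself interprets \xb0 as U+00B0,
# so the source file's encoding does not affect the pattern.
TEMP_RE = re.compile(r'[+-]?\d+\.*\d*\s?\xb0[CcFf]')

def get_temperature(s):
    return TEMP_RE.sub('qTEMPq', s)

print(get_temperature("It is +25\xb0C outside."))  # It is qTEMPq outside.
```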
