
Python: re.findall and re.sub with '^'

I'm trying to transform a string like s='2.3^2+3^3-√0.04*2+√4', where 2.3^2 has to become pow(2.3,2), 3^3 becomes pow(3,3), √0.04 becomes sqrt(0.04) and √4 becomes sqrt(4).

import re

s='2.3^2+3^3-√0.04*2+√4'
patt1='[0-9]+\.[0-9]+\^[0-9]+|[0-9]+\^[0-9]'
patt2='√[0-9]+\.[0-9]+|√[0-9]+'
idx1=re.findall(patt1, s)
idx2=re.findall(patt2, s)
idx11=[]
idx22=[]
for i in range(len(idx1)):
    idx11.append('pow('+idx1[i][:idx1[i].find('^')]+','+idx1[i][idx1[i].find('^')+1:]+')')

for i in range(len(idx2)):
    idx22.append('sqrt('+idx2[i][idx2[i].find('√')+1:]+')')

for i in range(len(idx11)):
    s=re.sub(idx1[i], idx11[i], s)

for i in range(len(idx22)):
    s=re.sub(idx2[i], idx22[i], s)

print(s)

Intermediate results:

idx1=['2.3^2', '3^3']
idx2=['√0.04', '√4']
idx11=['pow(2.3,2)', 'pow(3,3)']
idx22=['sqrt(0.04)', 'sqrt(4)']

But the resulting string is:

2.3^2+3^3-sqrt(0.04)*2+sqrt(4)

Why is 'idx1' computed correctly, but re.sub doesn't insert the replacement values into the string? (Sorry for my English :)

Try this, using only re.sub().

Input string:

s='2.3^2+3^3-√0.04*2+√4'

Replacing with pow():

s = re.sub(r"(\d+(?:\.\d+)?)\^(\d+)", "pow(\\1,\\2)", s)

Replacing with sqrt():

s = re.sub(r"√(\d+(?:\.\d+)?)", "sqrt(\\1)", s)

Output:

pow(2.3,2)+pow(3,3)-sqrt(0.04)*2+sqrt(4)
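For convenience, the two substitutions above can be wrapped in one function. This is only a minimal sketch: the name to_python_expr is made up for illustration, and it assumes operands are plain unsigned decimals such as 3 or 0.04, as in the question.

```python
import re

def to_python_expr(expr):
    """Rewrite a^b as pow(a,b) and √a as sqrt(a) (hypothetical helper)."""
    # (\d+(?:\.\d+)?) captures an integer or decimal base; (\d+) the exponent.
    expr = re.sub(r"(\d+(?:\.\d+)?)\^(\d+)", r"pow(\1,\2)", expr)
    # Same number pattern, this time preceded by the square-root sign.
    expr = re.sub(r"√(\d+(?:\.\d+)?)", r"sqrt(\1)", expr)
    return expr

result = to_python_expr("2.3^2+3^3-√0.04*2+√4")
print(result)  # pow(2.3,2)+pow(3,3)-sqrt(0.04)*2+sqrt(4)
```

Nested expressions such as (1+2)^3 are not handled by this pattern; they would need a real parser.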

( ) denotes a capturing group, and \\1 in the replacement string refers to the first group captured by the regex match (the backslash is doubled because the replacement is not a raw string). Using this link you can get a detailed explanation of the regex.

I've only got Python 2.7.5, but this works for me, using str.replace rather than re.sub. Once you've gone to the effort of finding the matches and constructing their replacements, this is a simple find-and-replace job:

for i in range(len(idx11)):
    s = s.replace(idx1[i], idx11[i])

for i in range(len(idx22)):
    s = s.replace(idx2[i], idx22[i])

Edit

I think you're going about this in quite a long-winded way. You can use re.sub in one go to make these changes:

s = re.sub(r'(\d+(\.\d+)?)\^(\d+)', r'pow(\1,\3)', s)

This replaces 2.3^2+3^3 with pow(2.3,2)+pow(3,3), and:

s = re.sub(r'√(\d+(\.\d+)?)', r'sqrt(\1)', s)

This replaces √0.04*2+√4 with sqrt(0.04)*2+sqrt(4).

There are a few things going on here that are different. First, \d matches a digit, much like [0-9]. Second, the parentheses ( ) capture whatever is inside them. In the replacement, you can refer to these captured groups by the order in which they appear. In the pow example I'm using the first and third groups that I have captured; the second group is the optional fractional part (\.\d+), which is already contained in the first.
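To see how the groups are numbered, it can help to inspect a match directly. The sketch below compares the pattern with a capturing fractional group against a variant with a non-capturing (?:...) group, where the exponent becomes group 2 instead of group 3:

```python
import re

# Capturing fractional part: groups are base, fraction, exponent.
m = re.search(r"(\d+(\.\d+)?)\^(\d+)", "2.3^2")
print(m.groups())   # ('2.3', '.3', '2')

# Non-capturing fractional part: groups are base, exponent.
m2 = re.search(r"(\d+(?:\.\d+)?)\^(\d+)", "2.3^2")
print(m2.groups())  # ('2.3', '2')
```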

The prefix r before the replacement string means that the string is treated as "raw", so backslashes are taken literally. The groups are accessed by \1, \2, etc., but because the backslash \ is an escape character, without the r I would have to escape it each time (\\1, \\2, etc.).
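A short check of the raw-string point, as a sketch: both spellings produce the same two-character replacement reference, while an unescaped, non-raw "\1" is a single control character and would not refer to a group.

```python
import re

# Raw and double-backslash forms are the same two-character string.
same = (r"\1" == "\\1")

# Without r and without doubling, "\1" is the control character chr(1).
is_control_char = ("\1" == chr(1))

raw_result = re.sub(r"(\d+)\^(\d+)", r"pow(\1,\2)", "3^3")
escaped_result = re.sub(r"(\d+)\^(\d+)", "pow(\\1,\\2)", "3^3")

print(same, is_control_char)       # True True
print(raw_result, escaped_result)  # pow(3,3) pow(3,3)
```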
