How can I fix this RegEx pattern so that it extracts every occurrence of a substring matching the pattern?

Question

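My goal with this code is to replace only the substrings that are preceded and followed by a specific pattern (I use a RegEx to define that pattern).

I have tried many approaches without good results. Here I use the compile() method to compile the RegEx pattern into a regular expression object, so that I can extract, one by one, the substrings of the input string that satisfy the pattern and that I want to modify.

Then I can simply use the replace() function to replace each extracted substring with the substring I want.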

import re

input_text = "y creo que hay 55 y 6 casas, y quizas alguna mas... yo creo que empezaria entre la 1 ,y las 27"

#the string with which I will replace the desired substrings in the original input string
content_fix = " "

##This is the regex pattern that tries to establish the condition in which the substring should be replaced by the other
#pat = re.compile(r"\b(?:previous string)\s*string that i need\s*(?:string below)?", flags=re.I, )
#pat = re.compile(r"\d\s*(?:y)\s*\d", flags=re.I, )
pat = re.compile(r"\d\s*(?:, y |,y |y )\s*(?:las \d|la \d|\d)", flags=re.I, )

x = pat.findall(input_text)
print(*map(str.strip, x), sep="\n") #it will print the substrings, which it will try to replace in the for cycle
content_list = []
content_list.append(list(map(str.strip, x)))
for content in content_list[0]:
    input_text = input_text.replace(content, content_fix) # "\d y \d"  ---> "\d \d"

print(repr(input_text))

This is the output I get:

'y creo que hay 5  casas, y quizas alguna mas... yo creo que empezaria entre la  7'

This is the correct output I need:

'y creo que hay 55 6 casas, y quizas alguna mas... yo creo que empezaria entre la 1 27'

What changes should I make to my RegEx so that it extracts the correct substrings and serves the purpose of this code?

Answer 1

I came up with something; this is the best I could get :). You may find a way to improve it.

import re

input_text = "y creo que hay 55 y 6 casas, y quizas alguna mas... yo creo que empezaria entre la 1 ,y las 27"

print(re.sub(r"(?<=\d).+?(?=\d)", " ", input_text))

The output will look like this:

Maybe you will find a way to improve the expression, or someone else will.
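One possible improvement, offered as a sketch rather than as part of the original answer: `.+?` accepts any text that lies between two digits, so it also swallows the second `5` of `55` and the whole phrase between `6` and `1`. The variant below keeps the lookarounds but allows only the connector between them. The name `connector` and the exact set of connectors (an optional comma, `y`, an optional `la`/`las`) are assumptions taken from the question's sample text.

```python
import re

input_text = ("y creo que hay 55 y 6 casas, y quizas alguna mas... "
              "yo creo que empezaria entre la 1 ,y las 27")

# Original idea: anything between two digits is replaced, however long it is.
broad = re.sub(r"(?<=\d).+?(?=\d)", " ", input_text)

# Assumed refinement (not from the answer): between the lookarounds, allow only
# an optional comma, the word "y", and an optional "la"/"las".
connector = re.compile(r"(?<=\d)\s*,?\s*y\s+(?:las?\s+)?(?=\d)", flags=re.I)
narrow = connector.sub(" ", input_text)

print(repr(broad))
print(repr(narrow))
```

Because the digits are only inspected by the lookarounds and never consumed, they stay in the text, and a chain such as `1 y 2 y 3` is handled in a single pass.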

Answer 2

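import re

input_text = "y creo que hay 55 y 6 casas, y quizas alguna mas... \
yo creo que empezaria entre la 1 ,y las 27"

# Each alternative captures its two numbers; groups from the alternative that
# did not match expand to an empty string in the replacement (Python 3.5+).
print(re.sub(r'((\d+\s+)y\s+(\d+))| ((\d+\s+),y\s+\w{3}\s+(\d+))', r'\2\3 \5\6', input_text))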


y creo que hay 55 6  casas, y quizas alguna mas... yo creo que empezaria entre la 1 27
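The same capture-group idea can be sketched with a single pattern, which avoids the doubled space before `casas` in the output above. This is an illustrative variant, not the answerer's code: the name `pair` and the choice of connectors (an optional comma, `y`, an optional `la`/`las`) are assumptions based on the sample sentence.

```python
import re

input_text = ("y creo que hay 55 y 6 casas, y quizas alguna mas... "
              "yo creo que empezaria entre la 1 ,y las 27")

# Hypothetical single pattern: left number, optional comma, "y",
# optional "la"/"las", right number. Only the two numbers are captured.
pair = re.compile(r"(\d+)\s*,?\s*y\s+(?:las?\s+)?(\d+)", flags=re.I)

# Keep both numbers and put exactly one space between them.
result = pair.sub(r"\1 \2", input_text)
print(repr(result))
```

Since the right-hand number is consumed by each match, a chain such as `1 y 2 y 3` is only partly rewritten by this sketch; replacing the second group with a lookahead `(?=\d)` and the replacement with `r"\1 "` would be one way around that.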

Answer 1
1 2022-08-19 12:46:02

Answer 2
1 Accepted 2022-08-19 18:04:56
