Set a regex pattern so that, when the pattern is found in a string, part of the match is replaced by another substring
How can I fix this regex so that these input strings produce the outputs shown below?
out = re.sub(r"(hs|h.s|h.s.)a m(\W|\b)", r"\1 am\2", out)
print(repr(out))
Input string examples:
#example 1.1
colloquial_hour = "Cerca de las 2: hs a m, hay que salir antes de esas hs a m"
#example 1.2
colloquial_hour = "A medida que avance cerca de la media noche 12: 04 hs a m. Deben ir a las 15 hs a m."
#example 1.3
colloquial_hour = "A mmm... cerca de las 12: h.s a m, hay que salir antes de esas h.s. a m"
#example 1.4
colloquial_hour = "A medida que avance cerca de las 12:04 hs. a m. Deben ir a las 15 h.s a m."
Correct outputs:
#correct output for example 1.1
"Cerca de las 2: hs am, hay que salir antes de esas hs a m"
#correct output for example 1.2
"A medida que avance cerca de la media noche 12: 04 hs am. Deben ir a las 15 hs am."
#correct output for example 1.3
"A mmm... cerca de las 12: h.s am, hay que salir antes de esas h.s. a m"
#correct output for example 1.4
"A medida que avance cerca de las 12:04 hs. am. Deben ir a las 15 h.s am."
The logic should be: if there is a numeric value followed by "a m", replace that "a m" substring with the string "am" in the original string.
These would be all the possible cases where you have to replace the substring "a m" with "am":
X a m
X: a m
X: hs a m
X: h.s. a m
X: h.s a m
X: hs. a m
X : a m
X : hs a m
X : h.s. a m
X : h.s a m
X : hs. a m
X hs a m
X h.s. a m
X h.s a m
X hs. a m
#where "X" is a numerical value ("1", "2", "3", "4", "5", "6", ... )
#in all these cases, in which this pattern is met, "a m" must be replaced by "am"
You could match:
(\d+\s*:?\s*(?:h\.?s\.?)?)\s*a m\b
The pattern matches:
( : Capture group 1
\d+\s*:?\s* : Match 1+ digits and an optional : between optional whitespace chars
(?:h\.?s\.?)? : Optionally match hs, h.s, hs. or h.s.
) : Close group 1
\s*a m\b : Match optional whitespace chars and a m
And replace with group 1 followed by am:
\1 am
See a regex demo and a Python demo.
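As a minimal sketch of how this pattern might be used from Python, the snippet below wraps it in a helper and runs it on example 1.1 from the question. The helper name `join_am` is made up for illustration and is not part of the answer. One thing to be aware of: the replacement `\1 am` adds its own space, so when no `hs` token is present (for example `2 a m`), group 1 already ends in whitespace and the result contains two spaces.

```python
import re

# Pattern from the answer above: digits, optional colon,
# optional "hs" spelling, then the literal "a m".
PATTERN = re.compile(r"(\d+\s*:?\s*(?:h\.?s\.?)?)\s*a m\b")

def join_am(text: str) -> str:
    """Hypothetical helper: rewrite "a m" as "am" when it follows a number."""
    return PATTERN.sub(r"\1 am", text)

print(join_am("Cerca de las 2: hs a m, hay que salir antes de esas hs a m"))
```

The second `hs a m` in that sentence is left alone because no digit precedes it.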
You can search using regex:
(\d\W+)(h\.?s\.?\s+)?a\s+m\b
and replace using:
\1\2am
RegEx Details:
(\d\W+) : Match a digit followed by 1+ non-word chars in capture group #1
(h\.?s\.?\s+)? : Match h followed by s with optional dots after them, then 1+ whitespaces. This optional group is capture group #2
a\s+m\b : Match a followed by 1+ whitespaces then m with a word boundary
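A small sketch of this search-and-replace in Python, assuming `re.sub` is the tool being used (as in the question). The function name `fix_am` is hypothetical. When the optional group #2 does not take part in the match, Python 3.5 and later substitute an empty string for `\2`, so the same replacement works with or without an `hs` token.

```python
import re

# Search pattern and replacement string from the answer above.
SEARCH = re.compile(r"(\d\W+)(h\.?s\.?\s+)?a\s+m\b")
REPLACEMENT = r"\1\2am"

def fix_am(text: str) -> str:
    """Hypothetical helper: join "a m" into "am" after a digit."""
    return SEARCH.sub(REPLACEMENT, text)

# Example 1.3 from the question, plus a case without any "hs" token.
print(fix_am("A mmm... cerca de las 12: h.s a m, hay que salir antes de esas h.s. a m"))
print(fix_am("Son las 3 a m"))
```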
My solution uses re.sub:
import re
phrases = ["Cerca de las 2: hs a m, hay que salir antes de esas hs a m",
"A medida que avance cerca de la media noche 12: 04 hs a m. Deben ir a las 15 hs a m.",
"A mmm... cerca de las 12: h.s a m, hay que salir antes de esas h.s. a m",
"A medida que avance cerca de las 12:04 hs. a m. Deben ir a las 15 h.s a m."]
pattern = re.compile(r'\d\s*?:?\s*?h?\.?s?\.?\s(a m)')
for phrase in phrases:
    print(pattern.sub(lambda x: x.group(0)[:-3] + "am", phrase))
Output:
Cerca de las 2: hs am, hay que salir antes de esas hs a m
A medida que avance cerca de la media noche 12: 04 hs am. Deben ir a las 15 hs am.
A mmm... cerca de las 12: h.s am, hay que salir antes de esas h.s. a m
A medida que avance cerca de las 12:04 hs. am. Deben ir a las 15 h.s am.
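Beyond the four sample sentences, the list of cases in the question can be turned into a quick check. The sketch below rebuilds that list with an arbitrary stand-in digit (`7` is an assumption for "X") and applies the same compiled pattern and lambda as the code above. The variable names `prefixes`, `units`, `cases` and `results` are illustrative only.

```python
import re

# Same pattern as in the answer above.
pattern = re.compile(r'\d\s*?:?\s*?h?\.?s?\.?\s(a m)')

prefixes = ["", ":", " :"]                     # X, X:, X :
units = ["", " hs", " h.s.", " h.s", " hs."]   # no unit, or one of the "hs" spellings

# "7" is an arbitrary stand-in for the numeric value X.
cases = [f"7{p}{u} a m" for p in prefixes for u in units]

# Drop the trailing "a m" of each match and append "am" instead.
results = {c: pattern.sub(lambda m: m.group(0)[:-3] + "am", c) for c in cases}

for before, after in results.items():
    print(f"{before!r} -> {after!r}")
```

Each printed pair can be compared by eye against the list in the question; a string such as `hs a m` with no preceding digit is not matched and stays unchanged.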