Set a regex pattern so that, when the pattern matches in a string, part of the match is replaced by another substring

How can I fix this regex so that these input strings produce the outputs shown below?

out = re.sub(r"(hs|h.s|h.s.)a m(\W|\b)", r"\1 am\2", out)
print(repr(out))

Input string examples:

#example 1.1
colloquial_hour = "Cerca de las 2: hs a m, hay que salir antes de esas hs a m"
#example 1.2
colloquial_hour = "A medida que avance cerca de la media noche 12: 04 hs a m. Deben ir a las 15 hs a m."
#example 1.3
colloquial_hour = "A mmm... cerca de las 12: h.s a m, hay que salir antes de esas h.s. a m"
#example 1.4
colloquial_hour = "A medida que avance cerca de las 12:04 hs. a m. Deben ir a las 15 h.s a m."

Correct outputs:

#correct output for example 1.1
"Cerca de las 2: hs am, hay que salir antes de esas hs a m"
#correct output for example 1.2
"A medida que avance cerca de la media noche 12: 04 hs am. Deben ir a las 15 hs am."
#correct output for example 1.3
"A mmm... cerca de las 12: h.s am, hay que salir antes de esas h.s. a m"
#correct output for example 1.4
"A medida que avance cerca de las 12:04 hs. am. Deben ir a las 15 h.s am."

The logic should be: wherever a numeric value is followed by "a m" (optionally with a colon and/or an "hs" abbreviation in between), replace that "a m" substring with "am" in the original string. An "a m" that is not preceded by a numeric value must be left unchanged.

These are all the possible cases in which the substring "a m" has to be replaced with "am":

X a m
X: a m
X: hs a m
X: h.s. a m
X: h.s a m
X: hs. a m
X:  a m
X : hs a m
X  : h.s. a m
X : h.s a m
X  : hs. a m
X hs a m
X h.s. a m
X h.s a m
X hs. a m

#where "X" is a numerical value ("1", "2", "3", "4", "5", "6", ... )
#in all these cases, in which this pattern is met, "a m" must be replaced by "am"

You could match:

(\d+\s*:?\s*(?:h\.?s\.?)?)\s*a m\b

The pattern matches:

  • ( Capture group 1
    • \d+\s*:?\s* Match 1+ digits and an optional : between optional whitespace chars
    • (?:h\.?s\.?)? Optionally match hs, h.s, hs. or h.s.
  • ) Close group 1
  • \s*a m\b Match optional whitespace chars, then a m followed by a word boundary

And replace with group 1 followed by a space and am:

\1 am
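
As a minimal sketch of how this pattern and replacement could be applied with re.sub, using two of the example strings from the question (the names PATTERN and fix_am are illustrative and not part of the answer):

```python
import re

# Pattern and replacement taken from the answer above; the names are illustrative.
PATTERN = re.compile(r"(\d+\s*:?\s*(?:h\.?s\.?)?)\s*a m\b")

def fix_am(text: str) -> str:
    # Keep group 1 (digits, optional colon, optional "hs" variant)
    # and rewrite the following "a m" as " am".
    return PATTERN.sub(r"\1 am", text)

print(fix_am("Cerca de las 2: hs a m, hay que salir antes de esas hs a m"))
print(fix_am("A medida que avance cerca de las 12:04 hs. a m. Deben ir a las 15 h.s a m."))
```

The "hs a m" at the end of the first string is left alone because no digit precedes it. Note that when no "hs" part is present, group 1 ends with the whitespace that precedes "a m", so "5 a m" becomes "5  am" with two spaces; whether that matters depends on the use case.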


You can search using this regex:

(\d\W+)(h\.?s\.?\s+)?a\s+m\b

and replace using:

\1\2am


RegEx Details:

  • (\d\W+) : Match a digit followed by 1+ non-word chars in capture group #1
  • (h\.?s\.?\s+)? : Match h followed by s, each with an optional dot after it, then 1+ whitespaces. This optional group is capture group #2
  • a\s+m\b : Match a followed by 1+ whitespaces, then m with a word boundary
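
A minimal Python sketch of this search-and-replace, assuming re.sub is used to apply it (the names PATTERN and join_am are illustrative and not part of the answer):

```python
import re

# Search pattern and replacement taken from the answer above; names are illustrative.
PATTERN = re.compile(r"(\d\W+)(h\.?s\.?\s+)?a\s+m\b")

def join_am(text: str) -> str:
    # Group 2 is optional. On Python 3.5+ an unmatched group is
    # substituted as an empty string, so r"\1\2am" is safe here.
    return PATTERN.sub(r"\1\2am", text)

print(join_am("A mmm... cerca de las 12: h.s a m, hay que salir antes de esas h.s. a m"))
print(join_am("Deben ir a las 15 hs a m."))
```

Because both groups keep their own trailing separators, the replacement does not need to add a space before am.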

My solution uses re.sub with a replacement function:

import re

phrases = ["Cerca de las 2: hs a m, hay que salir antes de esas hs a m",
"A medida que avance cerca de la media noche 12: 04 hs a m. Deben ir a las 15 hs a m.",
"A mmm... cerca de las 12: h.s a m, hay que salir antes de esas h.s. a m",
"A medida que avance cerca de las 12:04 hs. a m. Deben ir a las 15 h.s a m."]

pattern = re.compile(r'\d\s*?:?\s*?h?\.?s?\.?\s(a m)')

for phrase in phrases:
    print(pattern.sub(lambda x: x.group(0)[:-3] + "am", phrase))

Output:

Cerca de las 2: hs am, hay que salir antes de esas hs a m
A medida que avance cerca de la media noche 12: 04 hs am. Deben ir a las 15 hs am.
A mmm... cerca de las 12: h.s am, hay que salir antes de esas h.s. a m
A medida que avance cerca de las 12:04 hs. am. Deben ir a las 15 h.s am.
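
One way to check this pattern against the case list from the question is to put a sample digit in place of X and run each case through the same substitution. This is only a sketch: the digit 7, the selection of cases, and the helper name join_am_lambda are arbitrary choices made for illustration.

```python
import re

# Same pattern as in the answer above.
pattern = re.compile(r'\d\s*?:?\s*?h?\.?s?\.?\s(a m)')

# Cases from the question's list, with "7" standing in for X (an arbitrary choice).
cases = [
    "7 a m", "7: a m", "7: hs a m", "7: h.s. a m", "7: h.s a m", "7: hs. a m",
    "7 : hs a m", "7 hs a m", "7 h.s. a m", "7 h.s a m", "7 hs. a m",
]

def join_am_lambda(phrase):
    # Drop the trailing "a m" (3 characters) from the whole match and append "am".
    return pattern.sub(lambda m: m.group(0)[:-3] + "am", phrase)

for case in cases:
    # Show each case next to its substituted form.
    print(f"{case!r} -> {join_am_lambda(case)!r}")
```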
