How to match the first word in a string?
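I want to match the word 'St', 'St.', 'st', or 'st.', but only when it is the first word of the string. For example:
'St. Mary Church Church St.'
- should find only the first 'St.'
'st. Mary Church Church St.'
- should find only 'st.'
'st Mary Church Church St.'
- should find only 'st'
Ultimately I want to replace that first occurrence with "Saint".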
The regular-expression function re.sub lets you limit how many occurrences in the string are replaced.
For example:
>>> import re
>>> s = "St. Mary Church Church St."
>>> # count=1 replaces only the first match; \.? matches an optional literal dot
>>> new_s = re.sub(r'^(St|st)\.?\s', 'Saint ', s, count=1)
>>> new_s
'Saint Mary Church Church St.'
Hope this helps.
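One detail worth checking in patterns like this: an unescaped `.` matches any character, so an alternative written as `St.` also matches a word such as `Stu`. The following is a minimal sketch rather than the answerer's exact code; the function name `expand_saint` is made up for illustration. It escapes the dot and passes `count` by keyword.

```python
import re

# Hypothetical helper name, used only for this illustration.
def expand_saint(text):
    # ^ anchors the match at the start of the string;
    # \.? matches an optional literal dot.
    return re.sub(r'^(?:St|st)\.?\s', 'Saint ', text, count=1)

# For comparison: with an unescaped dot, "St." also matches "Stu".
loose = re.sub(r'^(?:St.|st.|St|st)\s', 'Saint ', 'Stu Mary', count=1)

print(expand_saint('St. Mary Church Church St.'))  # Saint Mary Church Church St.
print(expand_saint('Stu Mary'))                    # Stu Mary (left alone)
print(loose)                                       # Saint Mary (unintended match)
```

Because the pattern is anchored with `^`, it can match at most once anyway, so `count=1` mainly documents the intent.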
You don't need a regular expression for this. Calling the string's split()
method splits it on whitespace and returns a list of the words in the string:
matches = ["St", "St.", "st", "st."]
name = "St. Mary Church Church St."
words = name.split()  # split the string into a list of words
if words and words[0] in matches:  # guard against an empty string
    words[0] = "Saint"  # replace the first word in the list (St.) with Saint
new_name = " ".join(words)  # rebuild the name with single spaces between words
print(new_name)  # Output: Saint Mary Church Church St.
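Note that split() with no arguments collapses runs of whitespace, so rejoining with single spaces can change the spacing in the rest of the string. If that matters, one variant is to split off only the first word. A minimal sketch, assuming a helper named `replace_first_word` (not part of the original answer):

```python
# Hypothetical helper; the name and default arguments are illustrative only.
def replace_first_word(name, matches=("St", "St.", "st", "st."), replacement="Saint"):
    # maxsplit=1 separates the first word from the remainder,
    # which keeps the remainder's internal spacing intact.
    parts = name.split(maxsplit=1)
    if parts and parts[0] in matches:
        parts[0] = replacement
        return " ".join(parts)
    # No match (or empty input): return the string unchanged.
    return name

print(replace_first_word("St. Mary Church Church St."))
```

Leading whitespace and the gap after the first word are still normalized to a single space when a replacement happens.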
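Thanks for asking this question! This is exactly the problem I was trying to solve. I'd like to share another regex trick I found while looking for this answer. You can simply pass the flags
argument to sub
. With re.IGNORECASE, the pattern
argument no longer has to spell out every case variant. This keeps the code more concise and reduces the chance of missing a variant, although it also accepts 'ST' and 'sT'. Cheers!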
import re
s = "St. Mary Church Church St."
# Ignoring case shortens the pattern from above; \.? matches an optional literal dot
new_s = re.sub(r'^st\.?\s', 'Saint ', s, count=1, flags=re.IGNORECASE)
print(new_s)  # Saint Mary Church Church St.
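If the same replacement is applied to many strings, the pattern could be compiled once with the flag. The sketch below is an assumption-laden variant, not the answerer's code: the names `SAINT_PREFIX` and `expand_prefix` are invented, and a lookahead replaces the trailing `\s` so that a string consisting only of 'St.' is handled too.

```python
import re

# Hypothetical names for illustration.
# (?=\s|$) requires whitespace or end of string after the abbreviation
# without consuming it, so the original spacing is preserved.
SAINT_PREFIX = re.compile(r'^st\.?(?=\s|$)', flags=re.IGNORECASE)

def expand_prefix(text):
    return SAINT_PREFIX.sub('Saint', text, count=1)

print(expand_prefix('St. Mary Church Church St.'))
print(expand_prefix('Stone Church'))
```

The lookahead keeps words that merely begin with "st", such as "Stone", from being rewritten.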
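import re
string = "St. Mary Church Church St."
replace = {'St': 'Saint', 'St.': 'Saint', 'st': 'Saint', 'st.': 'Saint'}
# Try longer keys first so that 'St.' wins over 'St'
keys = sorted(replace, key=len, reverse=True)
# Anchor at the start and require whitespace or end of string after the word
pattern = re.compile(r"^(?:%s)(?=\s|$)" % "|".join(re.escape(k) for k in keys))
new_string = pattern.sub(lambda m: replace[m.group(0)], string)
print(new_string)  # Saint Mary Church Church St.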
I think this should work; please check.
Try the regular expression '^\\S+'
, which matches the first run of non-whitespace characters in the string, i.e. the first word.
import re
s = 'st Mary Church Church St.'
m = re.match(r'^\S+', s)
m.group() # 'st'
s = 'st. Mary Church Church St.'
m = re.match(r'^\S+', s)
m.group() # 'st.'
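To go from matching to replacing, the matched word can be checked against the accepted abbreviations and the rest of the string reattached by slicing. A minimal sketch, with `ABBREVIATIONS` and `replace_leading_abbreviation` as made-up names:

```python
import re

# Hypothetical names for illustration.
ABBREVIATIONS = {"St", "St.", "st", "st."}

def replace_leading_abbreviation(s):
    # re.match only matches at the beginning of the string,
    # so \S+ captures the first word (if the string starts with one).
    m = re.match(r'\S+', s)
    if m and m.group() in ABBREVIATIONS:
        # m.end() is the index just past the first word;
        # slicing from there keeps the remainder exactly as it was.
        return "Saint" + s[m.end():]
    return s

print(replace_leading_abbreviation('st Mary Church Church St.'))
```

A string that begins with whitespace is returned unchanged here, because `\S+` cannot match at position 0.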
Python 3.10 introduced structural pattern matching (the match/case
statement), which also fits this use case:
s = "St. Mary Church Church St."
words = s.split()
match words:
    case ["St" | "St." | "st" | "st.", *rest]:
        print("Found st at the start")
        words[0] = "Saint"
    case _:
        print("didn't find st at the start")
print(' '.join(words))
This prints:
Found st at the start
Saint Mary Church Church St.
With s = "Mary Church Church St."
the output is:
didn't find st at the start
Mary Church Church St.