How to match the first word in a string?

Question

I want to match the word 'St' or 'St.' or 'st' or 'st.' BUT only as the first word of a string. For example:

'St. Mary Church Church St.' - should find ONLY the first 'St.'
'st. Mary Church Church St.' - should find ONLY 'st.'
'st Mary Church Church St.' - should find ONLY 'st'

I want to eventually replace the first occurrence with 'Saint'.

Answer 1

Regex sub allows you to define the number of occurrences to replace in a string.

i.e.:

>>> import re
>>> s = "St. Mary Church Church St."
>>> # The dots are escaped so they match a literal period rather than any character.
>>> # count=1 defines the number of occurrences to be replaced: here, the first one only.
>>> new_s = re.sub(r'^(St\.|st\.|St|st)\s', 'Saint ', s, count=1)
>>> new_s
'Saint Mary Church Church St.'

Hope it helps.
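In a regex an unescaped `.` matches any character, so it is worth writing `\.` wherever a literal period is meant. The following is a minimal sketch of the same idea wrapped in a function; the name `replace_leading_st` and the use of a lookahead are assumptions made for illustration, not part of the answer. The lookahead checks for whitespace or end of string without consuming it, so a string that consists only of 'St.' is handled as well.

```python
import re

# Hypothetical helper; one possible way to write the pattern.
# ^         start of the string
# [Ss]t     'St' or 'st'
# \.?       an optional literal period
# (?=\s|$)  must be followed by whitespace or the end of the string
LEADING_ST = re.compile(r'^[Ss]t\.?(?=\s|$)')

def replace_leading_st(text):
    """Replace a leading 'St', 'St.', 'st' or 'st.' with 'Saint'."""
    return LEADING_ST.sub('Saint', text, count=1)

print(replace_leading_st("St. Mary Church Church St."))  # Saint Mary Church Church St.
print(replace_leading_st("st Mary Church Church St."))   # Saint Mary Church Church St.
print(replace_leading_st("Stan Mary Church"))            # Stan Mary Church (unchanged)
```

Because the pattern is anchored with `^`, the trailing 'St.' is never touched.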

Answer 2

You don't need to use a regex for this, just use the split() method on your string to split it by whitespace. This will return a list of every word in your string:

matches = ["St", "St.", "st", "st."]
name = "St. Mary Church Church St."
words = name.split()   # split the string into a list of words
if words and words[0] in matches:   # "words and" guards against an empty string
    words[0] = "Saint"   # replace the first word in the list (St.) with Saint
new_name = " ".join(words)   # rebuild the name from the words, separated by single spaces
print(new_name)   # Output: Saint Mary Church Church St.
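One thing to keep in mind is that splitting on every word and joining again collapses any run of whitespace into a single space. If the rest of the string should stay as it is, one option is to split off only the first word with `maxsplit=1`. This is a sketch under that assumption; `replace_first_word` and `MATCHES` are hypothetical names.

```python
MATCHES = {"St", "St.", "st", "st."}  # the four forms from the question

def replace_first_word(name, replacement="Saint"):
    # Split off only the first word; the remainder keeps its internal spacing
    parts = name.split(maxsplit=1)
    if parts and parts[0] in MATCHES:
        parts[0] = replacement
        return " ".join(parts)
    return name  # nothing to replace: return the input unchanged

print(replace_first_word("St. Mary Church Church St."))  # Saint Mary Church Church St.
print(replace_first_word("Mary Church Church St."))      # Mary Church Church St.
```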

Answer 3

Thanks for the question! This is exactly what I was looking for to solve my issue. I wanted to share another regex trick I found while hunting around for this answer. You can simply pass the flags parameter to the sub function. This lets you reduce the number of alternatives you need to put in the pattern parameter, which makes the code a little cleaner and reduces the chance of missing a variant. Cheers!

import re
s = "St. Mary Church Church St."
# You can shorten the pattern from above slightly by ignoring the case
new_s = re.sub(r'^(st\.|st)\s', 'Saint ', s, count=1, flags=re.IGNORECASE)
print(new_s)   # Saint Mary Church Church St.
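Note that ignoring case also accepts spellings the question did not list, such as 'ST.' or 'sT'. Whether that is wanted depends on the data. The short sketch below shows the effect; the sample strings and the lookahead-based pattern are illustrative assumptions.

```python
import re

# Case-insensitive: matches st, St, sT, ST, each with an optional period,
# but only at the start and only when followed by whitespace or end of string
pattern = re.compile(r'^st\.?(?=\s|$)', flags=re.IGNORECASE)

samples = ["St. Mary", "st Mary", "ST. Mary", "sT Mary", "Stone Mary"]
results = [pattern.sub('Saint', s, count=1) for s in samples]

for before, after in zip(samples, results):
    print(repr(before), '->', repr(after))
```

The first four samples become 'Saint Mary', while 'Stone Mary' is left alone because 'St' is not a whole word there.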

Answer 4

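import re

string = "St. Mary Church Church St."

replace = {'St': 'Saint', 'St.': 'Saint', 'st': 'Saint', 'st.': 'Saint'}
# Try longer keys first so 'St.' wins over 'St'; escape them so '.' is literal
keys = sorted(replace, key=len, reverse=True)
# Anchor at the start and require whitespace or end of string after the word
pattern = re.compile(r"^(?:" + "|".join(re.escape(k) for k in keys) + r")(?=\s|$)")
# Look up the replacement for whichever key matched
new_string = pattern.sub(lambda m: replace[m.group(0)], string)
print(new_string)   # Saint Mary Church Church St.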

This should work I guess, please check.

Answer 5

Try using the regex r'^\S+' to match the first run of non-space characters (that is, the first word) in your string.

import re 

s = 'st Mary Church Church St.'
m = re.match(r'^\S+', s)
m.group()    # 'st'

s = 'st. Mary Church Church St.'
m = re.match(r'^\S+', s)
m.group()    # 'st.'
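The snippet above only finds the first word. To get from there to the replacement the question asks for, one possible sketch is to check the matched word against the four accepted forms and splice 'Saint' in front of the rest of the string. `expand_first_word` and `ABBREVIATIONS` are hypothetical names used for illustration.

```python
import re

ABBREVIATIONS = {"St", "St.", "st", "st."}  # the four forms from the question

def expand_first_word(s):
    # re.match only matches at the start of the string, so no '^' is needed
    m = re.match(r'\S+', s)
    if m and m.group() in ABBREVIATIONS:
        # Keep everything after the matched word exactly as it was
        return "Saint" + s[m.end():]
    return s

print(expand_first_word("st Mary Church Church St."))   # Saint Mary Church Church St.
print(expand_first_word("st. Mary Church Church St."))  # Saint Mary Church Church St.
```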

Answer 6

Python 3.10 introduced a new Structural Pattern Matching feature (otherwise known as match/case) which can fit this use-case:

s = "St. Mary Church Church St."

words = s.split()
match words:
    case ["St" | "St." | "st" | "st.", *rest]:
        print("Found st at the start")
        words[0] = "Saint"
    case _:
        print("didn't find st at the start")

print(' '.join(words))

Will give:

Found st at the start
Saint Mary Church Church St.

While using s = "Mary Church Church St." will give:

didn't find st at the start
Mary Church Church St.

6 answers

Answer 1
3 2016-08-28 16:21:59

Answer 2
2 2016-08-28 16:11:05

Answer 3
1 2019-06-07 20:19:40

Answer 4
0 2016-08-28 16:11:27

Answer 5
0 2016-08-28 16:31:26

Answer 6
0 2023-01-30 10:01:56
