
Matching more than one word

I have the following phrases, and I'd like to match the part before the colon in each:

"De la Sota: Hello" -> "De la Sota"

"Guini: Hello" -> "Guini"

"Prat Gay: Hello" -> "Prat Gay"

I'm using r"(\w+):" but it only matches the last word before the colon.

Simply use this pattern:

/^(.*):/gm

Now $1 contains what you need.


Note that I'm fairly sure there is a better approach than regex for this, but I'm not a Python expert.
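For reference, the pattern above could be written in Python roughly as follows. This is only a sketch: the m flag corresponds to re.MULTILINE, the g flag has no direct counterpart and is approximated here with re.findall, and the variable names are illustrative.

```python
import re

text = "De la Sota: Hello\nGuini: Hello\nPrat Gay: Hello"

# re.MULTILINE makes ^ match at the start of every line, like the /m flag.
# re.findall returns every match, which plays the role of the /g flag.
names = re.findall(r"^(.*):", text, flags=re.MULTILINE)
print(names)  # ['De la Sota', 'Guini', 'Prat Gay']
```

Because .* is greedy, a line containing several colons would be captured up to the last one; using .*? instead would stop at the first.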

str.split(":")[0] should work, where str is the string you'd like to split.

>>> str = "De la Sota: Hello" 
>>> str.split(":")[0]
'De la Sota'

This works by splitting the string into a list, where the parameter is the delimiter. If you specify the colon as the delimiter, it will split the string into a list of individual phrases separated by the colon. The [0] just refers to the first value of the list, which is what you wanted.
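To make the intermediate list visible, here is a small sketch of the same idea. The variable name phrase is my own choice (it avoids shadowing the built-in str), and the maxsplit argument and str.partition are standard alternatives that were not part of the original answer.

```python
phrase = "De la Sota: Hello"

# split returns a list of the pieces found between the delimiters.
parts = phrase.split(":")
print(parts)     # ['De la Sota', ' Hello']
print(parts[0])  # De la Sota

# With more than one colon, maxsplit=1 stops after the first split.
limited = "Prat Gay: Hello: again".split(":", 1)
print(limited)   # ['Prat Gay', ' Hello: again']

# str.partition returns (head, separator, tail) in a single call.
head, sep, tail = "Guini: Hello".partition(":")
print(head)      # Guini
```

If the string contains no colon at all, both split(":")[0] and partition(":")[0] simply return the whole string.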

Change \w+ to .+ or .*:

import re

input_text = ''' De la Sota: Hello

Guini: Hello

Prat Gay: Hello'''

# .+ also matches spaces, so the whole name before the colon is captured
print(re.findall(r'(.+):', input_text))

If the input looks exactly like the examples in the question, you can use a negated character set to exclude the colon, whitespace (\s, or \t if it is a tab), and the letters of "Hello" (written as Helo, because a set only needs each character once). As for the names, some surnames contain a hyphen, and a name needs more than one occurrence of a word character (\w), hence [-\w ]+:

import re
string = ''' De la Sota: Hello

Guini: Hello

Prat Gay: Hello
'''
print(re.findall(r'[-\w ]+[^:\sHelo]', string))

gives the following answer:

[' De la Sota', 'Guini', 'Prat Gay']
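One caveat worth checking: because [^:\sHelo] rejects the individual letters H, e, l and o, a name that ends in one of those letters loses its final character. The sketch below uses a made-up name, 'Carrio', which is not in the question, and compares it with a pattern anchored on the colon instead (my own variation, not the answer's pattern).

```python
import re

# 'Carrio' is a hypothetical input: it ends in 'o', which the negated set rejects.
truncated = re.findall(r'[-\w ]+[^:\sHelo]', 'Carrio: Hello')
print(truncated)  # ['Carri']

# Anchoring on the colon avoids depending on the letters of the greeting.
anchored = re.findall(r'([-\w ]+):', 'Carrio: Hello')
print(anchored)   # ['Carrio']
```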

Use re.findall rather than re.match: findall scans the entire string and returns every match, whereas match only tries to match at the very beginning of the string, so it would find at most the first name.
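As a quick illustration of the difference between re.match and re.findall, here is a minimal sketch. The sample text and the pattern ([-\w ]+): are assumptions chosen for the demonstration.

```python
import re

text = "De la Sota: Hello\nGuini: Hello\nPrat Gay: Hello"
pattern = r"([-\w ]+):"

# re.match only attempts a match at position 0 of the string.
first = re.match(pattern, text)
print(first.group(1))  # De la Sota

# re.findall scans the whole string and collects every match.
everything = re.findall(pattern, text)
print(everything)      # ['De la Sota', 'Guini', 'Prat Gay']
```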
