Find whether a string starts and ends with the same word

I am trying to check whether a string starts and ends with the same word, e.g. earth.

import re

s = raw_input()
m = re.search(r"^(earth).*(earth)$", s)
if m is not None:
    print "found"

My problem is the case where the string consists of only one word, e.g. earth. The pattern above needs two separate occurrences of earth, so it does not match.

At present I have hard-coded this case with:

if m is not None or s=='earth':
    print "found"

Is there any other way to do this?

EDIT:

Words in a string are separated by spaces. I am looking for a regex solution.

Some examples:

"earth is earth", "earth" --> valid

"earthearth", "eartheeearth", "earth earth mars" --> invalid

Use the str.startswith and str.endswith methods instead.

>>> 'earth'.startswith('earth')
True
>>> 'earth'.endswith('earth')
True

You can simply combine them into a single function:

def startsandendswith(main_str, check_str):
    return main_str.startswith(check_str) and main_str.endswith(check_str)

And now we can call it:

>>> startsandendswith('earth', 'earth')
True

If, however, you need to match whole words rather than parts of a word, it might be simpler to split the string and then check whether the first and last words are the string you want to check for:

def startsandendswith(main_str, check_str):
    if not main_str:  # guard against empty strings
        return False
    words = main_str.split(' ')  # use main_str.split() to split on any whitespace
    return words[0] == words[-1] == check_str

Running it:

>>> startsandendswith('earth', 'earth')
True
>>> startsandendswith('earth is earth', 'earth')
True
>>> startsandendswith('earthis earth', 'earth')
False

You can use a backreference within the regex:

^(\w+\b)(.*\b\1$|$)

This matches a string only if it either

  • starts and ends with the same word, or
  • consists of a single word
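
As a rough sketch, the pattern above could be applied with Python's re module like this. The helper name same_first_last_word is hypothetical and not part of the answer. Note that the pattern accepts any repeated word, not only earth, and that it treats a run such as earthearth as one single word, so an extra check against the target word may still be needed.

```python
import re

# Pattern from the answer above: a leading word, then either the same word
# at the very end (preceded by a word boundary) or nothing else at all.
PATTERN = re.compile(r"^(\w+\b)(.*\b\1$|$)")

def same_first_last_word(s):
    """Hypothetical helper: True if s is one word or starts and ends with the same word."""
    return PATTERN.search(s) is not None

for text in ["earth is earth", "earth", "earth earth mars", "earthis earth"]:
    print(repr(text), same_first_last_word(text))
```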

You can use str.startswith and str.endswith :

>>> strs = "earthfooearth"
>>> strs.startswith('earth') and strs.endswith("earth")
True
>>> strs = "earth"
>>> strs.startswith('earth') and strs.endswith("earth")
True

Update:

If the words are separated by whitespace and the start and end word is not known in advance, use str.split and str.rsplit:

>>> strs = "foo bar foo"
>>> strs.split(None, 1)[0] == strs.rsplit(None, 1)[-1]
True
# single word
>>> strs = "foo"
>>> strs.split(None, 1)[0] == strs.rsplit(None, 1)[-1]
True
>>> strs = "foo bar ffoo"
>>> strs.split(None, 1)[0] == strs.rsplit(None, 1)[-1]
False

Here:

X = words.split()
X[:1] == X[-1:]

The slicing makes it work for empty strings too, and extends nicely to any number of words. If words cannot be empty, use

X[0] == X[-1]
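
A minimal sketch of the slicing variant wrapped in a function, assuming whitespace-separated words. The name first_and_last_match is hypothetical. One thing to be aware of: with slicing, an empty string compares two empty lists and so returns True, which may or may not be what you want.

```python
def first_and_last_match(words):
    """Hypothetical helper: compare the first and last whitespace-separated words."""
    # split() with no arguments splits on any run of whitespace;
    # an empty string yields an empty list.
    parts = words.split()
    # Slices never raise IndexError: for an empty list both sides are [].
    return parts[:1] == parts[-1:]

for text in ["foo bar foo", "foo", "foo bar ffoo", ""]:
    print(repr(text), first_and_last_match(text))
```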

Well, if you absolutely want regex, you can make use of lookarounds, since they don't consume characters.

>>> import re
>>> s1 = 'earth is earth'
>>> s2 = 'earth'
>>> m = re.search(r"^(?=(earth)).*(earth)$", s1)
>>> m.group(1)
'earth'
>>> m.group(2)
'earth'
>>> m = re.search(r"^(?=(earth)).*(earth)$", s2)
>>> m.group(1)
'earth'
>>> m.group(2)
'earth'

For any string, you could perhaps use this:

^(?=([A-Za-z]+)).*(\1)$

I'm assuming that words consist only of alphabetic characters. If by words you mean runs of non-space characters, then you may use \S instead of [A-Za-z].

EDIT: Okay, it seems there's more to it. What I think might suit is:

^(?=(earth\b)).*((?:^|\s)\1)$

for the word earth. For any word stored in a variable named word:

>>> word = 'earth' # Makes it so you can change it anytime
>>> # raw strings keep \b and \1 as regex syntax, not backspace and \x01
>>> pattern = re.compile(r'^(?=(' + word + r'\b)).*((?:^|\s)\1)$')
>>> m = pattern.search(s)

Accepts:

earth is earth
earth

Rejects:

earthearth
eartheearth
earthis earth

After that, extract the captured groups, or simply check whether the search returned a match at all.

The bit I added is (?:^|\s), which requires the final occurrence of the word to be either at the very start of the string (the word is the only one in the 'sentence') or preceded by whitespace (the word is the last one in a longer sentence).
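
To see how this pattern behaves on the examples from the question, here is a small sketch. The function name starts_and_ends_with_word is hypothetical, and the use of re.escape is an addition that is not in the answer above; it only matters if word could contain regex metacharacters.

```python
import re

def starts_and_ends_with_word(s, word):
    """Hypothetical helper: True if s starts and ends with word as a whole word."""
    # re.escape is an assumption added here, so that characters such as '.'
    # in word are matched literally.
    pattern = re.compile(
        r'^(?=(' + re.escape(word) + r'\b)).*((?:^|\s)\1)$'
    )
    return pattern.search(s) is not None

valid = ["earth is earth", "earth"]
invalid = ["earthearth", "eartheeearth", "earth earth mars"]
for text in valid + invalid:
    print(repr(text), starts_and_ends_with_word(text, "earth"))
```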
