Regex to find words containing consecutive repeats of the same character

I am trying to find words like the ones below using a regex, but I cannot work out how to detect a letter that is immediately followed by the same letter.

For example:

import re

text = ' I am sooo hungryyyy....Grrrh ...... helppp meeeeee '
pattern = re.compile(r"(.)\1{1,}", re.DOTALL)

This pattern is not very helpful, and I don't know why. I want a regex that matches whole words like sooo, hungryyyy, Grrrh, and so on. In other words, a word should match if it contains a letter repeated at least twice in a row.
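
For reference, a quick check of what the question's pattern returns may explain why it seems unhelpful. This is only a sketch; the names `chars` and `runs` are introduced here for illustration. Because the pattern has one capturing group, `findall` returns just that group, and even the full match covers only the repeated run, not the surrounding word.

```python
import re

text = ' I am sooo hungryyyy....Grrrh ...... helppp meeeeee '
pattern = re.compile(r"(.)\1{1,}", re.DOTALL)

# With a single capturing group, findall returns only that group:
# one character per repeated run.
chars = pattern.findall(text)

# finditer exposes the full match, which is the run itself, not the word.
runs = [m.group(0) for m in pattern.finditer(text)]

print(chars)  # ['o', 'y', '.', 'r', '.', 'p', 'e']
print(runs)   # ['ooo', 'yyyy', '....', 'rrr', '......', 'ppp', 'eeeeee']
```

So the pattern does find the repeats; it just has nothing in it that consumes the rest of each word.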

If you want to match runs of non-whitespace characters that contain a consecutive repeat, you could do:

>>> import re
>>> text = 'I am sooo hungryyyy....Grrrh ...... helppp meeeeee'
>>> matches = re.findall(r'(\S*?(.)\2+\S*?)', text)
>>> [x[0] for x in matches]
['sooo', 'hungryyyy', '....', 'Grrr', '......', 'helppp', 'meeeeee']
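
Note that 'Grrr' comes back without its trailing 'h'. A small sketch of why, assuming the same sample text: the trailing `\S*?` is lazy, so it stops as soon as the repeat has been matched. Making it greedy keeps the 'h', but it also swallows everything up to the next whitespace, so 'hungryyyy....Grrrh' becomes a single match. The names `lazy` and `greedy` are illustrative only.

```python
import re

text = 'I am sooo hungryyyy....Grrrh ...... helppp meeeeee'

# Lazy tail: the match ends right after the repeated run.
lazy = [m.group(0) for m in re.finditer(r'\S*?(.)\1+\S*?', text)]

# Greedy tail: the match extends to the end of the non-whitespace token.
greedy = [m.group(0) for m in re.finditer(r'\S*?(.)\1+\S*', text)]

print(lazy)    # ['sooo', 'hungryyyy', '....', 'Grrr', '......', 'helppp', 'meeeeee']
print(greedy)  # ['sooo', 'hungryyyy....Grrrh', '......', 'helppp', 'meeeeee']
```

Which behaviour is preferable depends on whether punctuation should split tokens.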

Quoting the question: "That means, if a letter is repeating simultaneously or next to each other at least 2 times ..."

However, if you only want word characters, the pattern changes slightly:

>>> matches = re.findall(r'(\w*(\w)\2\w*)', text)
>>> [x[0] for x in matches]
['sooo', 'hungryyyy', 'Grrrh', 'helppp', 'meeeeee']
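
One caveat worth sketching: `\w` also matches digits and the underscore, so tokens such as '1000' or '__init__' would count as words with a repeat. If only letters should qualify, the class `[^\W\d_]` (word characters minus digits and underscore) is one possible substitute. The sample string and the names `LETTER`, `sample`, `word_chars`, and `letters_only` are assumptions made for this illustration.

```python
import re

# A "letter" here means: a word character that is neither a digit nor '_'.
LETTER = r'[^\W\d_]'

sample = 'room 1000 __init__ sooo'

# \w version: digits and underscores count as repeatable characters.
word_chars = [m.group(0) for m in re.finditer(r'\w*(\w)\1\w*', sample)]

# Letters-only version: the repeat must be a letter, and so must the rest.
letters_only = [
    m.group(0)
    for m in re.finditer(LETTER + '*(' + LETTER + r')\1' + LETTER + '*', sample)
]

print(word_chars)    # ['room', '1000', '__init__', 'sooo']
print(letters_only)  # ['room', 'sooo']
```
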
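
Alternatively, printing each match shows both captured groups:

import re

text = ' I am sooo hungryyyy....Grrrh ...... helppp meeeeee '
# Group 1 is the whole word; group 2 is the repeated character.
for p in re.findall(r'(\w*(\w)\2\w*)', text):
    print(p)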

Gives:

('sooo', 'o')
('hungryyyy', 'y')
('Grrrh', 'r')
('helppp', 'p')
('meeeeee', 'e')
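
The tuples appear because `findall` returns every capturing group when a pattern has more than one. If only the words are needed, iterating with `re.finditer` and taking `group(0)` avoids the outer group and the tuple indexing. The helper below is a hypothetical sketch; `words_with_repeats` and its `min_run` parameter are not from the answers above.

```python
import re


def words_with_repeats(text, min_run=2):
    """Return words containing a character repeated at least min_run times in a row."""
    if min_run < 2:
        raise ValueError("min_run must be at least 2")
    # (\w) captures one character; \1{n,} requires at least n more copies of it.
    pattern = re.compile(r'\w*(\w)\1{%d,}\w*' % (min_run - 1))
    # group(0) is the whole match, so no outer capturing group is needed.
    return [m.group(0) for m in pattern.finditer(text)]


text = ' I am sooo hungryyyy....Grrrh ...... helppp meeeeee '
print(words_with_repeats(text))
# ['sooo', 'hungryyyy', 'Grrrh', 'helppp', 'meeeeee']
print(words_with_repeats('hello sooo', min_run=3))
# ['sooo']
```

Raising `min_run` to 3 skips ordinary doubled letters such as the 'll' in 'hello', which may be closer to what the question is after.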
