I am trying to find words like the following using regex, but I cannot figure out how to detect a letter that is immediately followed by the same letter.
For example:
text = ' I am sooo hungryyyy....Grrrh ...... helppp meeeeee '
pattern = re.compile(r"(.)\1{1,}", re.DOTALL)
This pattern is not very helpful, and I don't know why. I want a regex that matches all words like sooo, hungryyyy, Grrrh and ...., that is, any word in which a character is repeated consecutively at least 2 times.
If you want to match runs of non-whitespace that contain a consecutively repeated character, you could do:
>>> import re
>>> text = 'I am sooo hungryyyy....Grrrh ...... helppp meeeeee'
>>> matches = re.findall(r'(\S*?(.)\2+\S*?)', text)
>>> [x[0] for x in matches]
['sooo', 'hungryyyy', '....', 'Grrr', '......', 'helppp', 'meeeeee']
You wrote: "That means, if a letter is repeating simultaneously or next to each other at least 2 times ..."
However, if you only want to match word characters, the pattern changes slightly:
>>> matches = re.findall(r'(\w*(\w)\2\w*)', text)
>>> [x[0] for x in matches]
['sooo', 'hungryyyy', 'Grrrh', 'helppp', 'meeeeee']
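As for why the pattern in the question looked unhelpful, a likely reason is that it only covers the repeated run itself, and re.findall returns just the capture group when a pattern contains exactly one group. The sketch below is a minimal illustration of that behaviour using the question's own text and pattern; the variable names group_only and runs are made up for this example.

```python
import re

text = ' I am sooo hungryyyy....Grrrh ...... helppp meeeeee '
pattern = re.compile(r"(.)\1{1,}", re.DOTALL)

# With a single capture group, findall returns only that group:
# the repeated character, not the run and not the word.
group_only = pattern.findall(text)

# finditer exposes the whole match via group(0), which is the run of
# repeated characters, but still not the surrounding word.
runs = [m.group(0) for m in pattern.finditer(text)]

print(group_only)  # ['o', 'y', '.', 'r', '.', 'p', 'e']
print(runs)        # ['ooo', 'yyyy', '....', 'rrr', '......', 'ppp', 'eeeeee']
```

To get the whole word, the pattern has to consume the characters around the run as well, which is what the \S*? and \w* parts in the patterns above do.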
import re
text = ' I am sooo hungryyyy....Grrrh ...... helppp meeeeee '
for p in re.findall(r'(\w*(\w)\2\w*)', text):
    print(p)  # each item is a (whole word, repeated character) tuple
Gives:
('sooo', 'o')
('hungryyyy', 'y')
('Grrrh', 'r')
('helppp', 'p')
('meeeeee', 'e')
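If only the words are wanted, one way to avoid unpacking tuples is re.finditer with group(0), so the outer capture group is no longer needed. This is a sketch under that assumption, not the only approach; the helper name words_with_repeats is hypothetical.

```python
import re

def words_with_repeats(text):
    """Return words containing a word character repeated at least twice in a row."""
    # Only one group is needed now, so the backreference is \1.
    pattern = re.compile(r'\w*(\w)\1\w*')
    # group(0) is the entire match, i.e. the whole word.
    return [m.group(0) for m in pattern.finditer(text)]

text = ' I am sooo hungryyyy....Grrrh ...... helppp meeeeee '
print(words_with_repeats(text))
# ['sooo', 'hungryyyy', 'Grrrh', 'helppp', 'meeeeee']
```

Note that \w also matches digits and underscores, and the backreference is case-sensitive, so a word like 'Oops' is not matched unless re.IGNORECASE is passed.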