
Regex to find words having multiple occurrences of same string nearby position

I am trying to find words like the following using a regex, but I cannot work out how to tell a run of the same letter apart from a run of different letters.

For example:

import re
text = ' I am sooo hungryyyy....Grrrh ...... helppp meeeeee '
pattern = re.compile(r"(.)\1{1,}", re.DOTALL)

This pattern is not very helpful, and I don't know why. I want a regex that matches all words like sooo, hungryyyy, Grrrh and so on. That means a letter is repeating next to itself at least 2 times.
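One likely reason the pattern above looks unhelpful: with a single capturing group, `re.findall` returns only the text of that group, so you get the repeated character rather than the word around it. The sketch below reuses the question's `text` and `pattern`; the name `runs` is introduced here purely for illustration.

```python
import re

text = ' I am sooo hungryyyy....Grrrh ...... helppp meeeeee '
pattern = re.compile(r"(.)\1{1,}", re.DOTALL)

# With one capturing group, findall returns only that group's text.
print(pattern.findall(text))
# ['o', 'y', '.', 'r', '.', 'p', 'e']

# finditer exposes the full repeated run, but still not the surrounding word.
runs = [m.group(0) for m in pattern.finditer(text)]
print(runs)
# ['ooo', 'yyyy', '....', 'rrr', '......', 'ppp', 'eeeeee']
```

So the pattern does find the repeats; it simply has nothing in it that extends the match to the rest of the word.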

If you want to match runs of non-whitespace that contain consecutive repeated characters, you could do:

>>> import re
>>> text = 'I am sooo hungryyyy....Grrrh ...... helppp meeeeee'
>>> matches = re.findall(r'(\S*?(.)\2+\S*?)', text)
>>> [x[0] for x in matches]
['sooo', 'hungryyyy', '....', 'Grrr', '......', 'helppp', 'meeeeee']
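As a variation on the snippet above, `re.finditer` lets you drop the outer group and the `x[0]` indexing, since `group(0)` is always the whole match. This is only a sketch: the names `pattern` and `pairs` are assumptions made for illustration, and the backreference becomes `\1` because the outer group is gone.

```python
import re

text = 'I am sooo hungryyyy....Grrrh ...... helppp meeeeee'
pattern = re.compile(r'\S*?(.)\1+\S*?')

# group(0) is the whole match, group(1) is the repeated character.
pairs = [(m.group(0), m.group(1)) for m in pattern.finditer(text)]
print(pairs)
# [('sooo', 'o'), ('hungryyyy', 'y'), ('....', '.'), ('Grrr', 'r'),
#  ('......', '.'), ('helppp', 'p'), ('meeeeee', 'e')]
```

Note that the trailing `\S*?` is lazy and nothing follows it, so it always matches the empty string. That is why `Grrrh` comes back as `Grrr`.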

"That means, if a letter is repeating simultaneously or next to each other at least 2 times ..."

However, if you're looking for word characters only, the pattern changes slightly:

>>> matches = re.findall(r'(\w*(\w)\2\w*)', text)
>>> [x[0] for x in matches]
['sooo', 'hungryyyy', 'Grrrh', 'helppp', 'meeeeee']

import re
text = ' I am sooo hungryyyy....Grrrh ...... helppp meeeeee '
# Each result is a tuple: (whole word, repeated character).
for p in re.findall(r'(\w*(\w)\2\w*)', text):
    print(p)

Gives:

('sooo', 'o')
('hungryyyy', 'y')
('Grrrh', 'r')
('helppp', 'p')
('meeeeee', 'e')
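If the "at least 2 times" threshold needs to vary, the word-character pattern can be wrapped in a small helper. This is a minimal sketch, not part of the answers above: the function name `find_stretched_words` and the parameter `min_run` are hypothetical. Keep in mind that `\w` also matches digits and underscores, not only letters.

```python
import re

def find_stretched_words(text, min_run=2):
    """Return words with a word character repeated at least min_run times in a row."""
    if min_run < 2:
        raise ValueError("min_run must be at least 2")
    # (\w) captures one character; \1{n,} requires at least n more copies after it.
    pattern = re.compile(r'\w*(\w)\1{%d,}\w*' % (min_run - 1))
    return [m.group(0) for m in pattern.finditer(text)]

text = ' I am sooo hungryyyy....Grrrh ...... helppp meeeeee '
print(find_stretched_words(text))
# ['sooo', 'hungryyyy', 'Grrrh', 'helppp', 'meeeeee']
print(find_stretched_words(text, min_run=4))
# ['hungryyyy', 'meeeeee']
```

With the default `min_run=2` the helper returns the same words as the `findall` call above, while a higher value keeps only the longer runs.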


 