
PHP Regex detect repeated character in a word

(preg_match('/(.)\1{3}/', $repeater))

I am trying to create a regular expression which will detect a word that repeats a character 3 or more times throughout the word. I have tried this numerous ways and I can't seem to get the correct output.

If you don't need letters to be contiguous, you can do it with this pattern:

\b\w*?(\w)\w*?\1\w*?\1\w*

otherwise this one should suffice:

\b\w*?(\w)\1{2}\w*
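
As a quick check, both patterns can be tried with preg_match. This is only a sketch: the helper names and the sample words are made up for illustration, and note that \w also matches digits and underscores.

```php
<?php
// Hypothetical helpers for illustration; the patterns are the two given above.

// True if some word character occurs 3+ times anywhere within one word.
function hasTripleAnywhere(string $text): bool
{
    return preg_match('/\b\w*?(\w)\w*?\1\w*?\1\w*/', $text) === 1;
}

// True if some word character occurs 3+ times in a row.
function hasTripleContiguous(string $text): bool
{
    return preg_match('/\b\w*?(\w)\1{2}\w*/', $text) === 1;
}

var_dump(hasTripleAnywhere('banana'));    // bool(true): three a's, not adjacent
var_dump(hasTripleContiguous('banana'));  // bool(false)
var_dump(hasTripleContiguous('helllo'));  // bool(true): "lll"
```

Comparing the result with === 1 keeps a preg_match error (which returns false) from being mistaken for a match.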

Try this regex instead

(preg_match('/(.)\1{2,}/', $repeater))

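This matches a character repeated 3 or more times in a row: the group captures the first occurrence and \1{2,} requires at least two more (your \1{3} required four in total). See the example here: http://regexr.com/3fk80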
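
To see how the quantifier counts, here is a small sketch comparing the pattern from the question with the corrected one. The sample strings and the $results array are made up for illustration.

```php
<?php
// Sample strings are made up for illustration.
$samples = ['aab', 'aaab', 'aaaab'];
$results = [];

foreach ($samples as $s) {
    // Captured character + exactly 3 repeats = 4 in a row.
    $four = preg_match('/(.)\1{3}/', $s);
    // Captured character + 2 or more repeats = 3 or more in a row.
    $three = preg_match('/(.)\1{2,}/', $s);

    $results[$s] = [$four, $three];
    printf("%-6s four-in-a-row: %d  three-or-more: %d\n", $s, $four, $three);
}
```

Only 'aaaab' satisfies the original pattern, while both 'aaab' and 'aaaab' satisfy the corrected one.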

Strictly speaking, expressions that include backreferences such as \1, \2, ... are not regular expressions in the mathematical sense, and the engine that matches them is not efficient: it has to remember the text captured by the group in order to match it again later, and on failure it has to backtrack over the length of the matched group.

The canonical way to express a true regular expression that accepts word characters repeated three or more times is

(A{3,}|B{3,}|C{3,}|...|Z{3,}|a{3,}|b{3,}|...|z{3,})

and the {3,} operator does not distribute over the alternation, so it cannot be factored out and applied to a single group the way you did in your question.
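
Writing out all 52 alternatives by hand is tedious, so one option is to generate them. The sketch below is just one way this might be done in PHP (7.4+ for the arrow function; the variable names are made up), and it also shows why the quantifier cannot simply be moved outside the alternation.

```php
<?php
// Build A{3,}|B{3,}|...|z{3,} from the ASCII letters (illustrative only).
$letters = array_merge(range('A', 'Z'), range('a', 'z'));
$branches = array_map(fn(string $c): string => $c . '{3,}', $letters);
$perLetter = '/(' . implode('|', $branches) . ')/';

// Moving the quantifier outside gives a different language:
// any three letters in a row, not the same letter three times.
$factored = '/([A-Za-z]){3,}/';

var_dump(preg_match($perLetter, 'abc'));    // int(0): no letter is repeated
var_dump(preg_match($factored, 'abc'));     // int(1): three letters in a row
var_dump(preg_match($perLetter, 'baaad'));  // int(1): "aaa"
```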

For the pedantic, the pure regular expression should be:

(AAAA*|BBBB*|CCCC*|...|ZZZZ*|aaaa*|bbbb*|cccc*|...|zzzz*)

Again, if you only need to know whether a match exists, you can use the fact that AAAA* is matched as soon as three As are found, so this regex is also valid:

AAA|BBB|CCC|...|ZZZ|aaa|bbb|ccc|...|zzz

but the first version lets you capture group \1, which delimits the actual matching sequence.
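
The difference between the two pure forms can be sketched as follows. The patterns are generated rather than typed out, and the variable names and the sample string are assumptions made for illustration (PHP 7.4+).

```php
<?php
$letters = array_merge(range('A', 'Z'), range('a', 'z'));

// AAAA*|BBBB*|... wrapped in one group: three copies plus a starred fourth.
$full = '/(' . implode('|', array_map(
    fn(string $c): string => str_repeat($c, 3) . $c . '*',
    $letters
)) . ')/';

// AAA|BBB|... is enough to answer "is there a triple at all?".
$short = '/' . implode('|', array_map(
    fn(string $c): string => str_repeat($c, 3),
    $letters
)) . '/';

preg_match($full, 'xooooy', $m);
echo $m[1], "\n";                          // oooo (the whole run)

preg_match($short, 'xooooy', $m2);
echo $m2[0], "\n";                         // ooo (stops after three)
```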

This approach is longer to write, but a matcher built from it as a finite automaton needs no backtracking at all and visits each character only once, so it can be far more efficient when scanning the data string.

