Regular expression: match a word of certain length which starts with certain letters

Question

I need a regex that matches a 7-letter word starting with 'st'. For example, out of the following words it should match only 'startin': start startin starting

Answer 1

General tips:

Include the starting symbols in the regex directly, e.g. st. If the starting characters are special in regex syntax (dots, parentheses, etc.), you need to escape them with a backslash, but that is not needed in your case.
After the starting symbols, add a character class for the remaining characters of your "word". To allow any character (except a newline, by default), use a dot: . To allow any non-whitespace character, use \S. To allow only (Unicode) letters, use \p{L}. To allow only non-accented Latin letters, use [A-Za-z]. There are many possibilities here.
Finally, add a repetition quantifier for the character class from the previous step. In your case, you need exactly 5 characters after st, so the quantifier is {5}.
If you want only the whole string to match, use \A at the beginning and \z at the end of your regex. Alternatively, put \b at the beginning and end of your regex to match at so-called word boundaries (positions between a word character and a non-word character, including the start and end of the string). The most powerful alternative (with full control) is the so-called lookahead; I'll leave it out here for the sake of simplicity.

See this tutorial for details. You can look up the specific keywords I've mentioned, e.g. repetition, character class, Unicode, lookahead, etc.
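The tips above can be put together as in the following minimal sketch. It assumes PHP's preg functions with the u modifier (so that \p{L} works on UTF-8 text); the variable names $prefix, $restLength and $text are illustrative and not part of the original answer.

```php
<?php
// Sketch only: $prefix, $restLength and $text are illustrative names.
$prefix     = 'st';  // literal starting letters
$restLength = 5;     // 7 letters in total minus the 2-letter prefix
$text       = 'start startin starting';

// preg_quote() escapes the prefix in case it contains regex metacharacters.
$quoted = preg_quote($prefix, '/');

// Word-boundary form: finds matching words anywhere in the text.
$wordPattern = '/\b' . $quoted . '\p{L}{' . $restLength . '}\b/u';
preg_match_all($wordPattern, $text, $matches);
print_r($matches[0]); // only "startin" is matched

// Anchored form: the whole string must be exactly one such word.
$wholePattern = '/\A' . $quoted . '\p{L}{' . $restLength . '}\z/u';
var_dump(preg_match($wholePattern, 'startin'));  // int(1)
var_dump(preg_match($wholePattern, 'starting')); // int(0)
```

The word-boundary form is the one to use when searching inside longer text; the anchored form suits validating a single input value.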

Answer 2

To match words made of non-accented letters case-insensitively, either use the i modifier or list both the upper- and lower-case form of each starting letter explicitly.

<?php

    $regex = '!\bst[a-z]{5}\b!i';
    $words = "start startin starting station Stalker SHOWER Staples Stiffle Steerin StÄbles'";
    preg_match_all($regex,$words,$matches);
    print_r($matches[0]);
?>

Output

Array
(
    [0] => startin
    [1] => station
    [2] => Stalker
    [3] => Staples
    [4] => Stiffle
    [5] => Steerin
)

Without the i modifier you would have to spell out both cases in the character classes. This produces the same output as above:

$regex = '!\b[Ss][Tt][A-Za-z]{5}\b!';
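As a quick check, the two patterns can be run side by side on the sample string. This is only an illustrative sketch; the variable names $withModifier and $withClasses are made up for the comparison.

```php
<?php
// Illustrative check; $words is the sample string used in this answer.
$words = "start startin starting station Stalker SHOWER Staples Stiffle Steerin StÄbles'";

// Pattern with the i modifier.
preg_match_all('!\bst[a-z]{5}\b!i', $words, $withModifier);

// Pattern with both cases listed explicitly.
preg_match_all('!\b[Ss][Tt][A-Za-z]{5}\b!', $words, $withClasses);

// Both patterns find the same six words.
var_dump($withModifier[0] === $withClasses[0]); // bool(true)
```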

If you also want to match non-ASCII (Unicode) characters, you can do this:

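print "<meta charset=\"utf-8\"><body>";

    // The u modifier makes PCRE treat the pattern and the subject as UTF-8.
    // [^\x{0000}-\x{0080}] matches any code point above U+0080 (code points need the \x{...} syntax).
    $regex = '!\bst([a-z]|[^\x{0000}-\x{0080}]){5}\b!iu';
    $words = "start startin starting station Stalker SHOWER Staples Stiffle Steerin StÄbles'";
    preg_match_all($regex,$words,$matches);
    print_r($matches[0]);

print "</body>";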

Output

Array
(
    [0] => startin
    [1] => station
    [2] => Stalker
    [3] => Staples
    [4] => Stiffle
    [5] => Steerin
    [6] => StÄbles
)

Without the UTF-8 charset declaration, the last entry is displayed garbled, as something like StÃ„bles.

Answer 3

preg_match_all('/\bst\w{5}\b/', 'start startin starting', $arr, PREG_PATTERN_ORDER);

Update: following the comments, the pattern now uses word boundaries at both the start and the end.
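Note that \w matches digits and underscores as well as letters, so this pattern is slightly broader than a letters-only class. The sketch below illustrates the difference on a made-up input string (the input and the variable names are assumptions for illustration, not from the answer).

```php
<?php
// Hypothetical input chosen to show the difference between \w and [a-z].
$text = 'startin st12345 st_a_b_ starting';

// \w accepts letters, digits and underscores.
preg_match_all('/\bst\w{5}\b/', $text, $wordChars);

// [a-z] with the i modifier accepts ASCII letters only.
preg_match_all('/\bst[a-z]{5}\b/i', $text, $lettersOnly);

print_r($wordChars[0]);   // startin, st12345, st_a_b_
print_r($lettersOnly[0]); // startin
```

If the input can contain identifiers or numbers, the letters-only class is the safer choice for matching actual words.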

3 answers

solution1
5 2013-05-06 20:48:43

solution2
1 2013-05-06 21:22:57

solution3
0 2013-05-06 20:26:20
