Regex to match the beginning of multiple words in a string

Question

In JavaScript, I want to be able to match strings that begin with a certain phrase. However, I want it to match the start of any word in the phrase, not just the beginning of the phrase.

For example:

Phrase: "This is the best"

Need to Match: "th"

Result: Matches Th and th

EDIT: \b works great; however, it raises another issue:

It will also match characters that come after non-ASCII letters. For example, if my string is "Männ" and I search for "n", it will match the n after Mä. Any ideas?

Answer 1

"This is the best moth".match(/\bth/gi);

or with a variable for your string

var string = "This is the best moth";
alert(string.match(/\bth/gi));

\b in a regex is a word boundary, so \bth will only match a th that is at the beginning of a word.

gi is for a global match (look for all occurrences) and case-insensitive matching.

(I threw moth in there as a reminder to check that it is not matched.)
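To make the moth check concrete, here is a small sketch that lists both the matched text and where each match starts. The variable names are only for illustration, and matchAll assumes a reasonably recent engine (ES2020).

```javascript
// Sample phrase from the answer; "moth" contains "th" but not at a word start.
const text = "This is the best moth";

// \b anchors the match to a word boundary; g = all matches, i = ignore case.
const starts = text.match(/\bth/gi);

// matchAll exposes the index of each match, which shows "moth" is skipped.
const positions = [...text.matchAll(/\bth/gi)].map((m) => m.index);

console.log(starts);    // ["Th", "th"]
console.log(positions); // [0, 8]
```

The second match starts at index 8 (the word "the"); nothing is reported inside "moth".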

Edit:

So, the above only returns the part that you match (th). If you want to return the entire words, you have to match the entire word.

This is where things get tricky fast. First, with no HTML entity letters:

string.match(/\bth[^\b]*?\b/gi);

To match the entire word, start at the word boundary \b, grab the th, then consume characters with [^\b] until you reach another word boundary \b. Note that inside a character class \b means the backspace character, not a word boundary, so [^\b] really matches any character except backspace; it is the trailing \b that ends the match. The * means you want 0 or more of the preceding item, and the ? makes the match lazy. In other words, it doesn't expand as far as possible, but stops at the first opportunity, which here is the end of the word.
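As a quick check of the whole-word pattern, the sketch below runs it next to a shorter variant built on \w*. The \w* form is my own suggestion rather than something from the answer; for plain ASCII words the two should return the same thing.

```javascript
const sample = "This is the best moth";

// Pattern from the answer: lazy run of "anything but backspace" up to the next \b.
const viaLazy = sample.match(/\bth[^\b]*?\b/gi);

// Suggested alternative: th followed by any further word characters.
const viaWordChars = sample.match(/\bth\w*/gi);

// Inside a character class, \b is the backspace character (U+0008).
const classIsBackspace = /[\b]/.test("\u0008");

console.log(viaLazy);          // ["This", "the"]
console.log(viaWordChars);     // ["This", "the"]
console.log(classIsBackspace); // true
```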

If you have HTML entity characters like ä, things get complicated really fast, because \b only treats ASCII letters, digits, and the underscore as word characters. You have to use whitespace, or whitespace plus a defined set of characters that may appear at word boundaries.

string.match(/\sth[^\s]*|^th[^\s]*/gi);

Since we're not using word boundaries, we have to take care of the beginning of the string separately (|^).

The above will capture the white space at the beginning of words. Using \b will not capture white space, since \b has no width.
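If the leading white space is unwanted, two options come to mind: trim each result afterwards, or use a zero-width lookbehind so the space is never captured. This is only a sketch; the lookbehind form assumes an engine with ES2018 lookbehind support, and the sample phrase is made up for illustration.

```javascript
// Made-up phrase containing a non-ASCII letter inside a word.
const phrase = "the thäng is this moth";

// Whitespace-based pattern from the answer; matches keep their leading space.
const withSpaces = phrase.match(/\sth[^\s]*|^th[^\s]*/gi);

// Option 1: strip the captured white space after matching.
const trimmed = withSpaces.map((w) => w.trim());

// Option 2 (assumes lookbehind support): (?<!\S) means "not preceded by a
// non-space character", which covers both start of string and after white space.
const viaLookbehind = phrase.match(/(?<!\S)th\S*/gi);

// The "Männ" case from the question: n is never at the start of a word here.
const umlautCase = "Männ".match(/(?<!\S)n/gi);

console.log(withSpaces);    // ["the", " thäng", " this"]
console.log(trimmed);       // ["the", "thäng", "this"]
console.log(viaLookbehind); // ["the", "thäng", "this"]
console.log(umlautCase);    // null
```

Both options define a word start by white space only, so a th that follows punctuation such as an opening parenthesis will not match.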

Answer 2

Use this:

string.match(/^th|\sth/gi);

Examples:

'is this is a string'.match(/^th|\sth/gi);


'the string: This is a string'.match(/^th|\sth/gi);

Results:

[" th"]

["th", " Th"]

Answer 3

var matches = "This is the best".match(/\bth/ig);

returns:

["Th", "th"]

The regular expression means: match "th", ignoring case and globally (meaning, don't stop at just one match), wherever "th" starts at a word boundary, that is, at the start of the string or right after a non-word character such as a space or punctuation.
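The difference between a word boundary and "preceded by a space" shows up once punctuation is involved. A small sketch with a made-up string:

```javascript
// Made-up input where "th" follows a hyphen and an opening parenthesis.
const tricky = "well-thought (the)";

// \b fires after any non-word character, so both occurrences match.
const byBoundary = tricky.match(/\bth/gi);

// The start-or-white-space pattern finds nothing, since neither "th" follows a space.
const byWhitespace = tricky.match(/^th|\sth/gi);

console.log(byBoundary);   // ["th", "th"]
console.log(byWhitespace); // null
```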

Answer 4

Use the g flag in the regex. It stands for "global", and it searches for all matches instead of only the first one.

You should also use the i flag for case-insensitive matching.

You add flags to the end of the regex literal (/<regex>/<flags>) or pass them as the second parameter to new RegExp(pattern, flags).

For instance:

var matches = "This is the best".match(/\bth/gi);

or, using RegExp objects:

var re = new RegExp("\\bth", "gi"); // the backslash is doubled inside a string literal
var matches = "This is the best".match(re); // re.exec() would return only one match per call

EDIT: Use \b in the regex to match the boundary of a word. Note that it does not really match any specific character, but the beginning or end of a word or the string.

4 solutions

Solution 1
23 accepted 2010-08-17 22:31:18

Solution 2
1 2010-08-17 22:24:08

Solution 3
1 2010-08-17 22:25:28

Solution 4
1 2010-08-17 22:30:12
