Regex match for beginning of multiple words in string
In JavaScript, I want to be able to match strings that begin with a certain phrase. However, I want it to match the start of any word in the phrase, not just the beginning of the phrase.
For example:
Phrase: "This is the best"
Need to match: "th"
Result: matches Th and th
EDIT: \b works great, but it raises another issue: it also matches characters that follow non-ASCII letters. For example, if my string is "Männ" and I search for "n", it matches the n after Mä. Any ideas?
"This is the best moth".match(/\bth/gi);
or, with a variable for your string:
var string = "This is the best moth";
alert(string.match(/\bth/gi));
\b in a regex is a word boundary, so \bth will only match a th that is at the beginning of a word.
gi is for a global match (look for all occurrences) and case-insensitive matching.
(I threw moth in there as a reminder to check that it is not matched.)
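When the search text comes from a variable rather than a literal, the pattern has to be built with the RegExp constructor. The following is a minimal sketch under that assumption; escapeRegExp and matchWordStarts are hypothetical helper names, not part of any library, and the \b prefix only behaves as described when the search text starts with a letter, digit, or underscore.

```javascript
// Hypothetical helper: escape regex metacharacters so the prefix is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: return every occurrence of `prefix` at the start of a word.
function matchWordStarts(phrase, prefix) {
  const re = new RegExp("\\b" + escapeRegExp(prefix), "gi");
  return phrase.match(re) || []; // match() returns null when nothing matches
}

const found = matchWordStarts("This is the best moth", "th");
// found is ["Th", "th"]; the "th" inside "moth" is skipped
```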
Edit:
So, the above only returns the part that you match (th). If you want to return the entire words, you have to match the entire word. This is where things get tricky fast. First, without any HTML entity letters:
string.match(/\bth[^\b]*?\b/gi);
To match the entire word, start at the word boundary \b, grab the th, then consume further characters with [^\b]*? until you reach the next word boundary \b. Note that inside a character class \b stands for the backspace character rather than a word boundary, so [^\b] simply matches any character except a backspace. The * means zero or more of the preceding item, and the ? after it makes the match lazy. In other words, it doesn't expand as far as possible but stops at the first opportunity, which here is the end of the word.
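As a rough comparison, the lazy pattern can be checked side by side with a shorter alternative that uses \w* to consume the rest of the word. The \w* variant is a swapped-in suggestion rather than something from the answer, and the sample sentence is reused from earlier.

```javascript
const sentence = "This is the best moth";

// Pattern from the answer: lazy run of characters up to the next word boundary.
const lazyWords = sentence.match(/\bth[^\b]*?\b/gi);

// Alternative sketch: \w* greedily takes the remaining word characters.
const simpleWords = sentence.match(/\bth\w*/gi);

// Both arrays are ["This", "the"]; "moth" is not matched by either pattern.
```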
If you have HTML entity characters like ä, things get complicated really fast, and you have to use whitespace, or whitespace plus a set of defined characters that may occur at word boundaries.
string.match(/\sth[^\s]*|^th[^\s]*/gi);
That is the example for text with HTML entities. Since we're not using word boundaries, we have to take care of the beginning of the string separately (|^).
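To make the "Männ" case from the question concrete, here is a small sketch that compares \b with the whitespace-based pattern. The search letter n and the sample string "Männ nicht" are illustrative assumptions.

```javascript
const word = "Männ";

// \b treats ä as a non-word character, so it sees a boundary between ä and n.
const withBoundary = word.match(/\bn/gi); // ["n"]

// The whitespace-based pattern only accepts n at the string start or after whitespace.
const withWhitespace = word.match(/\sn[^\s]*|^n[^\s]*/gi); // null

// A word that really starts with n is still found, with its leading space included.
const sentenceHits = "Männ nicht".match(/\sn[^\s]*|^n[^\s]*/gi); // [" nicht"]
```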
The above will capture the white space at the beginning of words. Using \b will not capture white space, since \b has no width.
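One possible way to get a zero-width check that also copes with letters like ä is a negative lookbehind with Unicode property escapes. This is a different technique from the ones in the answers, and it assumes an engine with ES2018 support for lookbehind and \p{...} under the u flag.

```javascript
// Assumption: ES2018+ engine (lookbehind and \p{...} need the u flag).
// Match "th" only when it is NOT preceded by a letter, digit, or underscore.
const thAtWordStart = /(?<![\p{L}\p{N}_])th/giu;
const hitsTh = "the string: This is a string".match(thAtWordStart);
// hitsTh is ["th", "Th"], with no leading whitespace in the results

// "M\u00e4nn" is "Männ"; the n after ä is skipped because ä counts as a letter.
const nAtWordStart = /(?<![\p{L}\p{N}_])n/giu;
const hitsN = "M\u00e4nn nicht".match(nAtWordStart);
// hitsN is ["n"], the n that starts "nicht"
```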
Use this:
string.match(/^th|\sth/gi);
Examples:
'is this is a string'.match(/^th|\sth/gi);
'the string: This is a string'.match(/^th|\sth/gi);
Results (in the same order as the examples):
[" th"]
["th", " Th"]
var matches = "This is the best".match(/\bth/ig);
returns:
["Th", "th"]
The regular expression means: match "th", ignoring case and globally (meaning, don't stop at just one match), when it starts a word, for example at the very start of the string or right after a space character.
Use the g flag in the regex. It stands for "global", and it searches for all matches instead of only the first one.
You should also use the i flag for case-insensitive matching.
You add flags to the end of the regex literal (/<regex>/<flags>) or pass them as the second parameter to new RegExp(pattern, flags).
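The effect of each flag can be seen by running the same pattern with different flag combinations. This is only a small sketch using the sample phrase from the question.

```javascript
const text = "This is the best";

// No flags: first match only, case-sensitive, so "Th" in "This" is skipped.
const first = text.match(/\bth/); // first[0] is "th", first.index is 8

// i only: first match, ignoring case.
const firstAnyCase = text.match(/\bth/i); // firstAnyCase[0] is "Th", index 0

// g only: all case-sensitive matches.
const allLower = text.match(/\bth/g); // ["th"]

// g and i together: every match regardless of case.
const all = text.match(/\bth/gi); // ["Th", "th"]
```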
For instance:
var matches = "This is the best".match(/\bth/gi);
or, using RegExp objects:
var re = new RegExp("\\bth", "gi");
var matches = "This is the best".match(re); // re.exec() would return only one match per call
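A note on exec(): with the g flag, each call returns a single match and remembers its position in lastIndex, so collecting every match takes a loop. A minimal sketch, where the variable names are illustrative:

```javascript
const re = new RegExp("\\bth", "gi");
const text = "This is the best";
const all = [];
let m;

// Each exec() call resumes from re.lastIndex and returns null when no match is left.
while ((m = re.exec(text)) !== null) {
  all.push({ match: m[0], index: m.index });
}
// all is [{ match: "Th", index: 0 }, { match: "th", index: 8 }]
```

Unlike match() with the g flag, this loop also gives access to the position of each match.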
EDIT: Use \b in the regex to match the boundary of a word. Note that it does not really match any specific character, but the position at the beginning or end of a word or of the string.
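One way to see where these zero-width positions fall is to insert a marker at every boundary. This is just an illustrative sketch; the | marker is arbitrary.

```javascript
// Insert "|" at every word boundary; no characters are consumed or replaced.
const marked = "This is the best".replace(/\b/g, "|");
// marked is "|This| |is| |the| |best|"

// "M\u00e4nn" is "Männ". Because \w is ASCII-only in JavaScript, ä counts as a
// non-word character and gets a boundary on each side.
const markedUmlaut = "M\u00e4nn".replace(/\b/g, "|");
// markedUmlaut is "|M|ä|nn|"
```

The second result shows why \bn matches inside "Männ": there is a boundary directly in front of the first n.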