In JavaScript I want to be able to match strings that begin with a certain phrase. However, I want it to match the start of any word in the phrase, not just the beginning of the phrase.
For example:
Phrase: "This is the best"
Need to Match: "th"
Result: Matches Th and th
EDIT: \b works great, however it poses another issue:
It will also match characters that follow non-ASCII letters. For example, if my string is "Männ" and I search for "n", it will match the n after Mä... Any ideas?
"This is the best moth".match(/\bth/gi);
or with a variable for your string
var string = "This is the best moth";
alert(string.match(/\bth/gi));
\b in a regex is a word boundary, so \bth will only match a th that is at the beginning of a word.
gi is for a global match (look for all occurrences) and case insensitive.
(I threw moth in there as a reminder to check that it is not matched.)
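If the prefix comes from a variable rather than a literal, the same pattern can be built with new RegExp. This is only a minimal sketch: the helper names escapeRegExp and findWordStarts are made up for illustration, and it assumes the prefix starts with an ASCII letter, digit, or underscore so that \b behaves as described above.

```javascript
// Hypothetical helper: escape characters that are special inside a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: return every word-initial occurrence of `prefix`.
function findWordStarts(text, prefix) {
  // "\\b" in a string literal becomes \b in the compiled pattern.
  var re = new RegExp("\\b" + escapeRegExp(prefix), "gi");
  return text.match(re) || []; // match() returns null when nothing matches
}

console.log(findWordStarts("This is the best moth", "th")); // ["Th", "th"]
console.log(findWordStarts("This is the best moth", "x"));  // []
```

Escaping matters because a prefix such as "c++" would otherwise be read as regex syntax.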
Edit:
So, the above only returns the part that you match (th). If you want to return the entire words, you have to match the entire word.
This is where things get tricky fast. First, with no non-ASCII letters:
string.match(/\bth[^\b]*?\b/gi);
To match the entire word, start at the word boundary \b, grab the th, then take as few characters as possible with [^\b]*? until you reach another word boundary \b. Note that inside a character class \b means the backspace character, not a word boundary, so [^\b] really matches any character except backspace. The * means 0 or more of the previous item, and the ? makes the match lazy. In other words, it doesn't expand as far as would be possible, but stops at the first opportunity, which here is the end of the word.
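For comparison, here is a sketch of the same whole-word match written with \w* instead of the lazy [^\b]*? form. This is a swapped-in alternative, not the pattern from the answer; it relies on \w covering the same ASCII word characters that \b uses.

```javascript
var text = "This is the best moth";

// Pattern from the answer: lazy run of characters up to the next boundary.
var lazyWords = text.match(/\bth[^\b]*?\b/gi) || [];

// Alternative: th at a word start, followed by the rest of the word.
var words = text.match(/\bth\w*/gi) || [];

console.log(lazyWords); // ["This", "the"]
console.log(words);     // ["This", "the"]
```

Both forms skip moth, because its th does not sit at a word boundary.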
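If you have non-ASCII characters like ä (&auml; as an HTML entity), things get complicated really fast, because \b only treats ASCII letters, digits, and the underscore as word characters. You have to use whitespace, or whitespace plus a set of defined characters that may be at word boundaries.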
string.match(/\sth[^\s]*|^th[^\s]*/gi);
Since we're not using word boundaries, we have to take care of the beginning of the string separately (|^).
The above will capture the white space at the beginning of words. Using \b will not capture white space, since \b has no width.
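On engines that support ES2018 lookbehind and Unicode property escapes, a zero-width check can stand in for \b without capturing white space. The sketch below is an assumption about what should count as a word character (letters, combining marks, digits, underscore), not something from the answers above.

```javascript
// Assumption: "word character" = Unicode letter, mark, number, or underscore.
// (?<!...) is a negative lookbehind: it has no width, like \b.
var wordStartN = /(?<![\p{L}\p{M}\p{N}_])n/giu;

console.log("Männ".match(/\bn/gi));       // ["n"]  (\b sees ä as a non-word character)
console.log("Männ".match(wordStartN));    // null   (no word starts with n)
console.log("Männ nah".match(wordStartN)); // ["n"] (only the n in "nah")
```

The u flag is required for \p{...} to be recognised.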
Use this:
string.match(/^th|\sth/gi);
Examples:
'is this is a string'.match(/^th|\sth/gi);
'the string: This is a string'.match(/^th|\sth/gi);
Results:
[" th"]
["th", " Th"]
var matches = "This is the best".match(/\bth/ig);
returns:
["Th", "th"]
The regular expression /^th|\sth/gi means: match "th", ignoring case and globally (meaning, don't stop at just one match), if "th" is at the very start of the string or if "th" is preceded by a whitespace character.
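Because \s is part of what /^th|\sth/gi matches, every result except one at the start of the string carries a leading space. One possible way to drop it, sketched here with a capture group and String.prototype.matchAll (ES2020); the variable names are only for illustration:

```javascript
var text = "the string: This is a string";

// (?:^|\s) requires start-of-string or whitespace but stays outside group 1,
// so m[1] holds only the matched prefix.
var prefixes = Array.from(text.matchAll(/(?:^|\s)(th)/gi), function (m) {
  return m[1];
});

console.log(prefixes); // ["th", "Th"]
```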
Use the g flag in the regex. It stands for "global", and it searches for all matches instead of only the first one.
You should also use the i flag for case-insensitive matching.
You add flags to the end of the regex (/<regex>/<flags>) or as a second parameter to new RegExp(pattern, flags).
For instance:
var matches = "This is the best".match(/\bth/gi);
or, using RegExp objects:
var re = new RegExp("\\bth", "gi");
// Note: exec() returns one match per call, so this holds only the first match ("Th").
var matches = re.exec("This is the best");
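Unlike match(), exec() returns a single match per call, even with the g flag. To collect every match with a RegExp object, one option is to call it in a loop. A minimal sketch, with variable names chosen for illustration:

```javascript
var re = new RegExp("\\bth", "gi");
var text = "This is the best";
var found = [];
var m;

// With the g flag, each exec() call resumes from re.lastIndex
// and returns null once there are no more matches.
while ((m = re.exec(text)) !== null) {
  found.push({ match: m[0], index: m.index });
}

console.log(found); // [{ match: "Th", index: 0 }, { match: "th", index: 8 }]
```

The loop also gives the position of each match, which match() with the g flag does not.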
EDIT: Use \b in the regex to match the boundary of a word. Note that it does not really match any specific character, but the beginning or end of a word or the string.