I am very new to regex, and I have managed to create a way to check whether a specific word exists inside a string without just being part of another word.
Example: I am looking for the word "banana". banana == true, bananarama == false
This all works fine; however, a problem occurs when I look for two-letter words that contain Swedish letters (Å, Ä, Ö).
Example: I am looking for the word "på" in a string looking like this: "på påsk" and it comes back as negative. However if I look for the word "påsk" then it comes back positive. This is the regex I am using:
const doesWordExist = (s, word) => new RegExp('\\b' + word + '\\b', 'i').test(s);
stringOfWords = "Färg på plagg";
console.log(doesWordExist(stringOfWords, "på"))
//Expected result: true
//Actual result: false
However if I were to change the word "på" to a three letter word then it comes back true:
const doesWordExist = (s, word) => new RegExp('\\b' + word + '\\b', 'i').test(s);
stringOfWords = "Färg pås plagg";
console.log(doesWordExist(stringOfWords, "pås"))
//Expected result: true
//Actual result: true
I have been looking around for answers and found a few questions with similar issues involving Swedish letters, but none of them really look for only the word in its entirety. Could anyone explain what I am doing wrong?
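The word boundary \b strictly depends on the characters matched by \w, which is a shorthand character class for [A-Za-z0-9_]. Because "å" is not in that class, it counts as a non-word character. In "på plagg" the "å" is followed by a space, so both neighbours are non-word characters and the trailing \b has no boundary to match. In "pås plagg" the word ends in "s", which is a word character, so the trailing \b matches.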
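The behaviour described above can be checked directly. This is only a small illustration; the helper name isWordChar is made up for the demo and is not part of the question's code.

```javascript
// \w covers only [A-Za-z0-9_], so "å" counts as a non-word character.
// isWordChar is a hypothetical helper used only for this demonstration.
const isWordChar = (ch) => /\w/.test(ch);

console.log(isWordChar("p")); // true
console.log(isWordChar("å")); // false

// In "på plagg", "å" and the following space are both non-word characters,
// so there is no boundary between them and the trailing \b cannot match.
const twoLetter = /\bpå\b/i.test("Färg på plagg");

// In "pås plagg", "s" is a word character followed by a space,
// so the trailing \b matches there.
const threeLetter = /\bpås\b/i.test("Färg pås plagg");

console.log(twoLetter, threeLetter); // false true
```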
To get similar behaviour you must re-implement the boundary check yourself, for example with a negative lookbehind and a negative lookahead:
const swedishCharClass = '[a-zäöå]';
const doesWordExist = (s, word) => new RegExp(
  '(?<!' + swedishCharClass + ')' + word + '(?!' + swedishCharClass + ')',
  'i'
).test(s);
console.log(doesWordExist("Färg på plagg", "på")); // true
console.log(doesWordExist("Färg pås plagg", "pås")); // true
console.log(doesWordExist("Färg pås plagg", "på")); // false
For more complex alphabets, I'd suggest taking a look at Concrete Javascript Regex for Accented Characters (Diacritics).
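As one possible way to cover other alphabets without listing letters by hand, here is a minimal sketch using Unicode property escapes (\p{L}, \p{N}) together with the u flag. It assumes an engine that supports lookbehind and property escapes (ES2018 or later). The names escapeRegExp and containsWholeWord are hypothetical, and escaping the search word is an addition that the code above does not do.

```javascript
// Escape regex metacharacters so the searched word is treated literally.
// (Hypothetical helper, not part of the original answer.)
const escapeRegExp = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Assumption: any Unicode letter, any Unicode digit, or "_" is part of a word.
const wordChar = "[\\p{L}\\p{N}_]";

// True when `word` occurs in `s` without a word character on either side.
const containsWholeWord = (s, word) =>
  new RegExp(
    "(?<!" + wordChar + ")" + escapeRegExp(word) + "(?!" + wordChar + ")",
    "iu" // i: ignore case, u: enable \p{...} property escapes
  ).test(s);

console.log(containsWholeWord("Färg på plagg", "på"));  // true
console.log(containsWholeWord("Färg pås plagg", "på")); // false
console.log(containsWholeWord("på påsk", "på"));        // true
```

Letters written as a base character plus a separate combining mark are not handled by this sketch; normalizing both strings with normalize("NFC") first is one way to deal with that.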