I wanted to write a JavaScript function to sanitize user input and remove any unwanted or dangerous characters.
It must allow only the following characters:
My first attempt was:
function sanitizeString(str) {
    str = str.replace(/[^a-z0-9áéíóúñü_-\s\.,]/gim, "");
    return str.trim();
}
But when I call:
sanitizeString("word1\nword2")
it returns:
"word1
word2"
So I had to rewrite the function to explicitly remove \t\n\f\r\v\0:
function sanitizeString(str) {
    str = str.replace(/([^a-z0-9áéíóúñü_-\s\.,]|[\t\n\f\r\v\0])/gim, "");
    return str.trim();
}
I'd like to know:
The new version of the sanitizeString function:
function sanitizeString(str) {
    str = str.replace(/[^a-z0-9áéíóúñü \.,_-]/gim, "");
    return str.trim();
}
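A quick check of the new version, as a sketch. The function is repeated so the snippet runs on its own, and the sample inputs are made up for illustration:

```javascript
// Same function as above, repeated so the snippet is self-contained.
function sanitizeString(str) {
    str = str.replace(/[^a-z0-9áéíóúñü \.,_-]/gim, "");
    return str.trim();
}

// The newline is no longer allowed, so it is removed.
console.log(JSON.stringify(sanitizeString("word1\nword2")));
// prints "word1word2"

// Hypothetical input with markup characters and a trailing tab.
console.log(JSON.stringify(sanitizeString("  Hola, señor_López-2! <b>\t")));
// prints "Hola, señor_López-2 b"
```

Note that the newline is deleted rather than replaced with a space, so the two words end up joined.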
The main problem was pointed out by @RobG and @Derek (@RobG, write your comment as an answer and I will accept it): \s doesn't mean what w3Schools currently says:
Find a whitespace character
It means what MDN says:
Matches a single white space character, including space, tab, form feed, line feed. Equivalent to [ \f\n\r\t\v\u00a0\u1680\u180e\u2000-\u200a\u2028\u2029\u202f\u205f\u3000].
I trusted w3Schools when I wrote the function.
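To make the difference concrete, here is a minimal sketch that checks a few characters against \s. The `samples` object and the `matchesWhitespaceClass` helper are names made up for this illustration:

```javascript
// Hypothetical sample characters; each one is matched by \s.
const samples = {
    space: " ",
    tab: "\t",
    lineFeed: "\n",
    carriageReturn: "\r",
    formFeed: "\f",
    verticalTab: "\v",
    noBreakSpace: "\u00a0",
};

// Returns true when the single character ch belongs to the \s class.
function matchesWhitespaceClass(ch) {
    return /^\s$/.test(ch);
}

for (const [name, ch] of Object.entries(samples)) {
    console.log(name, matchesWhitespaceClass(ch)); // true for every entry
}

// A literal space inside a character class matches only U+0020.
console.log(/^[ ]$/.test("\n")); // false
console.log(/^[ ]$/.test(" "));  // true
```

This is why putting \s in the list of allowed characters let "\n" through, while a literal space does not.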
A second change was to move the dash character (-) to the end of the character class, so that it is read as a literal dash rather than as a range separator.
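The effect of the dash position can be seen in a small sketch. The two patterns below are illustrative and are not taken from the function above:

```javascript
// "+-." is a range from "+" (U+002B) to "." (U+002E),
// which also covers "," (U+002C) and "-" (U+002D).
const dashInMiddle = /^[+-.]$/;

// With the dash last, the class holds just the three literals "+", "." and "-".
const dashAtEnd = /^[+.-]$/;

console.log(dashInMiddle.test(",")); // true: "," falls inside the range
console.log(dashAtEnd.test(","));    // false: "," is not listed
console.log(dashAtEnd.test("-"));    // true: the dash is a literal here
```

Placing the dash first or last in the class (or escaping it as \-) keeps it from silently widening the set of allowed characters.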