
Remove string characters from a string if not matched in an array

I am trying to loop over an array containing the strings that I want to compare my input string against. My approach is to loop over the input string and check whether each character matches one of the elements in the array. If it does not, I replace that character with ''. Note: a regular expression is really not an option.

Here is what my JavaScript looks like

var input = 'this is A [{}].-_+~`:; *6^123@#$%&*()?{}|\ ';
input.toLowerCase(input)

var allowed = ['0','1','2','3','4','5','6','7','8','9','a','b','c','d', 'e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','s','à','â','ä','è','é','ê','ë','î','ï','ô','œ','ù','û','ü','ÿ','ç','À','Â','Ä','È','É','Ê','Ë','Î','Ï','Ô','Œ','Ù','Û','Ü','Ÿ','Ç', ' '] 

var cleanStr = '';
for(var i = 0; i < input.length; i++){
    for(var j = 0; j< allowed.length; j++){
    if(input[i] !== allowed[j]){
        cleanStr = input.replace(input[i], ' ');
      console.log(cleanStr);
    }
  }
}

The console log output doesn't appear to be any different than the input field. What am I missing?

Here is my fiddle

https://jsfiddle.net/sghoush1/nvnk7r9j/4/

You can do this in a single loop.

var input = 'this is A [{}].-_+~`:; *6^123@#$%&*()?{}|\\ ';
input = input.toLowerCase(); // Note the syntax here

var allowed = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'à', 'â', 'ä', 'è', 'é', 'ê', 'ë', 'î', 'ï', 'ô', 'œ', 'ù', 'û', 'ü', 'ÿ', 'ç', 'À', 'Â', 'Ä', 'È', 'É', 'Ê', 'Ë', 'Î', 'Ï', 'Ô', 'Œ', 'Ù', 'Û', 'Ü', 'Ÿ', 'Ç', ' '];

var cleanStr = '';

// Loop over each character in the string
for (var i = 0; i < input.length; i++) {

    // Check if the character is allowed or not
    if (allowed.indexOf(input[i]) !== -1) {
        // Concat the allowed character to result string
        cleanStr += input[i];
    }
}

console.log(cleanStr);
document.body.innerHTML = cleanStr;


RegEx Approach:

You can create a RegEx from a string using the RegExp constructor. To replace non-allowed characters, a negated character class can be used.

var regex = new RegExp('[^' + allowed.join('') + ']', 'g');
var cleanStr = input.replace(regex, '');

Note: You'll need to escape meta-characters that have special meaning in the Character class.

The meta-characters that need to be escaped with a preceding backslash (\) inside a character class are listed below, quoting from www.regular-expressions.info:

In most regex flavors, the only special characters or metacharacters inside a character class are the closing bracket (]), the backslash (\), the caret (^), and the hyphen (-).

var input = 'this is A [{}].-_+~`:; *6^123@#$%&*()?{}|\\ ';

var allowed = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 's', 'à', 'â', 'ä', 'è', 'é', 'ê', 'ë', 'î', 'ï', 'ô', 'œ', 'ù', 'û', 'ü', 'ÿ', 'ç', 'À', 'Â', 'Ä', 'È', 'É', 'Ê', 'Ë', 'Î', 'Ï', 'Ô', 'Œ', 'Ù', 'Û', 'Ü', 'Ÿ', 'Ç', ' '];

var regex = new RegExp('[^' + allowed.join('') + ']', 'gi');
console.log(regex);

var cleanStr = input.replace(regex, '');
console.log(cleanStr);
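The allowed list above contains no characters that are special inside a character class, so it works unescaped. If the list ever included one of them, the escaping described in the note could be sketched roughly as follows. The helper name escapeForCharClass and the sample values are hypothetical, not part of the original answer.

```javascript
// Hypothetical helper: put a backslash in front of the four characters that
// are special inside a character class: \ ] ^ -
function escapeForCharClass(ch) {
  return ch.replace(/[\\\]\^-]/g, '\\$&');
}

// Hypothetical allow-list that does contain special characters
var allowedWithSymbols = ['a', 'b', 'c', '-', ']', '^', '\\'];

// Escape each entry before joining it into the negated character class
var safeRegex = new RegExp('[^' + allowedWithSymbols.map(escapeForCharClass).join('') + ']', 'g');

var result = 'a-b]c^d\\e!'.replace(safeRegex, '');
console.log(result); // a-b]c^\
```

Without the escaping step, a list entry such as ']' would close the character class early and the pattern would either fail to compile or match the wrong characters.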

If the array of allowed characters is fixed, you can use the following RegEx to replace the non-allowed characters. Also, there is no need to convert the string to lower-case; use the i flag for a case-insensitive match.

var regex = /[^0-9a-zàâäèéêëîïôœùûüÿç ]/gi;
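As a quick sketch of how that fixed pattern behaves, here it is applied to a short sample string. The sample value is made up for illustration; the regex is the one from the line above.

```javascript
// Regex copied from the answer; `sample` is a hypothetical input
var regex = /[^0-9a-zàâäèéêëîïôœùûüÿç ]/gi;
var sample = 'Hello, World 42!';

// Everything outside the allowed class is removed
var cleaned = sample.replace(regex, '');
console.log(cleaned); // Hello World 42
```

Note that with the i flag the original casing is kept in the output, whereas the toLowerCase() approach lower-cases the whole string first.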


Using ES6's Set class, available in all good browsers:

let input = 'this is A [{}].-_+~`:; *6^123@#$%&*()?{}|\ '.toLowerCase();
let allowed = new Set(['0','1','2','3','4','5','6','7','8','9','a','b','c','d', 'e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','s','à','â','ä','è','é','ê','ë','î','ï','ô','œ','ù','û','ü','ÿ','ç','À','Â','Ä','È','É','Ê','Ë','Î','Ï','Ô','Œ','Ù','Û','Ü','Ÿ','Ç', ' ']);

let cleanStr = [].map.call(input, c => allowed.has(c) ? c : ' ').join('');

The last line uses an efficient Set lookup operation to determine if the character is allowed or not.

The [].map.call(input, ...) allows the Array.prototype.map function to operate directly on the input string. Since the result is an array, it needs to be joined back together again afterwards.

In algorithmic complexity terms, this uses two O(n) array operations and n Set lookups. I don't know exactly what complexity the lookups have, but the ECMAScript specification requires them to be sublinear on average, so it'll likely be O(log n) or perhaps even O(1), depending on the implementation.

The creation of the initial Set has a computation cost too, of course, but it's trivial and should be done just once, e.g. when the program starts.

If instead you actually wanted to remove the non-matching characters, you could use .filter instead of .map:

let cleanStr = [].filter.call(input, c => allowed.has(c)).join('');
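To make the difference between the two variants concrete, here is a small sketch using a reduced allow-list. The names allowedSmall and sample are hypothetical and only serve the illustration.

```javascript
// Hypothetical reduced allow-list, built once and reused for both variants
const allowedSmall = new Set(['a', 'b', 'c', ' ']);
const sample = 'a-b c!';

// .map keeps the string length and puts a space where a character was rejected
const replaced = [].map.call(sample, c => allowedSmall.has(c) ? c : ' ').join('');

// .filter drops rejected characters entirely
const removed = [].filter.call(sample, c => allowedSmall.has(c)).join('');

console.log(JSON.stringify(replaced)); // "a b c "
console.log(JSON.stringify(removed));  // "ab c"
```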

OK, so the problem with your code is that every time the loop checks whether a character of the input is allowed, you assign cleanStr a copy of the input with only that one character replaced. Keep in mind that on every iteration input is still the original string, and cleanStr only holds the result of the last replacement you did. So you are throwing away every replacement done so far, and at the end of the computation you have the input string with only the last replacement applied. What you want to do is build the resulting string incrementally, so that at the end of the loop you have the result you expected.

var input = 'this is A [{}].-_+~`:; *6^123@#$%&*()?{}|\ ';
input = input.toLowerCase(); // toLowerCase() returns a new string, so assign it back

var allowed = ['0','1','2','3','4','5','6','7','8','9','a','b','c','d', 'e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','s','à','â','ä','è','é','ê','ë','î','ï','ô','œ','ù','û','ü','ÿ','ç','À','Â','Ä','È','É','Ê','Ë','Î','Ï','Ô','Œ','Ù','Û','Ü','Ÿ','Ç', ' '];

var cleanStr = '';
for (var i = 0; i < input.length; i++) {
  // Keep the character only if it appears in the allowed list
  if (allowed.indexOf(input[i]) !== -1) {
    cleanStr += input[i];
  }
}

console.log(cleanStr);
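The mistake described above can be reproduced in isolation with a tiny example. The values here are made up for illustration; the point is only that calling replace on the unchanged original string discards the earlier result.

```javascript
// Hypothetical short input with two characters we want to get rid of
var sample = 'a!b?';
var out = '';

// Each call starts again from the untouched `sample`, not from `out`
out = sample.replace('!', ' '); // out is now 'a b?'
out = sample.replace('?', ' '); // out is now 'a!b ', the first replacement is gone

console.log(JSON.stringify(out)); // "a!b "

// Also note: with a string pattern, replace() only changes the first occurrence
var firstOnly = 'a!b!'.replace('!', ' ');
console.log(JSON.stringify(firstOnly)); // "a b!"
```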

I thought it was important for you to understand what your mistake was, beyond the fact that you can use some built-in JavaScript functions to avoid a double for loop for such a simple task. That said, as many suggested, a regex would be much more efficient.
