Remove non-digit characters from random string except first occurrence of #

This question looks trivial, but it is not. I want to use a regexp to remove all non-digit characters from a string, except for the first # character. You can use the snippet below (editing the magic function there) to test:

function magic(str) {
  // example hardcoded implementation - remove it and use a proper regexp
  return str.replace(/#1234a5678b910/, '#12345678910');
}

// Test
const tests = {
  // keys are input strings, values are the valid results for those inputs
  "#1234a5678b910": "#12345678910",
  "12#34a5678b910": "12#345678910",
  "1234a56#78b910": "123456#78910",
  "1234a5678b91#0": "1234567891#0",
  "98#765a4321#039c": "98#7654321039",
  "98a765#4321#039c": "98765#4321039",
  "98a765b4321###39": "987654321#39",
};

Object.keys(tests).forEach(k =>
  console.log(`${k} Test: ${('' + (magic(k) == tests[k])).padEnd(5, ' ').toUpperCase()} ( result is ${magic(k)} - should be ${tests[k]})`)
);

The input string is generated randomly. This is what I have tried so far, with no luck:

function magic(str) {
   return str.replace(/(?<=#.*)[^0-9]/g, '') ;
}

How can I do this using replace and a regexp?

Variable-length lookbehinds only work in JavaScript engines that implement ECMAScript 2018 (ES2018). Check the browser compatibility tables for lookbehind assertions before relying on them.

Regex method

For the engines that do support lookbehinds, you can use the following regex:

(?<!^[^#]*(?=#))\D+

Works as follows:

  • (?<!^[^#]*(?=#)) negative lookbehind ensuring the following does not match
    • ^ assert position at the start of the string
    • [^#]* match any character except # any number of times
    • (?=#) positive lookahead ensuring what follows is #
  • \D+ match any non-digit character one or more times

In simpler terms, ^[^#]*(?=#) matches from the start of the string up to the position just before the first #. The negative lookbehind rejects that position, since we don't want to replace the first # in each string. Finally, \D+ matches runs of non-digit characters that start at any other position.

function magic(str) {
  return str.replace(/(?<!^[^#]*(?=#))\D+/g, '');
}
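One caveat that seems worth checking: because \D+ is greedy, a run of non-digits that starts before the first # and runs into it (for example ab#) appears to be consumed as a single match, # included. None of the question's test inputs has a non-digit directly before the first #, so they do not exercise this case. Below is a minimal sketch of a per-character variant, assuming an engine with lookbehind support. The name magicPerChar is hypothetical and not part of the original answer.

```javascript
// Pattern as given in the answer: matches runs of non-digits.
function magicRuns(str) {
  return str.replace(/(?<!^[^#]*(?=#))\D+/g, '');
}

// Hypothetical variant: match one non-digit at a time, so the lookbehind
// is re-checked at every character, including the first '#'.
function magicPerChar(str) {
  return str.replace(/(?<!^[^#]*(?=#))\D/g, '');
}

console.log(magicRuns('ab#12c#3'));    // "123"  (the first '#' is lost)
console.log(magicPerChar('ab#12c#3')); // "#123"
```

Dropping the + trades slightly more regex work per character for correct handling of non-digits that sit immediately before the first #.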


String manipulation method

This method works best for cross-browser support (older browsers or those that don't yet support ECMAScript 2018 lookbehinds).

This splits the string just after the first # and uses two regular expressions to clean the two substrings:

[^\d#]+    # replace all characters that aren't digits or # (first substring)
\D+        # replace all non-digit characters (second substring)

function magic(str) {
  // Index of the first '#'; -1 when there is none, which leaves x empty below.
  const i = str.indexOf('#');
  const x = str.slice(0, i + 1); // everything up to and including the first '#'
  const y = str.slice(i + 1);    // everything after it
  return x.replace(/[^\d#]+/g, '') + y.replace(/\D+/g, '');
}

Pretty easy: match the part up to and including the first # (if any), strip the non-digits from both groups while keeping that #, and then glue them back together.

function magic(str) {
  // Group 1: everything up to and including the first '#'. The '#' is optional,
  // so inputs without one are still cleaned. Group 2: the rest of the string.
  const rx = /^([^#\n]*#?)(.*)/;
  return str.replace(rx, function (m, g1, g2) {
    // Strip non-digits from the first part, then restore its trailing '#' if it had one.
    const part1 = g1.replace(/\D+/g, "") + (g1.endsWith("#") ? "#" : "");
    return part1 + g2.replace(/\D+/g, "");
  });
}
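The same replace-with-callback idea can also be sketched as a single pass that needs neither capture groups nor lookbehind: match every non-digit and let the callback keep only the first # it sees. This is an alternative illustration rather than the answer's own code, and magicSinglePass is a hypothetical name.

```javascript
function magicSinglePass(str) {
  let seenHash = false; // becomes true once the first '#' has been kept
  return str.replace(/\D/g, ch => {
    if (ch === '#' && !seenHash) {
      seenHash = true;
      return ch; // keep the first '#'
    }
    return ''; // drop every other non-digit, including later '#'
  });
}

console.log(magicSinglePass('98a765b4321###39')); // "987654321#39"
console.log(magicSinglePass('12a3'));             // "123"
```

Because the flag lives inside the function, each call starts fresh, and the approach should work in engines that predate ES2018.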

Use the regex ([a-zA-Z])|(?<=#(.*?))# with replace. It matches every letter in a-z and A-Z, as well as any # that is preceded by another # earlier in the string.
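As a rough sketch of how that pattern might be wired into the question's function, assuming an engine with lookbehind support: the first version below uses the regex as given, which only strips ASCII letters. The second swaps [a-zA-Z] for [^\d#] so that other non-digit characters such as - or _ are stripped too. The names magicAlt and magicAltAny are hypothetical.

```javascript
// Pattern as given: removes ASCII letters, plus any '#' that has a '#' before it.
function magicAlt(str) {
  return str.replace(/([a-zA-Z])|(?<=#(.*?))#/g, '');
}

// Assumed generalisation: remove any character that is neither a digit nor '#',
// plus any '#' that has a '#' before it.
function magicAltAny(str) {
  return str.replace(/[^\d#]|(?<=#.*)#/g, '');
}

console.log(magicAlt('98a765b4321###39')); // "987654321#39"
console.log(magicAlt('9-8a#7_6#5'));       // "9-8#7_65"  ('-' and '_' survive)
console.log(magicAltAny('9-8a#7_6#5'));    // "98#765"
```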
