
Regex to select all characters that do not match a pattern

I'm weak with regexes, but I have put together the following regex, which matches when my pattern is met. The problem is that I need to select any characters that do not fit the pattern.

/^\d{1,2}[ ]\d{1,2}[ ]\d{1,2}[ ][AB]/i

Correct pattern is:

## ## ## A|B, i.e. [0 < x <= 90] [0 < x <= 90] [0 < x <= 90] [A|B], separated by single spaces

EG:

  • 12 34 56 A → good
  • 12 34 56 B → good
  • 12 34 5.6 A → bad - select .
  • 12 34 5.6 C → bad - select . and C
  • 1A 23 45 6 → bad - select A and 6

Edit: My impression was that a regex is used to validate both the characters and the pattern/sequence at the same time. The simple question is how to select characters that are not non-negative numbers, spaces, or the specific letters allowed.

Answer 1

Brief

This isn't really achievable with a single regex, due to the nature of regex. This answer provides a regex that captures the last incorrect entry. For multiple incorrect entries, a loop must be used. You can correct the incorrect entries by running some code logic on the resulting capture groups to determine why an entry isn't valid.

My ultimate suggestion would be to split the string on a known delimiter (in this case the space character) and then use some logic (or even a small regex) to determine why it's incorrect and how to fix it, as seen in Answer 2.

Non-matches

The following logic is applied in my second answer.

For any users wondering what I did to catch incorrect matches: At the most basic level, all this regex is doing is adding |(.*) to every subsection of the regex. Some sections required additional changes for catching specific invalid string formats, but the |(.*) or slight modifications of this will likely solve anyone else's issues.

Other modifications include:

  • Using opposite tokens
    • For example: Matching a digit
      • Original regex: \d
      • Opposite regex: \D
    • For example: Matching a digit or whitespace
      • Original regex: [\d\s]
      • Opposite regex: [^\d\s]
        • Note: [\D\S] is incorrect. It matches any character that is a non-digit or a non-whitespace character, and since every digit is non-whitespace and every whitespace character is a non-digit, it matches every character.
  • Negative lookaheads
    • For example: Catching up to 31 days in a month
      • Original regex: \b(?:[0-2]?\d|3[01])\b
      • Opposite regex: \b(?![0-2]?\d\b|3[01]\b)\d+\b
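The two techniques above can be checked quickly in JavaScript. This is only a sketch; the sample strings and variable names are made up for illustration.

```javascript
// Opposite of "digit or whitespace": a negated character class.
const notDigitOrSpace = /[^\d\s]/g;
// The incorrect form: "non-digit OR non-whitespace" matches every character,
// because no character is both a digit and whitespace.
const wrongOpposite = /[\D\S]/g;

const sample = "12 3x 5.6";
const offending = sample.match(notDigitOrSpace);  // ["x", "."]
const everything = sample.match(wrongOpposite);   // one match per character

// Negative lookahead: numbers that the "day of month" pattern would reject.
const notADay = /\b(?![0-2]?\d\b|3[01]\b)\d+\b/g;
const badDays = "7 31 32 45 09".match(notADay);   // ["32", "45"]

console.log(offending, everything.length, badDays);
```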

Code

First, a more correct regex that also restricts each value to the range 0 to 90. The OP asked for 0 < x <= 90; note that this pattern also accepts 0.

^(?:(?:[0-8]?\d|90) ){3}[AB]$
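As a rough check of this stricter pattern, the sketch below tests it against a few strings. The i flag is an assumption carried over from the regex in the question, and the sample list is made up.

```javascript
// Strict validator: three values in 0..90, each followed by a space, then A or B.
const valid = /^(?:(?:[0-8]?\d|90) ){3}[AB]$/i;

const samples = ["0 45 90 A", "12 34 56 b", "0 45 91 A", "12 34 5.6 A", "12 12 A"];
const results = samples.map((s) => valid.test(s));

console.log(results); // [true, true, false, false, false]
```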

The same regex, extended with capture groups for the invalid entries:

^(?:(?:(?:[0-8]?\d|90) |(\S*) ?)){3}(?:[AB]|(.*))$

Note: This regex uses the mi flags (m for multiline, assuming the input has one entry per line, and i for case-insensitive).
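A minimal sketch of reading the capture groups in JavaScript, applied to one line at a time (so only the i flag is used here). The helper name invalidParts is hypothetical. A capture group inside a repetition holds only a single value, which is why a loop is needed when several entries are invalid; the inputs below each contain at most one invalid entry.

```javascript
// The capturing regex from above, used per line instead of with the m flag.
const capture = /^(?:(?:(?:[0-8]?\d|90) |(\S*) ?)){3}(?:[AB]|(.*))$/i;

// Hypothetical helper: returns the captured invalid parts,
// or null when the line does not match the regex at all.
function invalidParts(line) {
  const m = capture.exec(line);
  if (m === null) return null;
  // Group 1: an invalid numeric entry; group 2: an invalid final entry.
  return [m[1], m[2]].filter((part) => part !== undefined && part !== "");
}

console.log(invalidParts("12 34 56 A"));  // []
console.log(invalidParts("0 45 91 A"));   // ["91"]
console.log(invalidParts("12 34 5.6 A")); // ["5.6"]
console.log(invalidParts("12 34 56 C"));  // ["C"]
```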

Other Formats

Realistically, the following regex would be ideal. Unfortunately, JavaScript doesn't support some of the tokens used in it, but I feel it may be useful to the OP or to other users who see this question.


^(?:(?:(?:[0-8]?\d|90) |(?<n>\S*?) |(?<n>\S*?) ?)){3}(?:(?<n>\S*) )?(?:[AB]|(.*))$

Results

Input

The first section (sections separated by the extra newline/break) shows valid strings, while the second shows invalid strings.

0 45 90 A
0 45 90 B

-1 45 90 A
0 45 91 A
12 34 5.6 A
12 34 56 C
1A 23 45 6
11 1A 12 12 A
12 12  A
12 12 A

Output

0 45 90 A        VALID
0 45 90 B        VALID

-1 45 90 A       INVALID: -1
0 45 91 A        INVALID: 91
12 34 5.6 A      INVALID: 5.6
12 34 56 C       INVALID: C
1A 23 45 6       INVALID: 1A, 6
11 1A 12 12 A    INVALID: 12 A
12 12  A         INVALID: (missing value)
12 12 A          INVALID: A, (missing value)

Note: The last entry shows an odd output, but that's due to a limitation of JavaScript's regex engine. The Other Formats section describes this, along with a regex for a different engine that catches these cases properly.


Explanation

This uses a simple | (OR) and captures the incorrect matches into a capture group.

  • ^ Assert position at the start of the line
  • (?:(?:(?:[0-8]?\d|90) |(\S*) ?)){3} Match the following exactly 3 times
    • (?:(?:[0-8]?\d|90) |(\S*) ?) Match either of the following
      • (?:[0-8]?\d|90) Match either of the following, followed by a space character literally
        • [0-8]?\d Match zero or one of the characters in the set 0-8 (a digit between 0 and 8), followed by any digit
        • 90 Match 90 literally
      • (\S*) ? Capture any non-whitespace character zero or more times into capture group 1, followed by zero or one space character literally
  • (?:[AB]|(.*)) Match either of the following
    • [AB] Match any character present in the set ( A or B )
    • (.*) Capture any character any number of times into capture group 2
  • $ Assert position at the end of the line


Answer 2

Brief

This method splits the string on the given delimiter and tests each section for the proper set of characters. It outputs a message if a value is incorrect. You would likely replace the console output with whatever logic you want to use.

Code

var arr = [
  "0 45 90 A",
  "0 45 90 B",
  "-1 45 90 A",
  "0 45 91 A",
  "12 34 5.6 A",
  "12 34 56 C",
  "1A 23 45 6",
  "11 1A 12 12 A",
  "12 12  A",
  "12 12 A"
];

arr.forEach(function(e) {
  var s = e.split(" ");
  var l = s.pop(); // last element, expected to be A or B
  var numElements = 3;
  var maxNum = 90;
  var syntaxErrors = [];

  // The remaining elements should be exactly three numbers
  if (s.length != numElements) {
    syntaxErrors.push(`Invalid number of elements: Number = ${numElements}, Given = ${s.length}`);
  }

  s.forEach(function(v) {
    if (v.match(/\D/)) {
      // Contains something other than a digit
      syntaxErrors.push(`Invalid value "${v}" exists`);
    } else if (!v.length) {
      syntaxErrors.push(`An empty value or double space exists`);
    } else if (Number(v) > maxNum) {
      syntaxErrors.push(`Value greater than ${maxNum} exists: ${v}`);
    }
  });

  if (l.match(/[^AB]/)) {
    syntaxErrors.push(`Last element ${l} in "${e}" is invalid`);
  }

  if (syntaxErrors.length) {
    console.log(`"${e}" [\n\t${syntaxErrors.join('\n\t')}\n]`);
  } else {
    console.log(`No errors found in "${e}"`);
  }
});
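The same split approach can also return the individual offending characters, which is what the question's examples ask for. The following is a sketch only: the helper name offendingChars is hypothetical, the i flag on the last check is an assumption taken from the regex in the question, and the range and element-count checks from the code above are left out.

```javascript
// Hypothetical helper: lists the characters that break the "## ## ## A|B" format.
// It checks characters only; value range and element count are not validated.
function offendingChars(line) {
  const parts = line.split(" ");
  const last = parts.pop(); // expected to be A or B
  const bad = [];
  for (const part of parts) {
    // Any non-digit in a numeric element is an offender.
    bad.push(...(part.match(/\D/g) || []));
  }
  // Anything other than A or B in the last element is an offender.
  bad.push(...(last.match(/[^AB]/gi) || []));
  return bad;
}

console.log(offendingChars("12 34 56 A"));  // []
console.log(offendingChars("12 34 5.6 A")); // ["."]
console.log(offendingChars("12 34 5.6 C")); // [".", "C"]
console.log(offendingChars("1A 23 45 6"));  // ["A", "6"]
```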
