
Regex to select all characters that do not match a pattern

I'm weak with regexes, but I have put together the following regex, which matches when my pattern is met. The problem is that I need to select any characters that do not fit the pattern.

/^\d{1,2}[ ]\d{1,2}[ ]\d{1,2}[ ][AB]/i

The correct pattern is:

## ## ## A|B aka [0 < x <= 90]*space*[0 < x <= 90] [0 < x <= 90] [A|B]

For example:

  • 12 34 56 A → good
  • 12 34 56 B → good
  • 12 34 5.6 A → bad - select .
  • 12 34 5.6 C → bad - select . and C
  • 1A 23 45 6 → bad - select A and 6

Edit: My impression was that a regex is used to validate both the characters and the pattern/sequence at the same time. The simple question is how to select characters that do not fit the category of non-negative numbers, spaces and distinct characters.

Answer 1

Brief

This isn't really achievable with one regex, due to the nature of regex. This answer provides a regex that will capture the last incorrect entry. For multiple incorrect entries, a loop must be used. You can correct the incorrect entries by running some code logic on the resulting capture groups to determine why each one isn't valid.

My ultimate suggestion would be to split the string by a known delimiter (in this case the space character) and then use some logic (or even a small regex) to determine why it's incorrect and how to fix it, as seen in Answer 2.

Non-matches

The following logic is applied in my second answer.

For any users wondering what I did to catch incorrect matches: at the most basic level, all this regex is doing is adding |(.*) to every subsection of the regex. Some sections required additional changes to catch specific invalid string formats, but |(.*) or slight modifications of it will likely solve anyone else's issues.

Other modifications include:

  • Using opposite tokens
    • For example: Matching a digit
      • Original regex: \d
      • Opposite regex: \D
    • For example: Matching a digit or whitespace
      • Original regex: [\d\s]
      • Opposite regex: [^\d\s]
        • Note: [\D\S] is incorrect because it is the union of both sets, so it matches any character that is a non-digit or a non-whitespace character. Since non-whitespace includes digits and non-digits include whitespace, every character matches.
  • Negative lookaheads
    • For example: Catching up to 31 days in a month
      • Original regex: \b(?:[0-2]?\d|3[01])\b
      • Opposite regex: \b(?![0-2]?\d\b|3[01]\b)\d+\b
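The difference between these opposite forms can be checked directly in JavaScript. The snippet below is a minimal sketch: the sample strings and variable names are made up for illustration, and only the regexes come from the list above.

```javascript
// Opposite tokens: \d matches a digit, \D matches anything that is not a digit.
const nonDigits = "12a4".match(/\D/g); // ["a"]

// Negated set: [^\d\s] matches characters that are neither digits nor whitespace.
const neither = "1A 2.3".match(/[^\d\s]/g); // ["A", "."]

// [\D\S] is a union of "non-digit" and "non-whitespace", so every character matches.
const union = "1A 2.3".match(/[\D\S]/g); // all 6 characters

// Negative lookahead: select numbers that the "day of month" regex would NOT accept.
const badDays = "7 31 32 045 19".match(/\b(?![0-2]?\d\b|3[01]\b)\d+\b/g); // ["32", "045"]

console.log(nonDigits, neither, union.length, badDays);
```

The negative lookahead leaves the final \d+ free to consume the whole number, so the offending token is selected as a unit rather than character by character.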

Code

First, create a more correct regex that also enforces the numeric range from the OP's question (0 < x <= 90). Note that, as written, this pattern also accepts 0 and zero-padded values such as 00.

^(?:(?:[0-8]?\d|90) ){3}[AB]$
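As a quick check, this pattern can be tried against a few of the sample lines used in this answer. This is a minimal sketch: the helper name isValidLine is hypothetical, and the i flag is an assumption carried over from the regex in the question.

```javascript
// Strict pattern from the answer: three numbers in 0-90, each followed by a space, then A or B.
const strictPattern = /^(?:(?:[0-8]?\d|90) ){3}[AB]$/i;

// Hypothetical helper: true only when the whole line fits the pattern.
function isValidLine(line) {
  return strictPattern.test(line);
}

console.log(isValidLine("0 45 90 A"));   // true
console.log(isValidLine("0 45 91 A"));   // false (91 is out of range)
console.log(isValidLine("12 34 5.6 A")); // false (5.6 is not an integer)
console.log(isValidLine("12 12 A"));     // false (only two numbers)
```

This only answers yes or no; it does not say which part of the line is wrong, which is what the capturing variant is for.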

See regex in use here

^(?:(?:(?:[0-8]?\d|90) |(\S*) ?)){3}(?:[AB]|(.*))$

Note: This regex uses the mi flags (multiline, assuming the input is in that format, and case-insensitive).
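In JavaScript, the capture groups can be read from the match result. The sketch below assumes one line is tested at a time, and lastInvalidParts is a hypothetical helper name. Keep in mind that in JavaScript a capture group inside a repeated group only reflects the final repetition, so group 1 reports a bad number here only when it sits in the third position.

```javascript
// Capturing pattern from the answer, with the m and i flags it mentions.
const capturePattern = /^(?:(?:(?:[0-8]?\d|90) |(\S*) ?)){3}(?:[AB]|(.*))$/mi;

// Hypothetical helper: returns the captured invalid parts, or null if the line does not match at all.
function lastInvalidParts(line) {
  const m = capturePattern.exec(line);
  if (m === null) return null;
  // m[1]: bad number captured by the final repetition; m[2]: bad trailing element.
  return [m[1], m[2]].filter((part) => part !== undefined);
}

console.log(lastInvalidParts("12 34 56 A"));  // []
console.log(lastInvalidParts("12 34 5.6 A")); // ["5.6"]
console.log(lastInvalidParts("0 45 91 A"));   // ["91"]
console.log(lastInvalidParts("12 34 56 C"));  // ["C"]
```

Because earlier repetitions are not retained, checking each space-separated token in a loop is the more dependable way to report every invalid entry.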

Other Formats

Realistically, the following regex would be ideal. Unfortunately, JavaScript doesn't support some of the tokens used in it, but I feel it may be useful to the OP or other users who see this question.

See regex in use here

^(?:(?:(?:[0-8]?\d|90) |(?<n>\S*?) |(?<n>\S*?) ?)){3}(?:(?<n>\S*) )?(?:[AB]|(.*))$

Results

Input

The first section (sections are separated by a blank line) shows valid strings, while the second shows invalid strings.

0 45 90 A
0 45 90 B

-1 45 90 A
0 45 91 A
12 34 5.6 A
12 34 56 C
1A 23 45 6
11 1A 12 12 A
12 12  A
12 12 A

Output

0 45 90 A        VALID
0 45 90 B        VALID

-1 45 90 A       INVALID: -1
0 45 91 A        INVALID: 91
12 34 5.6 A      INVALID: 5.6
12 34 56 C       INVALID: C
1A 23 45 6       INVALID: 1A, 6
11 1A 12 12 A    INVALID: 12 A
12 12  A         INVALID: (missing value)
12 12 A          INVALID: A, (missing value)

Note: The last entry shows an odd output, but that's due to a limitation of JavaScript's regex engine. The Other Formats section describes this and another method that properly catches these cases (using a different regex engine).

Explanation

This uses a simple | (OR) and captures the incorrect matches into a capture group.

  • ^ Assert position at the start of the line
  • (?:(?:(?:[0-8]?\d|90) |(\S*) ?)){3} Match the following exactly 3 times
    • (?:(?:[0-8]?\d|90) |(\S*) ?) Match either of the following
      • (?:[0-8]?\d|90) Match either of the following, followed by a literal space character
        • [0-8]?\d Optionally match one character in the set 0-8 (a digit between 0 and 8), followed by any digit
        • 90 Match 90 literally
      • (\S*) ? Capture any non-whitespace character any number of times (zero or more) into capture group 1, followed by zero or one literal space character
  • (?:[AB]|(.*)) Match either of the following
    • [AB] Match any character present in the set (A or B)
    • (.*) Capture any character any number of times into capture group 2
  • $ Assert position at the end of the line


Answer 2

Brief

This method splits the string on the given delimiter and tests each section for the proper set of characters. It outputs a message if a value is incorrect. You would likely replace the console outputs with whatever logic you want to use.

Code

    var arr = [
      "0 45 90 A",
      "0 45 90 B",
      "-1 45 90 A",
      "0 45 91 A",
      "12 34 5.6 A",
      "12 34 56 C",
      "1A 23 45 6",
      "11 1A 12 12 A",
      "12 12  A",
      "12 12 A"
    ];

    arr.forEach(function(e) {
      var s = e.split(" ");
      var l = s.pop(); // last element, expected to be A or B
      var numElements = 3;
      var maxNum = 90;
      var syntaxErrors = [];

      // The remaining elements should be exactly three numbers.
      if (s.length != numElements) {
        syntaxErrors.push(`Invalid number of elements: Number = ${numElements}, Given = ${s.length}`);
      }

      s.forEach(function(v) {
        if (v.match(/\D/)) {
          // Contains a non-digit character, e.g. "1A", "5.6" or "-1".
          syntaxErrors.push(`Invalid value "${v}" exists`);
        } else if (!v.length) {
          // An empty string results from two consecutive spaces.
          syntaxErrors.push(`An empty value or double space exists`);
        } else if (Number(v) > maxNum) {
          syntaxErrors.push(`Value greater than ${maxNum} exists: ${v}`);
        }
      });

      if (l.match(/[^AB]/)) {
        syntaxErrors.push(`Last element ${l} in "${e}" is invalid`);
      }

      if (syntaxErrors.length) {
        console.log(`"${e}" [\n\t${syntaxErrors.join('\n\t')}\n]`);
      } else {
        console.log(`No errors found in "${e}"`);
      }
    });
