Regex to ensure that in a string such as "05123:12315", the first number is less than the second?

I must have strings in the format x:y, where x and y have to be five digits (zero padded) and x <= y.

Example:

00515:02152

What regex will match this format?

If possible, please explain the solution briefly to help me learn.

EDIT: Why do I need regex? I've written a generic tool that takes input and validates it according to a configuration file. An unexpected requirement popped up that requires me to validate a string in the format I've shown (using the configuration file). I was hoping to solve this problem using the existing configuration framework I've coded up, as splitting and parsing would be out of the scope of this tool. For an outstanding requirement such as this, I don't mind having some unorthodox/messy regex, as long as it's not 10000 lines long. Any intelligent solutions using regex are appreciated! Thanks.

Description

This expression validates that the first five-digit number is smaller than the second, where both are zero-padded five-digit numbers in a colon-delimited string formatted like 01234:23456. Note that the comparison is strict: equal values such as 01234:01234 do not match.

^
(?:
(?=0....:[1-9]|1....:[2-9]|2....:[3-9]|3....:[4-9]|4....:[5-9]|5....:[6-9]|6....:[7-9]|7....:[8-9]|8....:[9])
|(?=(.)(?:0...:\1[1-9]|1...:\1[2-9]|2...:\1[3-9]|3...:\1[4-9]|4...:\1[5-9]|5...:\1[6-9]|6...:\1[7-9]|7...:\1[8-9]|8...:\1[9]))
|(?=(..)(?:0..:\2[1-9]|1..:\2[2-9]|2..:\2[3-9]|3..:\2[4-9]|4..:\2[5-9]|5..:\2[6-9]|6..:\2[7-9]|7..:\2[8-9]|8..:\2[9]))
|(?=(...)(?:0.:\3[1-9]|1.:\3[2-9]|2.:\3[3-9]|3.:\3[4-9]|4.:\3[5-9]|5.:\3[6-9]|6.:\3[7-9]|7.:\3[8-9]|8.:\3[9]))
|(?=(....)(?:0:\4[1-9]|1:\4[2-9]|2:\4[3-9]|3:\4[4-9]|4:\4[5-9]|5:\4[6-9]|6:\4[7-9]|7:\4[8-9]|8:\4[9]))
)
\d{5}:\d{5}$

Live demo: http://www.rubular.com/r/w1QLZhNoEa

Note that this uses the x option to ignore all whitespace and allow comments. If you use it without x, the expression must be written all on one line.
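For engines that have no free-spacing x option (JavaScript's RegExp is one), the same pattern can be assembled programmatically instead of being pasted as one long line. The following is a minimal sketch rather than part of the original answer; the function name buildLessThanPattern is hypothetical. It reproduces the five lookahead branches, one for each possible position of the first differing digit.

```javascript
// Hypothetical helper: builds the single-line equivalent of the pattern above.
function buildLessThanPattern() {
  const alternatives = [];
  // k = number of leading digits that both numbers share.
  for (let k = 0; k < 5; k++) {
    const branches = [];
    for (let d = 0; d <= 8; d++) {
      const rest = ".".repeat(4 - k);            // remaining digits of the first number
      const backref = k === 0 ? "" : "\\" + k;   // the shared prefix, repeated after the colon
      branches.push(`${d}${rest}:${backref}[${d + 1}-9]`);
    }
    // Capture group k holds the shared prefix (no group is needed when k = 0).
    const prefix = k === 0 ? "" : `(${".".repeat(k)})`;
    alternatives.push(`(?=${prefix}(?:${branches.join("|")}))`);
  }
  return new RegExp(`^(?:${alternatives.join("|")})\\d{5}:\\d{5}$`);
}

const pattern = buildLessThanPattern();
console.log(pattern.test("00515:02152")); // true
console.log(pattern.test("02152:00515")); // false
console.log(pattern.test("12345:12345")); // false: equal values are rejected
```

Because every branch requires a digit in the second number to be strictly larger, this accepts x < y only, not x <= y.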


The language you want to recognize is finite, so the easiest thing to do is just list all the cases separated by "or". The regexp you want is:

(00000:(00000|00001| ... |99999))|  ...  |(99998:(99998|99999))|(99999:99999)

That regexp will be several billion characters long and take quite some time to execute, but it is what you asked for: a regular expression that matches the stated language.
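A rough size estimate backs this up. The sketch below is an illustration, not part of the original answer: it counts the (x, y) pairs with x <= y and assumes that each listed alternative costs about six characters (five digits plus a separator).

```javascript
// Count the pairs (x, y) with 0 <= x <= y <= 99999.
const n = 100000;               // number of distinct five-digit strings
const pairs = n * (n + 1) / 2;  // 5000050000
// Assumption: each alternative contributes about 6 characters (5 digits + "|").
const approxChars = pairs * 6;  // 30000300000
console.log(pairs, approxChars);
```

Under that assumption the enumerated pattern runs to roughly 30 billion characters.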

Obviously that's impractical. Is it now clear why regular expressions are the wrong tool for this job? Use a regular expression to match five digits, a colon, and five digits; once you know you have that, split the string and convert the two sets of digits to integers that you can compare.

x <= y.

Well, you are using the wrong tool. Really, regex can't help you here. Even if you do get a solution, it will be too complex and too difficult to extend.

Regex is a text-processing tool for matching patterns in regular languages. It is very weak when it comes to semantics: it cannot identify meaning in the given string. In your case, to check the x <= y condition you need to know the numerical values of x and y.

For example, it can match digits in a sequence, or a mix of digits and characters, but what it cannot do is things like:

  • match a number greater than 15 and less than 1245, or
  • match a pattern which is a date between two given dates.

So wherever matching a pattern involves applying semantics to the matched string, regex is not an option.

The appropriate way here is to split the string on the colon and then compare the numbers. For the leading zeros, you can find some workaround.
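One possible workaround for the leading zeros, sketched here under the assumption that both halves are always exactly five digits: fixed-width, zero-padded digit strings sort the same way lexicographically as they do numerically, so the two halves can be compared as plain strings. The helper name isOrderedPair is hypothetical.

```javascript
// Hypothetical helper: checks "xxxxx:yyyyy" with x <= y without any numeric parsing.
function isOrderedPair(s) {
  const parts = s.split(":");
  if (parts.length !== 2) return false;
  const [x, y] = parts;
  // Both halves must be exactly five ASCII digits.
  const fiveDigits = /^[0-9]{5}$/;
  if (!fiveDigits.test(x) || !fiveDigits.test(y)) return false;
  // Equal-width, zero-padded digit strings order the same as their numeric values.
  return x <= y;
}

console.log(isOrderedPair("00515:02152")); // true
console.log(isOrderedPair("02152:00515")); // false
console.log(isOrderedPair("12345:12345")); // true
```

The string comparison is only safe because the width check runs first; with halves of different lengths, "9" would sort after "10".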

You can't generally* do this with regex. You can use regex to match the pattern and extract the numbers, then compare the numbers in your code.

For example, to match this format (without comparing the numbers) and capture the numbers, you could use:

^(\d{5}):(\d{5})\z

*) You probably could in this case (as the numbers are always 5 digits and zero padded), but it wouldn't be nice.

You should do something like this instead:
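As an illustration of using a capture-group pattern like the one above and then comparing in code, here is a minimal JavaScript sketch. The function name isValidRange is hypothetical, and because JavaScript has no \z anchor the sketch uses $ instead.

```javascript
// Hypothetical helper: the regex checks the shape, ordinary code checks the order.
function isValidRange(input) {
  // Without the m flag, $ in JavaScript matches only at the very end of the input.
  const match = /^(\d{5}):(\d{5})$/.exec(input);
  if (match === null) return false;
  // Number("00515") is 515: leading zeros are simply ignored.
  return Number(match[1]) <= Number(match[2]);
}

console.log(isValidRange("00515:02152")); // true
console.log(isValidRange("02152:00515")); // false
console.log(isValidRange("99999:99999")); // true
```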

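bool IsCorrect(string s)
{
    // Split into the two halves around the colon.
    string[] split = s.Split(':');
    int number1, number2;
    // Exactly two parts, each five characters wide.
    if (split.Length == 2 && split[0].Length == 5 && split[1].Length == 5)
    {
        // Both parts must parse as integers, and the first must not exceed the second.
        if (int.TryParse(split[0], out number1) && int.TryParse(split[1], out number2) && number1 <= number2)
        {
            return true;
        }
    }
    return false;
}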

With regex you can't make comparisons to see if a number is bigger than another number.

Let me show you a good example of why you shouldn't try to do this. This is a regex that (nearly) does the same job.

https://gist.github.com/anonymous/ad74e73f0350535d09c1

Raw file:

https://gist.github.com/anonymous/ad74e73f0350535d09c1/raw/03ea835b0e7bf7ac3c5fb6f9c7e934b83fb09d95/gistfile1.txt

Except it's just for 3 digits. For 4, the program that generates these fails with an OutOfMemoryException, even with gcAllowVeryLargeObjects enabled: it grew to 5 GB before it crashed. You don't want most of your app to be a regex, right?

This is not a regex's job.

This is a two-step process, because regex is a text parser, not an analyzer. That said, regex is well suited to validating that we have the 5:5 number pattern, and the pattern \d\d\d\d\d:\d\d\d\d\d will determine whether the input has that form. If that form is not found, the match fails and the whole validation fails. If it is valid, we can use regex/LINQ to parse out the numbers and then check for validity.

This code would sit inside a method that does the check:

// Needs: using System; using System.Linq; using System.Text.RegularExpressions;
var data = "00515:02151";
var pattern = @"
^                 # starting from the beginning of the string...
(?=[\d:]{11})     # Are there at least 11 characters made up only of digits and a colon? Fail if not
(?=\d{5}:\d{5}$)  # Does it fall into our pattern, with nothing trailing? If not, fail the match
((?<Values>[^:]+)(?::?)){2}
";

// IgnorePatternWhitespace lets us lay out and comment the pattern; it does not change what the pattern matches
var result = Regex.Matches(data, pattern, RegexOptions.IgnorePatternWhitespace)
                  .OfType<Match>()
                  .Select (mt =>  mt.Groups["Values"].Captures
                                                .OfType<Capture>()
                                                .Select (cp => int.Parse(cp.Value)))
                  .FirstOrDefault();

// Two values at this point: 515, 2151

// The question asks for x <= y, so equal values are accepted
bool valid = ((result != null) && (result.First () <= result.Last ()));

Console.WriteLine (valid); // True

Using JavaScript, this can work.

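    var string = "00515:02152";

    // The callback receives the whole match first, then each captured group.
    var result = string.replace(/(\d{5}):(\d{5})/, function (match, first, second) {
         // Keep the match when first <= second; otherwise the match is replaced by the text "null".
         return (parseInt(first, 10) <= parseInt(second, 10)) ? match : null;
    });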

FIDDLE http://jsfiddle.net/VdzF7/
