
Regex for specific digit prefix

I am trying to write the following regex rules, but couldn't find a solution.

I am sorry if I didn't make it clear: I want a different regex for each rule. I am using Java.

  • The rule should fail for all digit inputs that start with the prefix '1900' or '1901' (190011 - fail, 190111 - fail, 41900 - success...).

  • The rule should succeed for all digit inputs with the prefix '*'.

I want a different regex for each rule (I am not looking for a combination of the two).

Does this RE fit the purpose?

'\A(\*|(?!190[01])).*'

\A means 'the beginning of the string'. I think it's the same in Java's regexes.
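As a quick check of the proposed RE in Java, here is a minimal sketch. The class and method names are made up for illustration, and using matches() (which requires the whole input to match) is an assumption about how the rule would be applied. Note that every regex backslash is doubled inside a Java string literal.

```java
import java.util.regex.Pattern;

final class ProposedRegexCheck {
    // The RE from the answer; "\\A" in Java source is the regex token \A.
    static final Pattern PROPOSED = Pattern.compile("\\A(\\*|(?!190[01])).*");

    // Hypothetical helper: true when the whole input matches the RE.
    static boolean accepts(String input) {
        return PROPOSED.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"190011", "190111", "41900", "*1900"}) {
            System.out.println(s + " -> " + accepts(s));
        }
    }
}
```

With this RE the inputs 190011 and 190111 are rejected, while 41900 and *1900 are accepted. Because of the trailing .* it also accepts non-digit input such as "abc".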

EDIT

\A : "from the very beginning of the string ...". In Python (which is what I actually know) it can be omitted if we use the function match(), which always starts at the very beginning, instead of search(), which looks everywhere in the string. If you want the regex to match from the beginning of each line instead, \A must be replaced by ^ (with the multiline flag set).
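For Java readers, the rough counterparts of Python's match() and search() are Matcher.lookingAt() and Matcher.find(), and the per-line behaviour of ^ needs Pattern.MULTILINE. The sketch below uses a digits-only variant of the rule, (?!190[01])\d+, chosen here purely for illustration; the class and helper names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AnchoringDemo {
    // The same illustrative rule without and with the \A anchor.
    static final Pattern UNANCHORED = Pattern.compile("(?!190[01])\\d+");
    static final Pattern ANCHORED = Pattern.compile("\\A(?!190[01])\\d+");
    // ^ with MULTILINE anchors at the start of every line.
    static final Pattern PER_LINE =
            Pattern.compile("^(?!190[01])\\d+", Pattern.MULTILINE);

    // Like Python's search(): the match may start anywhere in the input.
    static boolean foundAnywhere(Pattern p, String s) {
        return p.matcher(s).find();
    }

    // Like Python's match(): the match must start at index 0.
    static boolean matchesAtStart(Pattern p, String s) {
        return p.matcher(s).lookingAt();
    }

    // Text of the first match found anywhere, or null if there is none.
    static String firstMatch(Pattern p, String s) {
        Matcher m = p.matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(foundAnywhere(UNANCHORED, "190011"));
        System.out.println(matchesAtStart(UNANCHORED, "190011"));
        System.out.println(foundAnywhere(ANCHORED, "190011"));
        System.out.println(firstMatch(PER_LINE, "190011\n41900"));
    }
}
```

Without the anchor, find() skips the leading 1 and matches "90011" inside 190011, which is why the anchor (or an anchored method such as lookingAt() or matches()) matters for a prefix rule.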

(...|...) : "... there must be one of the two following options: ..."

\* : "... the first option is one character only, a star; ...". As a star is a special character meaning 'zero or more times what comes before' in regex strings, it must be escaped to strictly mean 'a star'.

(?!190[01]) : "... the second option isn't a pattern that must be found and possibly captured, but a pattern that must be absent (still right after the very beginning). ...". The two characters ?! say 'the following characters must not be there'. The pattern that must not be found is 4 digit characters long: '1900' or '1901'.

(?!...) is a negative lookahead assertion. All kinds of assertions begin with (? : the parenthesis cancels the usual meaning of ?, which is why assertions are always written with parentheses.

If \* has matched, one character has been consumed. By contrast, if the assertion is verified, the first 4 characters of the string have not been consumed: the regex engine has looked ahead through the string up to the 4th character to check them, and has then come back to its initial position, which here is the very beginning of the string.
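The difference between consuming a character and merely looking ahead can be observed in Java through Matcher.end(), which reports where the match stopped. This is only an illustrative sketch: the helper name consumed is hypothetical, and the two alternatives are tested as separate patterns to make the positions visible.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ConsumptionDemo {
    // Number of characters consumed by a match starting at index 0,
    // or -1 when the regex does not match at the start of the input.
    static int consumed(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.lookingAt() ? m.end() : -1;
    }

    public static void main(String[] args) {
        // The star alternative consumes exactly one character.
        System.out.println(consumed("\\A\\*", "*1900"));
        // The lookahead succeeds on 41900 but consumes nothing.
        System.out.println(consumed("\\A(?!190[01])", "41900"));
        // The lookahead fails on 190011, so there is no match at all.
        System.out.println(consumed("\\A(?!190[01])", "190011"));
    }
}
```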

If you want the two-option part (...|...) not to be a capturing group, write ?: just after the opening paren, giving '\A(?:\*|(?!190[01])).*'
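The effect of ?: can be checked in Java with Matcher.groupCount(), which reports how many capturing groups a pattern defines. A small sketch, with hypothetical class and method names:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupDemo {
    static final String CAPTURING = "\\A(\\*|(?!190[01])).*";
    static final String NON_CAPTURING = "\\A(?:\\*|(?!190[01])).*";

    // Number of capturing groups defined by the regex.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // Content of group 1 of the capturing variant, or null if no match.
    static String firstGroup(String input) {
        Matcher m = Pattern.compile(CAPTURING).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(groupCount(CAPTURING));     // one group
        System.out.println(groupCount(NON_CAPTURING)); // no groups
        System.out.println(firstGroup("*123"));        // the captured star
    }
}
```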

.* : After the beginning pattern (one star matched, or the assertion verified), the regex engine goes on and matches all the characters up to the end of the line. If the string has newlines and you want the regex to match all the characters up to the end of the string, not only of a line, you must specify that . matches newlines too (in Python with re.DOTALL, in Java with Pattern.DOTALL), or replace .* with (.|\r|\n)*
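In Java the flag that lets . match line terminators is Pattern.DOTALL. The sketch below compares the same RE with and without it on a two-line input; the class and method names are hypothetical, and whether multi-line input occurs at all in the asker's case is an assumption.

```java
import java.util.regex.Pattern;

final class DotallDemo {
    static final String REGEX = "\\A(?:\\*|(?!190[01])).*";

    // Default mode: '.' stops at line terminators.
    static boolean withoutDotall(String input) {
        return Pattern.compile(REGEX).matcher(input).matches();
    }

    // DOTALL mode: '.' also matches line terminators.
    static boolean withDotall(String input) {
        return Pattern.compile(REGEX, Pattern.DOTALL).matcher(input).matches();
    }

    public static void main(String[] args) {
        String twoLines = "123\n456";
        System.out.println(withoutDotall(twoLines)); // .* stops before \n
        System.out.println(withDotall(twoLines));    // .* runs to the end
    }
}
```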

I finally understand that you apparently want to match strings composed of digit characters. If so, the RE must be changed to '\A(?:\*|(?!190[01]))\d*'. This RE matches empty strings. If you don't want empty strings to match, put \d+ in place of \d*. If you want only strings with at least one digit to match, even after the star when the string begins with a star, use '\A(?:\*|(?!190[01]))(?=\d)\d*' (which accepts the same strings as the \d+ version).
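The digits-only variants can be compared side by side in Java. This is a sketch under the assumption that the whole input must match (matches()); the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class DigitVariants {
    // \d* variant: also accepts the empty string and a lone star.
    static final Pattern ALLOW_EMPTY =
            Pattern.compile("\\A(?:\\*|(?!190[01]))\\d*");
    // \d+ variant: requires at least one digit.
    static final Pattern NEED_DIGIT =
            Pattern.compile("\\A(?:\\*|(?!190[01]))\\d+");

    static boolean allowEmpty(String s) {
        return ALLOW_EMPTY.matcher(s).matches();
    }

    static boolean needDigit(String s) {
        return NEED_DIGIT.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"", "*", "*12", "41900", "190011", "41a"}) {
            System.out.println("'" + s + "' -> " + allowEmpty(s) + " / " + needDigit(s));
        }
    }
}
```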

For the first rule, you should use a combined regex with two captures, one to capture the 1900/1901-prefixed case and one to capture the rest. Then you can decide whether the string should succeed or fail by examining the two captures:

(190[01]\d+)|(\d+)

Or just use a simple 190[01]\d+ and negate your logic.
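Both suggestions for the first rule might look like this in Java. This is a sketch, not a definitive implementation: the class and method names are hypothetical, matches() on the whole input is assumed, and the negated version uses 190[01]\d* together with an explicit all-digits check (both are assumptions added here) so that a bare "1900" and non-digit input are rejected as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class Rule1Demo {
    // Two captures: group 1 = blocked prefix case, group 2 = everything else.
    static final Pattern TWO_CAPTURES = Pattern.compile("(190[01]\\d+)|(\\d+)");
    static final Pattern DIGITS = Pattern.compile("\\d+");
    // Assumption: \d* instead of \d+ so that "1900" alone is blocked too.
    static final Pattern BLOCKED = Pattern.compile("190[01]\\d*");

    // Passes when the input is all digits and group 1 did not take part.
    static boolean passesByCaptures(String input) {
        Matcher m = TWO_CAPTURES.matcher(input);
        return m.matches() && m.group(1) == null;
    }

    // Passes when the input is all digits and is not a blocked number.
    static boolean passesByNegation(String input) {
        return DIGITS.matcher(input).matches()
                && !BLOCKED.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"190011", "190111", "41900"}) {
            System.out.println(s + " -> " + passesByCaptures(s)
                    + " / " + passesByNegation(s));
        }
    }
}
```

One difference to be aware of: with the two-capture regex as written, the four-digit input "1900" falls through to the second capture and passes, whereas the negated version above rejects it.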

Regexes are not really very good at excluding something.

You may exclude a prefix using negative look-behind, but it won't work in this case because the prefix is itself a stream of digits.

You seem to be trying to exclude 1-900/901 phone numbers in the US. If the number of digits is fixed, you can use a negative look-behind to exclude this prefix while matching the remaining exact number of digits.

For the second rule, simply:

\*\d+
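In Java the second rule could be applied as follows. A minimal sketch: the class and method names are hypothetical, and matching the whole input with matches() is assumed. The regex backslashes are doubled in the string literal.

```java
import java.util.regex.Pattern;

final class Rule2Demo {
    // A literal star followed by one or more digits.
    static final Pattern STAR_PREFIX = Pattern.compile("\\*\\d+");

    static boolean passesRule2(String input) {
        return STAR_PREFIX.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"*123", "*1900", "123", "*"}) {
            System.out.println(s + " -> " + passesRule2(s));
        }
    }
}
```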
