
What does .* do in regex?

After an extensive search, I am unable to find an explanation for why .* is needed in a regex. For example, MSDN suggests a password regex of

@\"(?=.{6,})(?=(.*\d){1,})(?=(.*\W){1,})"

for length >= 6, 1+ digit and 1+ special character.

Why can't I just use:

@\"(?=.{6,})(?=(\d){1,})(?=(\W){1,})"

.* just means "0 or more of any character".

It's broken down into two parts:

  • . - a "dot" matches any single character (by default, any character except a newline)
  • * - means "0 or more instances of the preceding regex token"

In your example above, this is important, since they want to force the password to contain a special character and a number, while still allowing all other characters. If you used \d instead of .*\d , for example, then that would restrict that portion of the regex to only match a decimal digit ( \d is roughly shorthand for [0-9] ; in .NET it also matches other Unicode decimal digits unless RegexOptions.ECMAScript is set). Similarly, \W instead of .*\W would cause that portion to only match a non-word character.
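To see the difference in practice, here is a small sketch. It uses Python's re module rather than .NET purely for illustration (an assumption on my part; the lookahead behaviour shown here is the same in both engines), and the sample strings are made up.

```python
import re

# Lookahead without .* only inspects the character at the current position.
no_dot_star = re.compile(r"(?=\d)")
# Lookahead with .* may skip over any characters before it finds a digit.
with_dot_star = re.compile(r"(?=.*\d)")

samples = ["abc1", "1abc", "abcd"]  # made-up inputs

# match() anchors at position 0, like validating a password from its start.
results = {
    s: (bool(no_dot_star.match(s)), bool(with_dot_star.match(s)))
    for s in samples
}

for s, (plain, dotted) in results.items():
    print(f"{s!r}: (?=\\d) -> {plain}, (?=.*\\d) -> {dotted}")
```

Only the version with .* accepts "abc1", because the digit is not the first character.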

A good reference containing many of these tokens for .NET can be found on the MSDN here: Regular Expression Language - Quick Reference

Also, if you're really looking to delve into regex, take a look at http://www.regular-expressions.info/ . While it can sometimes be difficult to find what you're looking for on that site, it's one of the most complete and beginner-friendly regex references I've seen online.

The .* portion just allows for literally any combination of characters to be entered. It essentially lets the user add any amount of extra information to the password on top of the data you are requiring.

Note: I don't think that MSDN page is actually suggesting that as a password validator. It is just providing an example of a possible one.

Just FYI, that regex doesn't do what they say it does, and the way it's written is needlessly verbose and confusing. They say it's supposed to match more than seven characters, but it really matches as few as six. And while the other two lookaheads correctly match at least one each of the required character types, they can be written much more simply.
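The length claim is easy to check. A quick sketch, again using Python's re module as a stand-in for .NET (my assumption, for illustration only) and two hypothetical passwords:

```python
import re

# The regex from the question, with the C#/XML quoting stripped off.
original = re.compile(r"(?=.{6,})(?=(.*\d){1,})(?=(.*\W){1,})")

six_chars = "abc1!x"   # hypothetical 6-character password
five_chars = "ab1!x"   # hypothetical 5-character password

accepts_six = original.match(six_chars) is not None
accepts_five = original.match(five_chars) is not None
print(accepts_six, accepts_five)
```

The six-character password passes and the five-character one does not, so the effective minimum length is six.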

Finally, the string you copied isn't just a regex, it's an XML attribute value (including the enclosing quotes) that seems to represent a C# string literal (except the closing quote is missing). I've never used a Membership object, but I'm pretty sure that syntax is faulty. In any case, the actual regex is:

(?=.{6,})(?=(.*\d){1,})(?=(.*\W){1,})

...but it should be:

(?=.{8,})(?=.*\d)(?=.*\W)

The first lookahead tries to match eight or more of any characters. If it succeeds, the match position (or cursor, if you prefer) is reset to the beginning and the second lookahead scans for a digit. If it finds one, the cursor is reset again and the third lookahead scans for a special character. (Which, by the way, includes whitespace, control characters, and a boatload of other esoteric characters; probably not what the author intended.)
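Here is a minimal sketch of the simplified pattern in use. It is written with Python's re module instead of .NET (an assumption for illustration), and is_acceptable and the sample passwords are hypothetical names and values, not anything from MSDN.

```python
import re

# Simplified pattern: 8+ characters, a digit, and a non-word character.
simplified = re.compile(r"(?=.{8,})(?=.*\d)(?=.*\W)")

def is_acceptable(password: str) -> bool:
    # Hypothetical helper; match() starts every lookahead from position 0.
    return simplified.match(password) is not None

checks = {
    "passw0rd!": is_acceptable("passw0rd!"),  # digit plus punctuation
    "pass w0rd": is_acceptable("pass w0rd"),  # a space satisfies \W
    "pass_w0rd": is_acceptable("pass_w0rd"),  # underscore is a word character
    "password1": is_acceptable("password1"),  # no non-word character at all
    "p4ss!": is_acceptable("p4ss!"),          # shorter than 8 characters
}
print(checks)
```

Note that a plain space counts as the "special character", while an underscore does not, which is probably the opposite of what most password policies want.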

If you left the .* out of the latter two lookaheads, you would have (?=\d) asserting that the first character is a digit, and (?=\W) asserting that the same character is not a word character, and therefore not a digit, so the regex could never match. (Digits are classed as word characters, and \W matches anything that's not a word character.) The .* in each lookahead causes it to initially gobble up the whole string, then backtrack, giving back one character at a time until it reaches a spot where the \d or \W can match. That's how they can match the digit and the special character anywhere in the string.
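Both points can be observed directly. This sketch uses Python's re module as a stand-in for .NET (my assumption) with made-up sample strings; it runs .*\d outside a lookahead so the end of the match is visible.

```python
import re

text = "ab1cd2ef"  # made-up sample

# The greedy .* grabs the whole string, then gives characters back until \d
# can match, so the match ends just after the LAST digit.
greedy = re.match(r".*\d", text)
greedy_end = greedy.end()

# Without .*, both lookaheads test the same character, which cannot be a digit
# and a non-word character at once, so no position in any string matches.
contradictory = re.compile(r"(?=\d)(?=\W)")
never = [contradictory.search(s) for s in ("1abc", "!abc", "a1!")]

print(greedy.group(), greedy_end, never)
```

The first match stops at the "2" rather than the "1", which is the backtracking described above at work.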
