Regular expression ".*[^a-zA-Z0-9_].*"

While reading more about regular expressions in C#, I want to check a conclusion I have reached. In the expression ".*[^a-zA-Z0-9_].*", are the ".*" at the beginning and the end useless? As I understand it, ".*" means zero or more occurrences of any character, while "[^a-zA-Z0-9_]" means any single character other than a letter (upper or lower case), a digit, or an underscore. So it seems that adding ".*" before and after "[^a-zA-Z0-9_]" changes nothing. Is that right?

Here is the code I am using to check whether the expression matches:

// Here we call Regex.Match.
Match match = Regex.Match("anytest#", ".*[^a-zA-Z0-9_].*");
//Match match = Regex.Match("anytest#", "[^a-zA-Z0-9_]");

// Here we check the Match instance.
if (match.Success)
    Console.WriteLine("error");
else
    Console.WriteLine("no error");

The only difference is whether the surrounding characters are included in the match.

For:

ab41--_71j

It will match the whole string:

ab41--_71j

And without the .* at the beginning and end it will match only the first offending character:

-

Any string will match the .*[^a-zA-Z0-9_].* regex as long as it contains at least one character that isn't in a-zA-Z0-9_.
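
To see the difference in what is captured, here is a minimal sketch that runs both patterns against the sample input above. The class name MatchSpanDemo is only illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

class MatchSpanDemo
{
    static void Main()
    {
        const string input = "ab41--_71j";

        // With the surrounding .* the match grows to cover the whole input.
        Match padded = Regex.Match(input, ".*[^a-zA-Z0-9_].*");

        // Without it, only the first character outside the class is matched.
        Match bare = Regex.Match(input, "[^a-zA-Z0-9_]");

        Console.WriteLine($"padded: '{padded.Value}' at index {padded.Index}");
        Console.WriteLine($"bare:   '{bare.Value}' at index {bare.Index}");
        // padded: 'ab41--_71j' at index 0
        // bare:   '-' at index 4
    }
}
```

Both calls report Success, so a check that only looks at match.Success behaves the same with either pattern.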

From your latest comment, I understand that you actually use:

^[a-zA-Z0-9]*$

This matches only if every character is a letter or a digit (an empty string also matches). If it doesn't match, the string is invalid.

If you also want to allow the _ character, then use:

^[a-zA-Z0-9_]*$

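Which can even be shortened to:

^\w*$

In a regular C# string literal the backslash is doubled: "^\\w*$". Note that in .NET, \w also matches Unicode letters and digits unless RegexOptions.ECMAScript is specified, so the two patterns are only equivalent for ASCII input.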

In general, it is better to write a regex that validates strings rather than one that invalidates them. It makes more sense and is more intuitive.

So my validation would look like:

if (Regex.IsMatch("anytest#", "^\\w*$"))
{
    Console.WriteLine("Success");
}
else
{
    Console.WriteLine("Error");
}
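
As a rough comparison of the whitelist patterns discussed here, the sketch below runs the explicit class, \w, and \w with RegexOptions.ECMAScript over a few sample strings. The sample strings and the class name WordClassDemo are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class WordClassDemo
{
    static void Main()
    {
        // Hypothetical sample inputs; "straße" contains a non-ASCII letter.
        string[] samples = { "anytest", "any_test1", "anytest#", "straße", "" };

        foreach (string s in samples)
        {
            // Explicit ASCII class.
            bool explicitClass = Regex.IsMatch(s, "^[a-zA-Z0-9_]*$");

            // \w in .NET is Unicode-aware by default.
            bool unicodeWord = Regex.IsMatch(s, @"^\w*$");

            // With ECMAScript behaviour, \w is limited to [a-zA-Z_0-9].
            bool asciiWord = Regex.IsMatch(s, @"^\w*$", RegexOptions.ECMAScript);

            Console.WriteLine($"'{s}': explicit={explicitClass}, \\w={unicodeWord}, \\w+ECMAScript={asciiWord}");
        }
    }
}
```

With these inputs, "straße" is accepted by the default \w but rejected by the other two patterns, and the empty string is accepted by all three because of the * quantifier. If empty input should be rejected, + can be used instead of *.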

Another option that is probably faster (it needs using System.Linq):

if ("anytest#".All(c => char.IsLetterOrDigit(c) || c == '_'))
{
    Console.WriteLine("Success");
}
else
{
    Console.WriteLine("Error");
}

And if you don't want '_' to be included, it looks even nicer:

if ("anytest#".All(char.IsLetterOrDigit))
{
    Console.WriteLine("Success");
}
else
{
    Console.WriteLine("Error");
}
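
One caveat with the LINQ approach: char.IsLetterOrDigit accepts any Unicode letter or digit, not only a-zA-Z0-9. The sketch below contrasts it with an ASCII-only check. IsAsciiWordChar is a hypothetical helper written for this example, not a framework method.

```csharp
using System;
using System.Linq;

class CharCheckDemo
{
    // Hypothetical helper: ASCII-only equivalent of the class [a-zA-Z0-9_].
    static bool IsAsciiWordChar(char c) =>
        (c >= 'a' && c <= 'z') ||
        (c >= 'A' && c <= 'Z') ||
        (c >= '0' && c <= '9') ||
        c == '_';

    static void Main()
    {
        foreach (string s in new[] { "anytest", "any_test1", "anytest#", "straße" })
        {
            // string implements IEnumerable<char>, so All works on it directly.
            bool unicodeOk = s.All(c => char.IsLetterOrDigit(c) || c == '_');
            bool asciiOk = s.All(IsAsciiWordChar);

            Console.WriteLine($"{s}: IsLetterOrDigit={unicodeOk}, ASCII-only={asciiOk}");
        }
    }
}
```

For "straße" the two checks disagree, which matters if the validation is meant to mirror [a-zA-Z0-9_] exactly.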

No, because there are characters other than a-z, A-Z and 0-9.

That regex matches any string that contains at least one character outside a-zA-Z0-9_, with any number of arbitrary characters before and after it. That includes a string consisting of just one such character.

If you leave out the .* then you just have a regex that matches a single character that is not in a-zA-Z0-9_.

.*[^a-zA-Z0-9_].*  matches for instance: ABC_ß_ABC
[^a-zA-Z0-9_]      matches for instance: ß   (and this regex just matches 1 character)

.*[^a-zA-Z0-9_].* will match the entire input as long as there is a character that is neither alphanumeric nor an underscore somewhere in the input. [^a-zA-Z0-9_] will match only a single such character if one is present; because the engine scans from left to right, that is the first one in the input (greediness plays no role here, since the pattern has no quantifier). Which one you want depends on the input and on what you want to do once you know that such a character exists.

Input 1: ABC_ß_ABC

Input 2: ß

Regex 1: .*[^a-zA-Z0-9_].*

Regex 2: [^a-zA-Z0-9_]

Both inputs match both regexes.

For input 1:

Regex 1 matches 9 characters

Regex 2 matches only 1 character

Only include those tokens in the regex that you are actually looking for. In your case you don't care whether there are any other characters before or after the negated character class you specified. Adding .* before and after it doesn't change whether the match succeeds, but it makes matching more complicated. A regex already matches anywhere in the input unless you specifically anchor it, e.g. by using ^ at the start.
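
A short sketch of that point: for a yes/no check, the padded and the bare pattern agree on every input, and only an anchor changes where the match may occur. HasInvalidChar and the sample strings are hypothetical names and values chosen for this example.

```csharp
using System;
using System.Text.RegularExpressions;

class UnanchoredDemo
{
    // Hypothetical helper: true when the input contains a character outside a-zA-Z0-9_.
    static bool HasInvalidChar(string input) => Regex.IsMatch(input, "[^a-zA-Z0-9_]");

    static void Main()
    {
        foreach (string s in new[] { "anytest#", "any#test", "#anytest", "anytest" })
        {
            bool padded = Regex.IsMatch(s, ".*[^a-zA-Z0-9_].*");
            Console.WriteLine($"{s}: bare={HasInvalidChar(s)}, padded={padded}");
        }

        // Anchoring with ^ restricts the check to the first character only.
        Console.WriteLine(Regex.IsMatch("anytest#", "^[^a-zA-Z0-9_]"));  // False
        Console.WriteLine(Regex.IsMatch("#anytest", "^[^a-zA-Z0-9_]"));  // True
    }
}
```

The first three inputs report True for both patterns and "anytest" reports False for both, so the .* padding adds work without changing the answer.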
