
Regular Expression for no repeating special characters (C#)

I am new to regular expressions and need one to validate an address. The user must not be able to enter repeated special characters such as ..... or ,,,.../// and no special character may appear more than 5 times in the string.

...,,,....// =>No Match
Street no. 40. hello. =>Match

Thanks in advance!
I have tried this:

([a-zA-Z]+|[\s\,\.\/\-]+|[\d]+)|(\(([\da-zA-Z]|[^)^(]+){1,}\))

It matches all alphanumeric characters and some special characters, and does not allow empty brackets.

You can use a negative lookahead, which asserts that what follows must not match. Its format is (?! ... )

For your case you can try something like this:

This will not match the input string if it has 2 or more consecutive dots, commas or slashes (or any combination of them)

(?!.*[.,\/]{2}) ... rest of the regex 

This will not match the input string if it contains 5 or more 'A' characters (use {6} instead of {5} if exactly five should still be allowed).

(?!(.*A.*){5}) ... rest of the regex 
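To see where the {5} quantifier draws the line, here is a small C# sketch. The class and method names are made up for illustration, and the pattern is anchored with ^ and completed with .*$ only so that it can run on its own.

```csharp
using System.Text.RegularExpressions;

static class CountingLookaheadDemo
{
    // Lookahead from the answer, wrapped in ^ ... .*$ so it can be tested alone.
    // (.*A.*){5} can only succeed when at least five 'A' characters are present.
    static readonly Regex FewerThanFiveA = new Regex(@"^(?!(.*A.*){5}).*$");

    // True when the input holds fewer than five 'A' characters.
    public static bool Accepts(string input) => FewerThanFiveA.IsMatch(input);
}
```

Under these assumptions, CountingLookaheadDemo.Accepts("A-A-A-A") should return true, while Accepts("AAAAA") should return false because the fifth 'A' lets the lookahead succeed.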

This will match everything except your restrictions. Replace the last part (.*) with your regex.

^(?!.*[.,\/]{2})(?!(.*\..*){5})(?!(.*,.*){5})(?!(.*\/.*){5}).*$
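As a rough check, the combined pattern can be run against the two sample inputs from the question. This is only a sketch: LookaheadAddressCheck and IsAcceptable are hypothetical names, not part of the answer.

```csharp
using System.Text.RegularExpressions;

static class LookaheadAddressCheck
{
    // Combined pattern from the answer: no two adjacent dots/commas/slashes,
    // and fewer than five dots, five commas or five slashes in total.
    const string Pattern =
        @"^(?!.*[.,\/]{2})(?!(.*\..*){5})(?!(.*,.*){5})(?!(.*\/.*){5}).*$";

    public static bool IsAcceptable(string input) => Regex.IsMatch(input, Pattern);
}
```

With this sketch, IsAcceptable("...,,,....//") should return false (the first lookahead finds two adjacent special characters) and IsAcceptable("Street no. 40. hello.") should return true.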

Note: This regex may not be optimized. It may be faster to loop over the string's characters and count their occurrences.
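The loop-based alternative could look roughly like the following C# sketch. The helper name, the set of special characters (".,/") and the limit of 5 occurrences per character are assumptions taken from the question, not something the answer specifies.

```csharp
using System.Collections.Generic;

static class AddressLoopCheck
{
    // Hypothetical helper: 'specials' and 'maxPerChar' are assumed defaults.
    public static bool IsValidAddress(string input, string specials = ".,/", int maxPerChar = 5)
    {
        var counts = new Dictionary<char, int>();
        bool previousWasSpecial = false;

        foreach (char c in input)
        {
            bool isSpecial = specials.IndexOf(c) >= 0;
            if (isSpecial)
            {
                // Two special characters in a row are not allowed.
                if (previousWasSpecial) return false;

                // Count this character; n is 0 if it has not been seen yet.
                counts.TryGetValue(c, out int n);
                counts[c] = n + 1;
                if (n + 1 > maxPerChar) return false;
            }
            previousWasSpecial = isSpecial;
        }
        return true;
    }
}
```

A single pass like this avoids the backtracking that nested .* groups can cause. Note that with maxPerChar = 5 it allows exactly five occurrences, whereas the {5} lookaheads above already reject the fifth.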

You can use this regex:

^(?![^,./-]*([,./-])\1)(?![^,./-]*([,./-])(?:[^,./-]*\2){4})[ \da-z,./-]+$

In C#:

// Requires: using System.Text.RegularExpressions;
bool foundMatch = Regex.IsMatch(yourString, @"^(?![^,./-]*([,./-])\1)(?![^,./-]*([,./-])(?:[^,./-]*\2){4})[ \da-z,./-]+$", RegexOptions.IgnoreCase);

Explanation

  • The ^ anchor asserts that we are at the beginning of the string
  • The negative lookahead (?![^,./-]*([,./-])\1) asserts that it is not possible to match any number of non-special chars, followed by one special char (captured to Group 1), followed by the same special char (the \1 backreference)
  • The negative lookahead (?![^,./-]*([,./-])(?:[^,./-]*\2){4}) asserts that it is not possible to match any number of non-special chars, followed by one special char (captured to Group 2), then four repetitions of any number of non-special chars followed by that same char from Group 2 (five occurrences in total)
  • The character class [ \da-z,./-]+ matches one or more allowed chars: spaces, digits, letters (any case, because of the IgnoreCase option) and the special chars
  • The $ anchor asserts that we are at the end of the string
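The sketch below runs this pattern in C# and compares it with a variant. Because [^,./-]* cannot step over a special character, both lookaheads start from the first special character in the string; the ScanAll variant, which replaces those prefixes with .*, is an assumption added here for comparison and is not part of the answer. All class and member names are hypothetical.

```csharp
using System.Text.RegularExpressions;

static class BackreferenceAddressCheck
{
    // Pattern exactly as given in the answer.
    const string Original =
        @"^(?![^,./-]*([,./-])\1)(?![^,./-]*([,./-])(?:[^,./-]*\2){4})[ \da-z,./-]+$";

    // Assumed variant (not from the answer): ".*" lets the lookaheads
    // scan past earlier special characters.
    const string ScanAll =
        @"^(?!.*([,./-])\1)(?!.*([,./-])(?:.*\2){4})[ \da-z,./-]+$";

    public static bool MatchesOriginal(string s) =>
        Regex.IsMatch(s, Original, RegexOptions.IgnoreCase);

    public static bool MatchesScanAll(string s) =>
        Regex.IsMatch(s, ScanAll, RegexOptions.IgnoreCase);
}
```

Both methods should reject "...,,,....//" and accept "Street no. 40. hello.". They differ on an input such as "a.b,,c": MatchesOriginal returns true because the doubled comma comes after the first special character, while MatchesScanAll returns false.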

A regular expression string to detect invalid strings is:

[^\w \-\r\n]{2}|(?:[\w \-]+[^\w \-\r\n]){5}

As C# string literal (regular and verbatim):

"[^\\w \\-\\r\\n]{2}|(?:[\\w \\-]+[^\\w \\-\\r\\n]){5}"

@"[^\w \-\r\n]{2}|(?:[\w \-]+[^\w \-\r\n]){5}"

It is much easier to find a string than to validate if a string does not contain ...

With this expression you can check whether the string entered by the user is invalid, either because 2 special characters appear in sequence or because 5 special characters are used in the string.
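A minimal C# sketch of this "detect the invalid case" approach follows. InvalidAddressDetector and IsInvalid are hypothetical names; the pattern is the verbatim-string version shown above.

```csharp
using System.Text.RegularExpressions;

static class InvalidAddressDetector
{
    // Matches two special characters in a row, or five runs of
    // "standard characters followed by one special character" back to back.
    static readonly Regex Invalid =
        new Regex(@"[^\w \-\r\n]{2}|(?:[\w \-]+[^\w \-\r\n]){5}");

    // True when the input breaks one of the two rules.
    public static bool IsInvalid(string input) => Invalid.IsMatch(input);
}
```

Here IsInvalid("...,,,....//") should return true and IsInvalid("Street no. 40. hello.") should return false, so the caller accepts the input when the method returns false.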

Explanation:

[^ ... ] ... a negated character class, which matches any character that is NOT one of the characters listed within the square brackets.

\w ... a word character which is either a letter, a digit or an underscore.

The next character is simply a space character.

\- ... the hyphen character which must be escaped with a backslash within square brackets as otherwise the hyphen character would be interpreted as "FROM x TO z" (except when being the first or the last character within the square brackets).

\r ... carriage return

\n ... line-feed

Therefore [^\w \-\r\n] finds a character which is NOT a letter, NOT a digit, NOT an underscore, NOT a space, NOT a hyphen, NOT a carriage return and also NOT a line-feed.

{2} ... the preceding expression must match 2 such characters.

So with the expression [^\w \-\r\n]{2} it can be checked if the string contains 2 special characters in a sequence which makes the string invalid.

| ... OR

(?: ... ) ... a non-capturing group, needed here so that the quantifier {5} applies to the whole expression inside, which must then match 5 times in a row.

[ ... ] ... a positive character class definition which matches any character being one of the characters listed within the square brackets.

[\w \-]+ ... find a word character, or a space, or a hyphen 1 or more times.

[^\w \-\r\n] ... and next character being NOT a word character, space, hyphen, carriage return or line-feed.

Therefore (?:[\w \-]+[^\w \-\r\n]){5} finds a string with 5 "special" characters between "standard" characters.
