
Disallow specific special characters in regex

I have the following regex:

[\u00BF-\u1FFF\u2C00-\uD7FF\w \&quot;"",.()/-<br\s/?>]+$

It allows characters from any language but rejects special characters such as # and * (a few special characters are deliberately allowed, as the regex above shows).

However, my regex also allows unwanted special characters such as <, > and &.

How should I modify this regex to disallow these characters in the input string?

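You need to use alternation for some parts of the regex: inside a character class, <br\s/?> is treated as the separate characters <, b, r and so on, and the same applies to &quot;. In addition, /-< creates a range from / to < that accepts more characters than you might expect: /, the digits 0-9, :, ; and <.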
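To make the problem concrete, here is a minimal sketch that probes the character class from the question one character at a time. It uses Python's re module as a convenient stand-in rather than .NET, and the names ORIGINAL_CLASS, accepted_ascii and slash_to_less_than are made up for this illustration.

```python
import re

# The character class from the question, as a Python raw string.
# Inside [...] the sequences &quot; and <br\s/?> are only individual characters.
ORIGINAL_CLASS = re.compile(
    r'[\u00BF-\u1FFF\u2C00-\uD7FF\w \&quot;"",.()/-<br\s/?>]'
)


def accepted_ascii(pattern):
    """Return the printable ASCII punctuation characters the class accepts."""
    return [
        chr(code)
        for code in range(0x21, 0x7F)
        if not chr(code).isalnum() and pattern.fullmatch(chr(code))
    ]


# Characters covered by the accidental range /-< (U+002F through U+003C).
slash_to_less_than = [chr(code) for code in range(ord('/'), ord('<') + 1)]

print(accepted_ascii(ORIGINAL_CLASS))
print(''.join(slash_to_less_than))
```

The first list should contain <, > and &, which is exactly what the question complains about, while # and * stay excluded.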

Thus, I suggest using

^(?:[\u00BF-\u1FFF\u2C00-\uD7FF\w ",.()/:;-]|&quot;|<br\s?/?>)+$

In C#, using a verbatim string literal:

@"^(?:[\u00BF-\u1FFF\u2C00-\uD7FF\w "",.()/:;-]|&quot;|<br\s?/?>)+$"

See demo on regexstorm
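As a rough offline check, the suggested pattern can also be tried in Python, whose re module accepts the same syntax for the constructs used here. This is only a sketch: the sample strings and the name SUGGESTED are invented for illustration, and .NET and Python are not identical in every detail.

```python
import re

# The suggested pattern, unchanged apart from Python raw-string quoting.
SUGGESTED = re.compile(
    r'^(?:[\u00BF-\u1FFF\u2C00-\uD7FF\w ",.()/:;-]|&quot;|<br\s?/?>)+$'
)

# Hypothetical inputs: the first three use only allowed items,
# the last three each contain a character that should be rejected.
samples = [
    'Hello, world (v1.0)',
    'Line one<br/>Line two',
    'She said &quot;hi&quot;',
    'a < b',
    'Tom & Jerry',
    'price #1*',
]

for text in samples:
    print(repr(text), bool(SUGGESTED.search(text)))
```

The bare < and & are rejected because they are only accepted as part of a complete <br> tag or &quot; entity.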

I am assuming you need to match any of these three "entities", or any combination of them:

  • [\u00BF-\u1FFF\u2C00-\uD7FF\w ",.()/:;-] - The character ranges \u00BF-\u1FFF and \u2C00-\uD7FF, \w, a space, a double quote, a comma, ., (, ), /, :, ; and a literal hyphen
  • &quot; - A literal &quot;
  • <br\s?/?> - <br> tags (this matches <br>, <br/> and <br />).
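The third alternative can be checked on its own. The sketch below (Python, with an invented name BR_TAG and made-up sample tags) shows which spellings it accepts. Because \s? allows at most one whitespace character, a tag with two spaces before the slash is not matched.

```python
import re

# The <br> alternative from the suggested pattern, tested in isolation.
BR_TAG = re.compile(r'<br\s?/?>')

# fullmatch requires the whole candidate to be a <br> tag.
for tag in ['<br>', '<br/>', '<br />', '<br  />', '<b>', '</br>']:
    print(repr(tag), bool(BR_TAG.fullmatch(tag)))
```

If inputs with extra whitespace inside the tag need to be accepted, \s? could be widened to \s*, which is a change beyond what the answer proposes.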

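^ and $ anchor the match to the start and end of the string, so the whole input must consist of allowed items. Without ^, a pattern ending in +$ only needs to match a suffix of the input.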
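The effect of the leading anchor is easy to see with a deliberately simplified class. In this Python sketch, SIMPLE_CLASS, unanchored and anchored are hypothetical names and the class is much smaller than the real one. re.search looks for a match anywhere in the input, which is also how Regex.IsMatch behaves in .NET.

```python
import re

# A simplified, hypothetical character class: word characters, space, comma, dot.
SIMPLE_CLASS = r'[\w ,.]'

unanchored = re.compile(SIMPLE_CLASS + r'+$')        # end anchor only
anchored = re.compile(r'^' + SIMPLE_CLASS + r'+$')   # both anchors

text = '<script> ok'

# Without ^ the trailing " ok" is enough for a match,
# so the disallowed characters at the front go unnoticed.
print(bool(unanchored.search(text)))

# With ^ every character from the start must be allowed.
print(bool(anchored.search(text)))
```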

