I have the following regex:
[\u00BF-\u1FFF\u2C00-\uD7FF\w \""",.()/-<br\s/?>]+$
It allows characters of any language except special characters like `#`, `*`, etc. (although some special characters are allowed, as you can see in the regex above).
However, my regex also allows unwanted special characters like `<`, `>` and `&`.
How should I modify this regex to disallow these characters in the input string?
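You need to use alternation for some of the regex parts: inside a character class, `<br\s/?>` is treated as separate characters (`<`, `b`, etc.) rather than as a tag, and `/-<` creates a range that accepts many more characters than you might think. It covers everything from `/` (U+002F) to `<` (U+003C), that is `/`, the digits `0` to `9`, `:`, `;` and `<`.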
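The width of that range is easy to check. The following is a minimal sketch in Python (chosen here only for illustration; the character-class range syntax is the same as in .NET) that lists the code points between `/` and `<` and shows which characters of a small sample a `[/-<]` class accepts. The variable names and the sample string are made up for this example.

```python
import re

# Every code point from '/' (U+002F) through '<' (U+003C), inclusive.
range_chars = [chr(cp) for cp in range(ord("/"), ord("<") + 1)]
print("".join(range_chars))  # /0123456789:;<

# A character class that contains only this range.
slash_to_lt = re.compile(r"[/-<]")

# Check a few single characters: '>' and '-' fall outside the range,
# while digits, ':' and ';' are accepted along with '/' and '<'.
accepted = [ch for ch in "/5:;<>-" if slash_to_lt.fullmatch(ch)]
print(accepted)  # ['/', '5', ':', ';', '<']
```

So the hyphen in `/-<` is not matched literally at all; to match a literal `-`, place it at the end of the class (or escape it).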
Thus, I suggest using
^(?:[\u00BF-\u1FFF\u2C00-\uD7FF\w ",.()/:;-]|&quot;|<br\s?/?>)+$
In C#, using a verbatim string literal:
@"^(?:[\u00BF-\u1FFF\u2C00-\uD7FF\w "",.()/:;-]|&quot;|<br\s?/?>)+$"
I am assuming you need to match any of the 3 "entities" or their combinations:
- `[\u00BF-\u1FFF\u2C00-\uD7FF\w ",.()/:;-]` - the character ranges `\u00BF-\u1FFF` and `\u2C00-\uD7FF`, `\w`, a space, a double quote, `,`, `.`, `(`, `)`, `/`, `:`, `;` and a literal hyphen
- `&quot;` - a literal `&quot;`
- `<br\s?/?>` - `<br>` tags (this can match `<br>`, `<br/>` and `<br />`).
`^` and `$` anchor the match at the beginning and end of the string, so the whole input must consist of these parts.
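To see how the suggested pattern behaves, here is a small sketch in Python rather than C# (the `\uXXXX` escapes, `\w` and the alternation work the same way for these inputs). It assumes the middle alternative is the HTML entity `&quot;`; the helper name `is_allowed` and the sample strings are hypothetical and only serve the illustration.

```python
import re

# Suggested pattern; the middle alternative is ASSUMED to be the entity &quot;
SUGGESTED = re.compile(
    r'^(?:[\u00BF-\u1FFF\u2C00-\uD7FF\w ",.()/:;-]|&quot;|<br\s?/?>)+$'
)

def is_allowed(text: str) -> bool:
    """Return True when the whole string consists only of allowed parts."""
    return SUGGESTED.search(text) is not None

# Hypothetical sample inputs, not taken from the original question.
samples = [
    "Hello, world.",           # plain text with allowed punctuation
    "Привет (мир)",            # non-Latin letters and parentheses
    "line one<br />line two",  # a whole <br /> tag
    "say &quot;hi&quot;",      # the &quot; entity
    "a < b",                   # a lone '<' is not part of a <br> tag
    "Tom & Jerry",             # a bare '&' is not the &quot; entity
    "100% #1",                 # '%' and '#' are not in the class
]

for sample in samples:
    print(f"{sample!r}: {is_allowed(sample)}")
```

The first four samples are accepted and the last three are rejected, because `<` and `&` can now only appear as part of a complete `<br>` tag or `&quot;` entity.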