I am writing an application whose logic takes a vocabulary word or phrase as an input parameter. I am having trouble writing validation logic for this parameter's value.
Following are the rules I've come up with:
A few examples (in 3 languages):
// match:
one two three four
one-two-three-four
one-two-three four
vær så snill
тест регекс
re-read
under the hood
ONe
rabbit's lair
// not-match:
one two three four five
one two three four@
one-two-three-four five
rabbit"s lair
one' two's
one1
1900
Given the expected results above, could someone point me in the right direction on how to create a validation rule like that? If it matters, I will be writing the validation logic in C#, so I have more tools than just Regex at my disposal.
If it helps, I have been testing several solutions, such as ^[\p{Ll}\p{Lt}]+$ and (?=\S*['-])([a-zA-Z'-]+)$. The first regex does a great job of allowing just the letters I need (English, Norwegian and Russian), whereas the second makes good use of the lookahead concept.
\p{Ll} or \p{Lowercase_Letter}: a lowercase letter that has an uppercase variant.
\p{Lu} or \p{Uppercase_Letter}: an uppercase letter that has a lowercase variant.
\p{Lt} or \p{Titlecase_Letter}: a letter that appears at the start of a word when only the first letter of the word is capitalized.
\p{L&} or \p{Letter&}: a letter that exists in lowercase and uppercase variants (combination of Ll, Lu and Lt).
\p{Lm} or \p{Modifier_Letter}: a special character that is used like a letter.
\p{Lo} or \p{Other_Letter}: a letter or ideograph that does not have lowercase and uppercase variants.
Needless to say, neither of the solutions I have been testing takes into account all the rules I defined above.
You can use
\A(?!(?:[^']*'){2})\p{L}+(?:[\s'-]\p{L}+){0,3}\z
See the regex demo. Details:
\A - start of string
(?!(?:[^']*'){2}) - the string cannot contain two apostrophes
\p{L}+ - one or more Unicode letters
(?:[\s'-]\p{L}+){0,3} - zero to three occurrences of
    [\s'-] - a whitespace, ' or - char
    \p{L}+ - one or more Unicode letters
\z - the very end of string.
In C#, you can use it as
var IsValid = Regex.IsMatch(text, @"\A(?!(?:[^']*'){2})\p{L}+(?:[\s'-]\p{L}+){0,3}\z");
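To check the pattern against the examples from the question, here is a minimal sketch of a small console program. The names PhraseValidator and IsValid are made up for illustration and are not part of any library; only the pattern itself comes from the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PhraseValidator
{
    // Pattern from the answer: 1 to 4 runs of Unicode letters, each pair
    // joined by exactly one whitespace, apostrophe or hyphen, and no more
    // than one apostrophe in the whole string.
    private static readonly Regex Pattern = new Regex(
        @"\A(?!(?:[^']*'){2})\p{L}+(?:[\s'-]\p{L}+){0,3}\z");

    // Hypothetical helper name; null is treated as invalid input.
    public static bool IsValid(string text) =>
        text != null && Pattern.IsMatch(text);
}

public static class Program
{
    public static void Main()
    {
        // A few of the "match" and "not-match" samples from the question.
        string[] samples =
        {
            "one two three four", "one-two-three four", "vær så snill",
            "тест регекс", "rabbit's lair",
            "one two three four five", "rabbit\"s lair", "one' two's", "1900"
        };

        foreach (var sample in samples)
        {
            Console.WriteLine($"{sample} -> {PhraseValidator.IsValid(sample)}");
        }
    }
}
```

The first five samples should print True and the last four False. Keeping the Regex in a static readonly field avoids re-parsing the pattern on every call. Note that \s also accepts tabs and line breaks as separators; if only a plain space should be allowed, replacing \s with a literal space inside the character class is one option.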