Hi guys, I have the following regex:
/([A-Z][\w-]*(\s+[A-Z][\w-]*)+)/
I've tried different approaches, but I'm not a pro with regex, so this is what I want to do:
Can you help me do that? (I should use a different regex, am I right?)
In order to help you understand, this is what you have:
[A-Z] : one character in the class A-Z
[\w-]* : a sequence of zero or more word characters or hyphens
(...)+ : one or more repetitions of:
\s+ : at least one whitespace character
[A-Z] : one character in the class A-Z
[\w-]* : a sequence of zero or more word characters or hyphens

This is what you want:
[A-Z] : a capital letter
[\w-]* : a sequence of zero or more word characters or hyphens
\s+ : at least one whitespace character
[a-z] : a lower-case letter
[\w-]* : a sequence of zero or more word characters or hyphens
\s+ : at least one whitespace character
[A-Z] : a capital letter
[\w-]* : a sequence of zero or more word characters or hyphens

That is:
[A-Z][\w-]*\s+[a-z][\w-]*\s+[A-Z][\w-]*
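As a quick sanity check, here is a minimal sketch in Python using the standard re module. The sample sentence and the variable names are my own, not from the question, so treat it as an illustration rather than the exact setup you have:

```python
import re

# The three-word pattern from above: Capitalized, lower-case, Capitalized.
pattern = re.compile(r"[A-Z][\w-]*\s+[a-z][\w-]*\s+[A-Z][\w-]*")

# Hypothetical sample text, chosen only for illustration.
text = "She studied at the Institute of Technology last year."

# No capturing groups in the pattern, so findall returns the full matches.
matches = pattern.findall(text)
print(matches)
```

"She studied at" is skipped because the third word does not start with a capital letter.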
You may want to make some small changes. I think you can do them on your own.
A rule that matches only words of 3+ characters is \w{3,}. If you want the first character to be a capital letter, use [A-Z]\w{2,}.
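Both rules can be tried out with a small Python sketch. The sample string is made up, and the \b word boundaries are my own addition (an assumption, not part of the rules as stated) so that a match cannot start in the middle of a longer word:

```python
import re

# Hypothetical sample text for illustration.
text = "An Ox ate All of the Hay"

# Words of at least 3 word characters.
any_word = re.findall(r"\b\w{3,}\b", text)

# Words of at least 3 characters that start with a capital letter.
capitalized = re.findall(r"\b[A-Z]\w{2,}\b", text)

print(any_word)
print(capitalized)
```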
(\w\w\w+)|(\w+ [a-z]+ \w+)
- This regex matches either a word of at least 3 word characters OR a three-word phrase: one or more word characters, a space, one or more lower-case letters, a space, and one or more word characters. You can replace \w with [A-Z] if necessary. If your 3-word phrase has to have 2 words starting with capital letters, change the second group to ([A-Z]\w* [a-z]+ [A-Z]\w*). Try it here: https://regex101.com/r/E3IPTj/1
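One thing worth checking with this alternation is the order of the two branches. In Python's re module (and, as far as I know, other backtracking engines) alternatives are tried left to right, so the single-word branch can win before the phrase branch is ever tried. A minimal sketch, with a swapped-order variant that is my own suggestion rather than part of the answer:

```python
import re

text = "Institute of Technology"

# Original order: the single-word branch comes first.
word_first = re.compile(r"(\w\w\w+)|(\w+ [a-z]+ \w+)")

# Hypothetical variant with the phrase branch first.
phrase_first = re.compile(r"(\w+ [a-z]+ \w+)|(\w\w\w+)")

word_first_match = word_first.search(text).group(0)
phrase_first_match = phrase_first.search(text).group(0)

print(word_first_match)
print(phrase_first_match)
```

With the original order only the first word is matched at that position, so if you want the whole phrase, putting the phrase branch first may be the safer choice.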
Not sure about the scope of your limitations, but a few 'building blocks' might help. I'd also suggest just starting at the beginning. I don't know any recent websites that teach regex well, but when I started I used the following: http://www.regular-expressions.info/tutorial.html (it's been many years, and the website does show its age, so to speak).
Following your example: Institute of Technology
You need to know just a few things: character sets (and how to control the match length) and the space.
Character sets match a single character (by default) and are written like, for example, [abc], which will match a, b, or c. They also support character ranges (a-z) and shorthand classes (e.g. \d for all digits). The match length can be changed by using:
+ - one or more (examples: a+, [abc]+, \d+)
* - zero or more (examples: a*, [abc]*)
And this one you might want, but that's up to you:
{min,max} - specific range, e.g. b{3,5} will match 3-5 consecutive 'b' characters (bbb, bbbb, bbbbb). max can be omitted, as in {min,}, to require at least min chars with no upper limit.
Spaces are matched using " " (a literal space). \s matches any whitespace character (equal to [\r\n\t\f\v ]), i.e. spaces, tabs, newlines, ...
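To make these building blocks concrete, here is a small Python sketch with one line per quantifier. The input strings and variable names are invented for illustration:

```python
import re

# + : one or more digits in a row.
plus_runs = re.findall(r"\d+", "a1 b22 c333")

# * : zero or more, so "a" alone still matches "ab*" in full.
star_ok = re.fullmatch(r"ab*", "a") is not None

# {min,max} : 3 to 5 consecutive 'b' characters (no space inside the braces).
ranged = re.findall(r"b{3,5}", "bb bbb bbbbbbb")

# \s+ : split on any run of whitespace (space, tab, newline).
parts = re.split(r"\s+", "one two\tthree\nfour")

print(plus_runs, star_ok, ranged, parts)
```

Note that the run of seven b's only contributes one match of five: the quantifier is greedy and the two characters left over are too few to match again.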
In your example it's a matter of whether the match is case sensitive or not. If it's not case sensitive, we can use a simple [A-Za-z]+
to match one or more upper- or lower-case letters a-z. Together with the space we get something along the lines of
/[A-Za-z]+ [A-Za-z]+ [A-Za-z]+/
It's that simple. For case-insensitive matching there is also an option flag, i,
which results in
/[a-z]+ [a-z]+ [a-z]+/i
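For comparison, the same two approaches in Python, where the i flag is spelled re.IGNORECASE. The mixed-case sample string is my own:

```python
import re

# Hypothetical input with mixed casing.
text = "institute OF Technology"

# Explicit upper- and lower-case ranges.
explicit = re.search(r"[A-Za-z]+ [A-Za-z]+ [A-Za-z]+", text)

# Lower-case ranges plus the case-insensitive flag.
flagged = re.search(r"[a-z]+ [a-z]+ [a-z]+", text, re.IGNORECASE)

print(explicit.group(0))
print(flagged.group(0))
```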
If you do want case-sensitive matching, you will need to separate the upper- and lower-case parts however you like:
/[A-Z][a-z]* [a-z]+ [A-Z][a-z]*/ // (*A a A*)
As a small change I've also changed + into * after each capital letter, so the lower-case part of those words is not required. Again, up to you.
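Here is a short Python sketch of the case-sensitive pattern against a few made-up inputs, including a single-letter case that only passes because of the * quantifier:

```python
import re

# Capitalized word, lower-case word, Capitalized word.
pattern = re.compile(r"[A-Z][a-z]* [a-z]+ [A-Z][a-z]*")

# Hypothetical samples for illustration.
samples = ["Institute of Technology", "institute of technology", "A b C"]

results = [pattern.search(s) is not None for s in samples]
print(results)
```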
Also note that to match the beginning of a string you're required to use ^, and to match the end of a string use $. The above examples will match any segment, not the whole input, e.g. qhg8Institute of Technology8tghagus would still match. The anchored versions look like this:
/^[A-Z][a-z]* [a-z]+ [A-Z][a-z]*$/ // case sensitive (Aa a Aa)
/^[a-z]+ [a-z]+ [a-z]+$/i // case insensitive
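The difference the anchors make can be seen in a small Python sketch. The variable names are mine; the noisy input is the one from the example above:

```python
import re

unanchored = re.compile(r"[A-Z][a-z]* [a-z]+ [A-Z][a-z]*")
anchored = re.compile(r"^[A-Z][a-z]* [a-z]+ [A-Z][a-z]*$")

noisy = "qhg8Institute of Technology8tghagus"
clean = "Institute of Technology"

# Without anchors the phrase is found inside the surrounding junk.
found_in_noisy = unanchored.search(noisy).group(0)

# With anchors the whole input has to be the phrase.
anchored_on_noisy = anchored.search(noisy)
anchored_on_clean = anchored.search(clean)

print(found_in_noisy, anchored_on_noisy, anchored_on_clean is not None)
```

In Python specifically, $ also matches just before a trailing newline, so re.fullmatch may be a stricter alternative if that matters for your input.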
Obviously there is a lot more to learn that can be used to expand or optimize this, but regexes are so customizable that it's really up to the person needing them to specify their limitations and requirements.
As a side note, I noticed people using \w for word chars, but this also includes digits, _, and (depending on the engine and its Unicode settings) special language letters like à, ü, etc. Again, up to you what to do with this.
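How much \w accepts depends on the engine. As one data point, here is a sketch of how Python 3 behaves, where string patterns are Unicode-aware by default and the re.ASCII flag narrows \w back to [a-zA-Z0-9_]. The sample words are my own:

```python
import re

# Hypothetical samples: digits, an underscore, and an accented letter.
samples = ["abc123", "snake_case", "über"]

# Default (Unicode) \w accepts all three in Python 3.
unicode_word = [re.fullmatch(r"\w+", s) is not None for s in samples]

# With re.ASCII, \w no longer matches the accented letter.
ascii_word = [re.fullmatch(r"\w+", s, re.ASCII) is not None for s in samples]

# A plain letter class rejects digits, underscores and accents alike.
letters_only = [re.fullmatch(r"[A-Za-z]+", s) is not None for s in samples]

print(unicode_word, ascii_word, letters_only)
```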