Assume I have a string that consists of multiple words. These words aren't separated by spaces, but every word starts with a capital letter. This type of naming convention is usually called "camel case". Some examples:
Now I want to split these strings into single words, so that, for example, FirstNumberAfterACharacter becomes ["First", "Number", "After", "A", "Character"].
Finding a regular expression that matches those strings is also quite easy: ^([A-Z][a-z]*)+$. But if I try to get all matches, this regular expression only returns the last match:
irb(main):003:0> /^([A-Z][a-z]*)+$/.match('FirstNumberAfterACharacter').captures
=> ["Character"]
irb(main):004:0> 'FirstNumberAfterACharacter'.scan(/^([A-Z][a-z]*)+$/)
=> [["Character"]]
So how do I get all matches, not just the last one?
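I changed your regexp so that it starts with a group (...) that consists of a single capital letter, [A-Z]{1}, followed by zero or more characters that are not capital letters, [^A-Z]*. The anchors and the outer repetition are gone, so each word is a separate match.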
'FirstNumberAfterACharacter'.scan(/([A-Z][^A-Z]*)/).flatten(1)
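To make the difference concrete, here is a minimal sketch comparing the two patterns on the string from the question. The variable names are just for illustration. A repeated capture group only retains the text from its last iteration, which is why the anchored pattern yields a single word, while scan with an unanchored, unrepeated pattern reports every word as its own match.

```ruby
str = 'FirstNumberAfterACharacter'

# The group is repeated by the outer +, so it keeps only its final iteration.
anchored = str.match(/^([A-Z][a-z]*)+$/).captures

# No anchors, no outer repetition: scan returns one match per word.
# With no capture group in the pattern, scan already returns a flat array.
words = str.scan(/[A-Z][^A-Z]*/)

p anchored # => ["Character"]
p words    # => ["First", "Number", "After", "A", "Character"]
```

Leaving the parentheses out of the pattern also makes the flatten call unnecessary.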
You can use a regex that extracts any Unicode uppercase letter followed by any characters that are not uppercase letters:
'FirstNumberAfterACharacter'.scan(/\p{Lu}\P{Lu}*/)
# => ["First", "Number", "After", "A", "Character"]
See the Ruby online demo.
Details:
\p{Lu} - any Unicode uppercase letter
\P{Lu}* - zero or more (*) characters other than Unicode uppercase letters.
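The Unicode version only behaves differently from [A-Z] when a word starts with a non-ASCII capital. The following is a small sketch using a made-up input string (not from the question), written with \u escapes so it does not depend on the source file's encoding:

```ruby
# Hypothetical input: "ÉcoleDeParis", where the first capital is non-ASCII.
str = "\u00C9coleDeParis"

# [A-Z] never matches "É", so scan skips ahead to the first ASCII capital.
ascii_words = str.scan(/[A-Z][^A-Z]*/)

# \p{Lu} matches any Unicode uppercase letter, including "É".
unicode_words = str.scan(/\p{Lu}\P{Lu}*/)

p ascii_words   # => ["De", "Paris"]
p unicode_words # => ["École", "De", "Paris"]
```

If the input is known to be plain ASCII, both patterns give the same result.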