
Ruby Split string at character difference using regex

I'm currently working on a problem that involves splitting a string into groups of consecutive identical characters.

For example,

"111223334456777" #=> ['111','22','333','44','5','6','777']

The way I'm currently doing it is to use an enumerator, compare each character with the previous one, and split the string that way.

res = []
str = "111223334456777"
group = str[0]
(1...str.length).each do |i|
  if str[i] != str[i-1]
    res << group
    group = str[i]
  else
    group << str[i]
  end
end
res << group
res #=> ['111','22','333','44','5','6','777']

I want to see if I can use a regex to do this, which would make the process a lot easier. I understand I could just put this block of code in a method, but I'm curious whether a regex can be used here.

So what I want to do is

str.split(/some regex/)

to produce the same result. I thought about positive lookahead, but I can't figure out how to have regex recognize that the character is different.

Does anyone have an idea if this is possible?

str = "111333224456777"

str.scan(/0+|1+|2+|3+|4+|5+|6+|7+|8+|9+/)
  #=> ["111", "333", "22", "44", "5", "6", "777"]

or

str.scan(/((\d)\2*)/).map(&:first)
  #=> ["111", "333", "22", "44", "5", "6", "777"] 

Readers: can the latter be simplified?
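
One possible simplification, offered as a sketch rather than a definitive answer: when String#gsub is called with only a pattern (no replacement and no block), it returns an enumerator over the full matches, so the outer capture group and the map step are no longer needed. The variable name runs is only illustrative.

```ruby
str = "111333224456777"

# gsub with only a pattern returns an Enumerator that yields each whole match,
# so a single capture group (for the backreference) is enough.
runs = str.gsub(/(\d)\1*/).to_a

p runs
  #=> ["111", "333", "22", "44", "5", "6", "777"]
```

Using (.) instead of (\d) would group runs of any character, not just digits.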

The chunk_while method is what you're looking for here:

str.chars.chunk_while { |b,a| b == a }.map(&:join)

That starts a new chunk wherever the current character a doesn't match the previous character b. If you want to restrict this to digits only, you can pre-process the string first.
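
A minimal sketch of such pre-processing, assuming "restrict to just numbers" means dropping every non-digit character before grouping. The sample input and the names digits_only and groups are made up for illustration.

```ruby
str = "11a22b2-333"

# String#delete with a negated character set removes everything except 0-9.
digits_only = str.delete("^0-9")

# Group consecutive equal characters, then join each chunk back into a string.
groups = digits_only.chars.chunk_while { |prev, cur| prev == cur }.map(&:join)

p groups
  #=> ["11", "222", "333"]
```

Note that removing the separators first lets runs on either side of a deleted character merge ("22" and "2" become "222" here). If that is not wanted, chunk first and filter afterwards.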

There are a lot of very handy methods in Enumerable that are worth exploring, and each new version of Ruby seems to add more of them.

Another option uses the group_by method, which returns a hash with each distinct character as a key and an array of its occurrences as the value.

"111223334456777".split('').group_by { |i| i }.values.map(&:join)
  #=> ["111", "22", "333", "44", "5", "6", "777"]

Although it doesn't use a regex, someone else may find it useful. Note that group_by collects equal characters regardless of position, so it matches the other answers only when each character occurs in a single run.
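
A small sketch of how group_by differs from run-based grouping when equal characters are not adjacent. The input "1121" and the variable names are made up for illustration.

```ruby
str = "1121"

# group_by collects every equal character into one bucket, adjacent or not.
by_value = str.chars.group_by { |c| c }.values.map(&:join)

# chunk_while only extends a chunk while neighbouring characters are equal.
by_run = str.chars.chunk_while { |prev, cur| prev == cur }.map(&:join)

p by_value  #=> ["111", "2"]
p by_run    #=> ["11", "2", "1"]
```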
