Ruby: Split string at character difference using regex
I'm currently working on a problem that involves splitting a string into groups of identical consecutive characters.
For example,
"111223334456777" #=> ['111','22','333','44','5','6','777']
The way I am currently doing it is to iterate over the indices, compare each character with the previous one, and build up the groups that way.
res = []
str = "111223334456777"
group = str[0]
(1...str.length).each do |i|
  if str[i] != str[i-1]
    res << group
    group = str[i]
  else
    group << str[i]
  end
end
res << group
res #=> ['111','22','333','44','5','6','777']
I want to see if I can use a regex to do this, which would make the process a lot easier. I understand I could just put this block of code in a method, but I'm curious whether a regex can be used here.
So what I want to do is
str.split(/some regex/)
to produce the same result. I thought about a positive lookahead, but I can't figure out how to make the regex recognize that the next character is different.
Does anyone know if this is possible?
str = "111333224456777"
str.scan /0+|1+|2+|3+|4+|5+|6+|7+|8+|9+/
#=> ["111", "333", "22", "44", "5", "6", "777"]
or
str.scan(/((\d)\2*)/).map(&:first)
#=> ["111", "333", "22", "44", "5", "6", "777"]
Readers: can the latter be simplified?
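One possible simplification, offered as a sketch rather than as the answerer's own solution: String#gsub called with only a pattern (no replacement and no block) returns an Enumerator over the full matches, so the outer capture group and the map step are not needed. The variable name runs is illustrative.

```ruby
str = "111333224456777"

# gsub with a pattern but no replacement or block returns an Enumerator
# that yields each full match, so a single capture group is enough.
runs = str.gsub(/(\d)\1*/).to_a

p runs
```

Because the pattern uses \d, non-digit characters are skipped; replacing \d with . would group any character except a newline.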
The chunk_while method is what you're looking for here:
str.chars.chunk_while { |b,a| b == a }.map(&:join)
That will start a new group wherever the current character a doesn't match the previous character b. If you want to restrict it to just numbers, you can do some pre-processing.
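As a rough sketch of what that pre-processing might look like, the example below wraps the chunk_while call in a helper and strips non-digits with String#delete first. The helper name runs_of and the sample input are assumptions made for illustration; they are not part of the original answer.

```ruby
# Hypothetical helper: split a string into runs of identical characters.
def runs_of(str)
  str.each_char.chunk_while { |prev, cur| prev == cur }.map(&:join)
end

mixed = "11a22bb3"

# One way to pre-process: delete every character that is not 0-9.
digits_only = mixed.delete("^0-9")

p runs_of(mixed)
p runs_of(digits_only)
```

Note that deleting characters before chunking joins digit runs that were separated only by non-digits (for example "1a1" becomes "11"), so whether this is the right pre-processing depends on the intended result.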
There are a lot of very handy methods in Enumerable that are worth exploring, and each new version of Ruby seems to add more of them.
Another option uses the group_by method, which returns a hash with each distinct character as a key and an array of its occurrences as the value.
"111223334456777".split('').group_by { |i| i }.values.map(&:join)
#=> ["111", "22", "333", "44", "5", "6", "777"]
Although it doesn't use a regex, someone else may find it useful. Note that group_by collects equal characters regardless of position, so it gives the expected result only when each character appears in a single contiguous run.
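A small sketch comparing group_by with chunk_while on an input where a digit reappears later in the string. The input "1121" is an assumed example, not taken from the question.

```ruby
str = "1121"

# group_by keys on the character's value, so both runs of "1" are merged.
grouped = str.chars.group_by { |c| c }.values.map(&:join)

# chunk_while only compares neighbours, so the runs stay separate.
chunked = str.chars.chunk_while { |a, b| a == b }.map(&:join)

p grouped  # ["111", "2"]
p chunked  # ["11", "2", "1"]
```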