
Split a string with multiple delimiters in Ruby

For instance, I have a string like this:

options = "Cake or pie, ice cream, or pudding"

I want to be able to split the string on three delimiters: " or ", ", ", and ", or ".

I have managed to do it, but only by first splitting on ", " and ", or ", then splitting each array item at " or ", and flattening the resulting array afterwards, like so:

options = options.split(/(?:\s?or\s)*([^,]+)(?:,\s*)*/).reject(&:empty?);
options.each_index {|index| options[index] = options[index].sub("?","").split(" or "); }

The resulting array is: ["Cake", "pie", "ice cream", "pudding"]

Is there a more efficient (or easier) way to split my string on those three delimiters?

What about the following:

options.gsub(/ or /i, ",").split(",").map(&:strip).reject(&:empty?)
  • replaces every delimiter except , with ,
  • splits the result at ,
  • trims each element, since items such as ice cream might be left with a leading space
  • removes all blank strings
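
The four steps above can be traced one at a time. This is a minimal sketch using the question's string; the intermediate variable names (replaced, pieces, trimmed, result) are only there for illustration and are not part of the original answer.

```ruby
options = "Cake or pie, ice cream, or pudding"

# Step 1: turn every " or " (case-insensitive) into a comma.
replaced = options.gsub(/ or /i, ",")
# Step 2: split on commas; some pieces still carry spaces or are empty.
pieces = replaced.split(",")
# Step 3: trim surrounding whitespace from each piece.
trimmed = pieces.map(&:strip)
# Step 4: drop the empty string left where ", or " became two commas.
result = trimmed.reject(&:empty?)

p replaced  # "Cake,pie, ice cream,,pudding"
p result    # ["Cake", "pie", "ice cream", "pudding"]
```

The empty string only appears because ", or " collapses into ",,", which is why the final reject step is needed.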

First of all, your method could be simplified a bit with Array#flatten:

>> options.split(',').map{|x|x.split 'or'}.flatten.map(&:strip).reject(&:empty?)
=> ["Cake", "pie", "ice cream", "pudding"]

I would prefer using a single regex:

>> options.split /\s*, or\s+|\s*,\s*|\s+or\s+/
=> ["Cake", "pie", "ice cream", "pudding"]

You can use | in a regex to give alternatives. Alternatives are tried from left to right, so putting ", or" first means it is consumed as a single delimiter, which guarantees that it won't produce an empty item. Matching the surrounding whitespace in the regex is probably best for efficiency, since you don't have to scan the array again to strip it.

As Zabba points out, you may still want to reject empty items, prompting this solution:
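
To see what absorbing the whitespace in the pattern buys you, the sketch below compares the pattern above with a variant that matches only the bare delimiters. The bare pattern is a hypothetical counter-example written for this comparison, not something from the answer.

```ruby
options = "Cake or pie, ice cream, or pudding"

# The pattern from the answer: each alternative also consumes the
# whitespace around the delimiter, and ", or" is tried first.
full = /\s*, or\s+|\s*,\s*|\s+or\s+/
# Hypothetical variant that ignores whitespace around "," (comparison only).
bare = /,| or /

with_full = options.split(full)
with_bare = options.split(bare)

p with_full  # ["Cake", "pie", "ice cream", "pudding"]
p with_bare  # ["Cake", "pie", " ice cream", "", "pudding"]
```

With the bare variant, ", or " is seen as two adjacent delimiters, so an empty item and a leading space survive and a second pass with strip and reject would be needed.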

>> options.split(/,|\sor\s/).map(&:strip).reject(&:empty?)
=> ["Cake", "pie", "ice cream", "pudding"]

Since "or" and "," do the same thing, the best approach is to tell the regex that a run of several delimiters should be treated the same as a single delimiter:

options = "Cake or pie, ice cream, or pudding"
regex = /(?:\s*(?:,|or)\s*)+/
options.split(regex)
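
One caveat worth checking: because the pattern allows zero whitespace around or, it can also match the letters "or" inside a word. The sketch below shows this with a made-up input and a word-boundary variant using \b; the sample string and the name bounded are assumptions for illustration and are not part of the answer.

```ruby
regex   = /(?:\s*(?:,|or)\s*)+/
# Assumed variant: \b only lets "or" match as a standalone word.
bounded = /(?:\s*(?:,|\bor\b)\s*)+/

options = "Cake or pie, ice cream, or pudding"
sample  = "Cake or sorbet, or an orange"  # made-up input with "or" inside words

p options.split(regex)   # ["Cake", "pie", "ice cream", "pudding"]
p sample.split(regex)    # ["Cake", "s", "bet", "an", "ange"]
p sample.split(bounded)  # ["Cake", "sorbet", "an orange"]
```

For the question's string both patterns give the same result; the difference only shows up when an item itself contains the letters "or".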
