Split by multiple delimiters

I'm receiving a string that contains two numbers in a handful of different formats:

"344, 345", "334,433", "345x532" and "432 345"

I need to split them into two separate numbers in an array using split, and then convert them using Integer(num).

What I've tried so far:

nums.split(/[\s+,x]/) # split on one or more spaces, a comma or x

However, it doesn't seem to match multiple spaces when testing. Also, it doesn't allow a space in the comma version shown above ("344, 345").

How can I match multiple delimiters?

You are using a character class in your pattern, and it matches only one character. [\s+,x] matches one whitespace character, a +, a comma, or an x. You probably meant to use an alternation such as (?:\s+|,|x).
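To see the one-character behaviour concretely, here is a small sketch run against strings like those in the question (the variable names are illustrative and not from the original post). Each delimiter character is split on separately, so a comma followed by a space leaves an empty string in the middle, and the + inside the brackets is a literal plus sign rather than a quantifier.

```ruby
pattern = /[\s+,x]/  # the pattern from the question

comma_space  = "344, 345".split(pattern)  #=> ["344", "", "345"]
two_spaces   = "432  345".split(pattern)  #=> ["432", "", "345"]
literal_plus = "3+4".split(pattern)       #=> ["3", "4"]

# Integer("") raises ArgumentError, so the stray empty string breaks conversion.
conversion_fails =
  begin
    comma_space.map { |n| Integer(n) }
    false
  rescue ArgumentError
    true
  end
```

The empty strings are why the conversion step with Integer fails on these inputs.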

However, a plain \D+ (one or more non-digit characters) may well suffice:

"345, 456".split(/\D+/).map(&:to_i)
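
# Regexp matching any one of the accepted separators
R1 = Regexp.union([", ", ",", "x", " "])
  #=> /,\ |,|x|\ /
# Anchored pattern: digits, exactly one separator, digits
R2 = /\A\d+#{R1}\d+\z/
  #=> /\A\d+(?-mix:,\ |,|x|\ )\d+\z/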

def split_it(s)
  return nil unless s =~ R2
  s.split(R1).map(&:to_i)
end

split_it("344, 345") #=> [344, 345] 
split_it("334,433")  #=> [334, 433] 
split_it("345x532")  #=> [345, 532] 
split_it("432 345")  #=> [432, 345] 
split_it("432&345")  #=> nil
split_it("x32 345")  #=> nil

Your original regex would work with one minor adjustment: move the + outside the character class:

"344 ,x  345".split(/[\s,x]+/).map(&:to_i) #=> [344, 345]

If the examples are really the only formats you'll encounter, this works well. However, if you need to be more flexible and accommodate unknown separators between the numbers, you're better off with the answer given by Wiktor:

"344 ,x  345".split(/\D+/).map(&:to_i) #=> [344, 345]

Both return an array of Integers for the inputs given, but the second is both more robust and easier to understand at a glance.
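As a rough illustration of the difference in robustness, the sketch below wraps each pattern in a helper and feeds both an unexpected separator. The helper names split_known and split_any are hypothetical, chosen only for this comparison.

```ruby
# Hypothetical helper names, used only for this comparison.
def split_known(s)
  s.split(/[\s,x]+/).map(&:to_i)  # only whitespace, commas and x are delimiters
end

def split_any(s)
  s.split(/\D+/).map(&:to_i)      # any run of non-digits is a delimiter
end

split_known("344 ,x  345")  #=> [344, 345]
split_any("344 ,x  345")    #=> [344, 345]

# An unexpected separator is not split by the first pattern;
# to_i then reads only the leading digits of the unsplit string.
split_known("432&345")      #=> [432]
split_any("432&345")        #=> [432, 345]

# Caveat: a leading separator leaves an empty first field, which to_i turns into 0.
split_any(" 344 345")       #=> [0, 344, 345]
```

If silently getting 0 or a truncated number is a concern, validating the input first or converting with Integer instead of to_i would surface the problem as an error.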

"it doesn't seem to match multiple spaces when testing"

Yeah, a character class (square brackets) doesn't work like this. You apply quantifiers to the class itself, not to the characters inside it. You could use the | operator instead. Something like this:

.split(%r[\s+|,\s*|x])
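A quick check of that pattern against the formats from the question, plus one input with several spaces, might look like the following sketch. The variable names are assumptions for illustration; Integer is used for the conversion, as the question asks.

```ruby
pattern = %r[\s+|,\s*|x]

samples = ["344, 345", "334,433", "345x532", "432 345", "432   345"]

# Split each sample on the alternation, then convert strictly with Integer.
results = samples.map { |s| s.split(pattern).map { |n| Integer(n) } }
#=> [[344, 345], [334, 433], [345, 532], [432, 345], [432, 345]]
```

Note that a space before the comma (for example "344 , 345") would still produce an empty field, since the space and the comma are matched as two separate delimiters.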
