Regex in Ruby for multiple splits in a string

I need some advice on perfecting a regex. I'm trying to split a string into three pieces with a single expression. The lines come from a text file in a format like this:

25 red delicious apples at 0.75 

where the first part is the quantity, the second is the item name, and the third is the price per item. This is the code I'm using:

File.open('basket.txt').each_line do |line|
  item = line.split(/(\d+)\s|\sat\s/, 3)
end

This splits the string where I want it, but it creates an item array of length four (the first index contains nil). I also want to get rid of the newline character at the end of the float.
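To see where the extra element comes from, it may help to run the split on a sample line. The sketch below relies on documented `String#split` behavior: text captured by a group in the pattern is returned alongside the pieces, and whatever precedes the first match (here nothing) becomes the first element. The helper names `split_line` and `clean_split` are made up for illustration.

```ruby
# Hypothetical helper wrapping the split from the question.
def split_line(line)
  line.split(/(\d+)\s|\sat\s/, 3)
end

# Hypothetical cleanup: strip the trailing newline first, then drop any
# nil or empty pieces left over from the split.
def clean_split(line)
  split_line(line.chomp).reject { |piece| piece.nil? || piece.empty? }
end

sample = "25 red delicious apples at 0.75\n"

# The first match ("25 ") sits at the very start of the line, so the piece
# in front of it has no content; the captured "25" is added as its own element.
p split_line(sample)

# After chomp and cleanup only quantity, name, and price remain.
p clean_split(sample)
```

This keeps the original pattern intact; the answers below avoid the extra element altogether by not using a capture group in the split pattern.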

You can try this:

txt = "25 red delicious apples 0.75"
pattern = Regexp.new('(?<=\d)\s|\s(?=\d)')
puts txt.split(pattern)

Or in irb:

'25 red delicious apples 0.75'.split(/(?<=\d)\s|\s(?=\d)/)

With "at":

'25 red delicious apples at 0.75'.split(/(?<=\d)\s|\sat\s(?=\d)/)

An example with your loop:

pattern = Regexp.new('(?<=\d)\s|\sat\s(?=\d)')
File.open('basket.txt').each_line do |line|
  items = line.split(pattern)
end
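The loop above leaves the trailing newline on the price and keeps every field as a string. A minimal sketch of one way to handle both, assuming every line is well formed and the item name contains no digit next to a space (such a name would produce extra splits). `parse_basket` is a hypothetical helper, and `StringIO` stands in for the file so the example runs on its own.

```ruby
require 'stringio'

# Same lookaround pattern as above: split on the whitespace after the
# quantity, or on " at " directly in front of the price.
PATTERN = /(?<=\d)\s|\sat\s(?=\d)/

# Hypothetical helper: reads lines from any IO-like object and returns
# [quantity, name, price] triples with the numeric fields converted.
def parse_basket(io)
  io.each_line.map do |line|
    qty, name, price = line.chomp.split(PATTERN) # chomp drops the newline
    [qty.to_i, name, price.to_f]
  end
end

# StringIO replaces File.open('basket.txt') for demonstration purposes.
sample = StringIO.new("25 red delicious apples at 0.75\n3 pears at 1.20\n")
p parse_basket(sample)
```

With a real file, `File.open('basket.txt') { |f| parse_basket(f) }` should work the same way, since `File` also responds to `each_line`.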

I would use match instead of split for this task. That way you can capture each group more precisely. For instance, if we assume there are no digits in the product name:

s = "25 red delicious apples 0.75"
m = s.match(/(\d+) ([^\d.]+) ([\d.]+)/)
m[1] # => "25"
m[2] # => "red delicious apples"
m[3] # => "0.75"
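One thing to keep in mind with match is that it returns nil when a line does not fit the pattern, so indexing the result directly can raise an error. The sketch below is one possible way to guard against that, using the "at" format from the question and named captures. The constant `LINE_FORMAT` and the helper `parse_line` are hypothetical names, and the price is assumed to be a plain decimal number.

```ruby
# Assumed line format: "<quantity> <name> at <price>", optional trailing newline.
LINE_FORMAT = /\A(?<qty>\d+)\s+(?<name>.+?)\s+at\s+(?<price>\d+(?:\.\d+)?)\s*\z/

# Hypothetical helper: returns a hash for a well-formed line, nil otherwise.
def parse_line(line)
  m = LINE_FORMAT.match(line)
  return nil unless m # the line did not fit the assumed format

  { qty: m[:qty].to_i, name: m[:name], price: m[:price].to_f }
end

p parse_line("25 red delicious apples at 0.75\n")
p parse_line("not a basket line")
```

Because the name group is lazy and followed by a literal "at", product names containing digits are accepted here, which relaxes the no-digits assumption made above.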

In this case, you should use pattern matching instead of split.

line = "25 red delicious apples at 0.75\n"
line.match(/(\d+)\s+(.*)\s+at\s+(\S+)/).values_at(1, 2, 3)
# => ["25", "red delicious apples", "0.75"]

p "25 red delicious apples 0.75".partition(/[\D\s]+/)
# => ["25", " red delicious apples ", "0.75"]

'25 red delicious apples at 0.75'.scan(/[0-9]+\.?\d*/)
# => ["25", "0.75"]

How about:

# [.\d]* rather than [.\d]+ so that single-digit quantities such as "5" also match
'25 red delicious apples at 0.75'.scan(/(\d+[.\d]*) (.*) at (\d+[.\d]*)/)
#=> [["25", "red delicious apples", "0.75"]]
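Since scan returns one array per match, the same approach can be applied to the whole file contents at once instead of line by line. A rough sketch, assuming each line follows the "quantity name at price" format with Unix line endings. `BASKET_LINE` and `scan_basket` are hypothetical names, and the number pattern is a slightly stricter variant of the one above.

```ruby
# Assumed format per line: "<quantity> <name> at <price>".
# In Ruby, ^ and $ match at the start and end of every line in the string.
BASKET_LINE = /^(\d+) (.*) at (\d+(?:\.\d+)?)$/

# Hypothetical helper: returns one [quantity, name, price] array per
# matching line; lines that do not fit the format are skipped.
def scan_basket(text)
  text.scan(BASKET_LINE)
end

text = "25 red delicious apples at 0.75\n3 pears at 1.20\n"
p scan_basket(text)
```

For a real file, `scan_basket(File.read('basket.txt'))` would be the equivalent call. Since `$` stops before the line break, no newline ends up in the price.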
