Regex strings in Ruby
Input strings:
str1 = "$13.90 Price as Shown"
str2 = "$590.50 $490.00 Price as Selected"
str3 = "$9.90 or 5/$27.50 Price as Selected"
Output strings:
str1 = "13.90"
str2 = "490.00"
str3 = "9.90"
My code to get the output:
str = str.strip.gsub(/\s\w{2}\s\d\/\W\d+.\d+/, "") # remove or 5/$27.50 from string
str = /\W\d+.\d+\s\w+/.match(str).to_s.gsub("$", "").gsub(" Price", "")
This code works fine for all three types of strings, but how can I improve it? Are there any better solutions? Also, can anyone recommend a good regex guide or book?
The regex I suggested first is simply a combination of your regexes:
(?<=(?<!\/)\$)\d+.\d+(?=\s\w+)
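As a quick check, the combined pattern can be applied to the three sample strings with String#[]. This is only a sketch: the constant name PRICE_BEFORE_WORD is hypothetical, and the dot is escaped here (\.) so that it matches only a literal decimal point, which the pattern as written above does not guarantee.

```ruby
# Hypothetical constant name; the dot is escaped so it only matches a literal "."
PRICE_BEFORE_WORD = /(?<=(?<!\/)\$)\d+\.\d+(?=\s\w+)/

samples = [
  "$13.90 Price as Shown",
  "$590.50 $490.00 Price as Selected",
  "$9.90 or 5/$27.50 Price as Selected"
]

# String#[] with a regex returns the first match, or nil when nothing matches
prices = samples.map { |s| s[PRICE_BEFORE_WORD] }
p prices  # prints ["13.90", "490.00", "9.90"]
```

In the second sample "$590.50" is skipped because it is followed by another "$" rather than by a word, so the lookahead rejects it.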
Since it is next to impossible to compare numbers with a regex, I suggest extracting every dollar amount with scan and picking the smallest one in Ruby code.
Here is a working snippet:
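def getLowestNumberFromString(input)
  # Collect every dollar amount that is not preceded by "/" (this skips "5/$27.50")
  arr = input.scan(/(?<=(?<!\/)\$)\d+(?:\.\d+)?/)
  # Compare numerically, but return the matched string so "13.90" keeps its trailing zero
  arr.min_by(&:to_f)
end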
puts getLowestNumberFromString("$13.90 Price as Shown")
puts getLowestNumberFromString("$590.50 $490.00 Price as Selected")
puts getLowestNumberFromString("$9.90 or 5/$27.50 Price as Selected")
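One detail worth checking in snippets like this: scan returns strings, and strings compare character by character, so a plain min is not a numeric minimum. The sketch below uses made-up amounts to show the difference; min_by(&:to_f) is one way to compare numerically while keeping the original strings.

```ruby
amounts = ["9.90", "27.50", "100.00"]  # made-up values for illustration

lexicographic = amounts.min             # compares as strings: "1" sorts before "2" and "9"
numeric       = amounts.min_by(&:to_f)  # compares as floats, returns the original string

p lexicographic  # prints "100.00"
p numeric        # prints "9.90"
```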
The regex breakdown:
(?<=(?<!\/)\$) - assert that there is a $ symbol, not preceded by /, right before...
\d+ - 1 or more digits
(?:\.\d+)? - optionally followed by a . and 1 or more digits
Note that if you only need to match floats with a decimal part, remove the ? and the non-capturing group from the last subpattern ( /(?<=(?<!\/)\$)\d+\.\d+/ or even /(?<=(?<!\/)\$)\d*\.?\d+/ ). Be aware that the latter is looser: it also accepts values with no integer part, such as .50, and plain integers.
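To see how the three variants differ, here is a small comparison sketch. The hash labels and the test inputs are made up for illustration; only the patterns themselves come from the breakdown above. It assumes Ruby 2.4 or later for Hash#transform_values.

```ruby
# Labels are hypothetical; the three patterns are the ones discussed above
PATTERNS = {
  optional_fraction: /(?<=(?<!\/)\$)\d+(?:\.\d+)?/,
  required_fraction: /(?<=(?<!\/)\$)\d+\.\d+/,
  loose:             /(?<=(?<!\/)\$)\d*\.?\d+/
}

inputs = ["$5", "$5.25", "$.50", "5/$27.50"]  # made-up test inputs

results = PATTERNS.transform_values do |pattern|
  inputs.map { |input| input[pattern] }  # first match, or nil when nothing matches
end

results.each { |label, matches| puts "#{label}: #{matches.inspect}" }
```

None of the variants matches "5/$27.50", because the inner lookbehind rejects a $ that follows a slash. Only the loose variant picks up ".50", and only the required_fraction variant rejects the bare "5".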
Supposing the input can be relied upon to look like one of your three examples, how about this?
expr = /\$(\d+\.\d\d)\s+(?:or\s+\d+\/\$\d+\.\d\d\s+)?Price/
str = "$9.90 or 5/$27.50 Price as Selected"
str[expr, 1] # => "9.90"
Here it is on Rubular: http://rubular.com/r/CakoUt5Lo3
Explained:
expr = %r{
  \$              # literal dollar sign
  (\d+\.\d\d)     # capture a price with two decimal places (assume no thousands separator)
  \s+             # whitespace
  (?:             # non-capturing group
    or\s+         # literal "or" followed by whitespace
    \d+\/         # one or more digits followed by literal "/"
    \$\d+\.\d\d   # dollar sign and price
    \s+           # whitespace
  )?              # preceding group is optional
  Price           # the literal word "Price"
}x
You might use it like this:
MATCH_PRICE_EXPR = /\$(\d+\.\d\d)\s+(?:or\s+\d+\/\$\d+\.\d\d\s+)?Price/
def match_price(input)
return unless input =~ MATCH_PRICE_EXPR
$1.to_f
end
puts match_price("$13.90 Price as Shown")
# => 13.9
puts match_price("$590.50 $490.00 Price as Selected")
# => 490.0
puts match_price("$9.90 or 5/$27.50 Price as Selected")
# => 9.9
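For input that does not fit the expected shape, match_price returns nil. A compact variant of the same idea is sketched below, assuming Ruby 2.3 or later for the safe navigation operator (&.); the helper name match_price_or_nil and the non-matching sample string are hypothetical.

```ruby
MATCH_PRICE_EXPR = /\$(\d+\.\d\d)\s+(?:or\s+\d+\/\$\d+\.\d\d\s+)?Price/

# Hypothetical helper: returns a Float, or nil when the input does not fit the pattern
def match_price_or_nil(input)
  # String#[] with a capture index returns group 1 or nil; &. skips to_f on nil
  input[MATCH_PRICE_EXPR, 1]&.to_f
end

p match_price_or_nil("$9.90 or 5/$27.50 Price as Selected")  # prints 9.9
p match_price_or_nil("Call for pricing")                      # prints nil
```

This avoids the $1 global, at the cost of being slightly less explicit about the early return.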
My code works fine for all 3 types of strings. Just wondering how I can improve that code.
str = str.gsub(/ or \d\/[\$\d.]+/i, '')
str = /(\$[\d.]+) P/.match(str)
Ruby Live Demo
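Note that Regexp#match returns a MatchData object (or nil), not a string, so the captured price still has to be pulled out of it. A minimal sketch of that last step follows; the variable names match and price are hypothetical, and removing the "$" is an assumption based on the output format requested in the question.

```ruby
str = "$9.90 or 5/$27.50 Price as Selected"

# Drop the " or 5/$27.50" part, leaving "$9.90 Price as Selected"
str = str.gsub(/ or \d\/[\$\d.]+/i, '')

# match is a MatchData object, or nil if the pattern does not match
match = /(\$[\d.]+) P/.match(str)

# Take capture group 1 and remove the dollar sign
price = match && match[1].delete("$")
p price  # prints "9.90"
```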
Assuming you simply want the smallest dollar value in each line:
r = /
\$ # match a dollar sign
\d+ # match one or more digits
\. # match a decimal point
\d{2} # match two digits
/x # extended mode
[str1, str2, str3].map { |s| s.scan(r).min_by { |price| price[1..-1].to_f } }
#=> ["$13.90", "$490.00", "$9.90"]
Actually, you don't have to use a regex. You could do it like this:
def smallest(str)
  val = str.each_char.with_index(1).
          select { |c,_| c == ?$ }.
          map { |_,i| str[i..-1].to_f }.
          min
  "$%.2f" % val
end
smallest(str1) #=> "$13.90"
smallest(str2) #=> "$490.00"
smallest(str3) #=> "$9.90"
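This approach works because String#to_f reads a leading number and ignores whatever text follows it. The sketch below illustrates that behaviour on substrings like the ones smallest produces; the variable names are hypothetical.

```ruby
# String#to_f reads a leading number and ignores the rest of the string
first  = "9.90 or 5/$27.50 Price as Selected".to_f
second = "27.50 Price as Selected".to_f

# Caveat: text with no leading number converts to 0.0
no_number = "Price as Selected".to_f

p first      # prints 9.9
p second     # prints 27.5
p no_number  # prints 0.0
```

One consequence is that a "$" not followed by a number contributes 0.0, which would then win the min. A string with no "$" at all makes min return nil, and the format step raises an error.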
A better regex is probably: /\B\$(\d+\.\d{2})\b/
str = "$590.50 $490.00 Price as Selected"
str.scan(/\B\$(\d+\.\d{2})\b/).flatten.min_by(&:to_f)
#=> "490.00"
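To make the effect of the two boundary assertions concrete, here is a small sketch. The constant name PRICE, the helper lowest_price, and the last two sample strings are hypothetical.

```ruby
PRICE = /\B\$(\d+\.\d{2})\b/  # hypothetical constant name for the pattern above

# Hypothetical helper: smallest captured amount as a string, or nil if none is found
def lowest_price(str)
  str.scan(PRICE).flatten.min_by(&:to_f)
end

p lowest_price("$9.90 or 5/$27.50 Price as Selected")  # prints "9.90"
p lowest_price("US$5.00 each")  # prints nil: "S" before "$" is a word character, so \B fails
p lowest_price("$5.000 each")   # prints nil: a third decimal digit follows, so \b fails
```

Unlike the lookbehind-based patterns, this one does not exclude the amount after the slash: "27.50" is captured too, and it is only the numeric comparison that makes "9.90" the result.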