Regexp to match repeated substring
I would like to verify a string containing repeated substrings. The substrings have a particular structure, and the whole string has a particular structure (substrings separated by "|"). For instance, the string can be:
1=23.00|6=22.12|12=21.34|112=20.34
1=23.00|6=22.12|12=21.34
1=23.00|12=21.34
1=23.00**
How can I check that all repeated substrings match a regexp? I tried to check it with:
"1=23.00|6=22.12|12=21.34".match(/([1-9][0-9]*[=][0-9\.]+)+/)
But the check succeeds (it returns a truthy MatchData) even when several substrings do not match the regexp:
"1=23.00|6=ass|=21.34".match(/([1-9][0-9]*[=][0-9\.]+)+/)
# => #<MatchData "1=23.00" 1:"1=23.00">
This will return true if any values are duplicated and false if none are:
s = "1=23.00|6=22.12|12=21.34|112=20.34|3=23.00"
# Strip the whole leading "key=" from each field so only the value remains
arr = s.split("|").map { |field| field.sub(/\A\d+=/, "") }
arr != arr.uniq # => true
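If it is also useful to know which values are duplicated, the same idea can be extended with `Enumerable#tally`. This is only a sketch: the method name `duplicated_values` is hypothetical, `tally` requires Ruby 2.7 or later, and the key is assumed to be a run of digits followed by `=`.

```ruby
# Hypothetical helper: returns the values that occur more than once in a line.
# Assumes each field looks like "<digits>=<value>" and fields are split by "|".
def duplicated_values(line)
  values = line.split("|").map { |field| field.sub(/\A\d+=/, "") }
  # tally counts occurrences; keep only the values seen more than once
  values.tally.select { |_, count| count > 1 }.keys
end

duplicated_values("1=23.00|6=22.12|12=21.34|112=20.34|3=23.00") # => ["23.00"]
duplicated_values("1=23.00|6=22.12")                            # => []
```

An empty result means there are no duplicates, so `duplicated_values(s).any?` gives the same true/false answer as comparing the array with its `uniq`.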
The question is whether every repeated substring matches a regex. I understand that the substrings are separated by the character | or by $/, the latter being the end of a line (Ruby's input record separator, a newline by default). We first need to obtain the repeated substrings:
a = str.split(/[#{$/}\|]/)
.map(&:strip)
.group_by {|s| s}
.select {|_,v| v.size > 1 }
.keys
Next we specify whatever regex you wish to use. I am assuming it is this:
REGEX = /[1-9][0-9]*=[1-9]+\.[0-9]+/
but it could be altered if you have other requirements.
As we wish to determine whether all repeated substrings match the regex, that is simply:
a.all? {|s| s =~ REGEX}
Here are the calculations:
str =<<_
1=23.00|6=22.12|12=21.34|112=20.34
1=23.00|6=22.12|12=21.34
1=23.00|12=21.34
1=23.00**
_
c = str.split(/[#{$/}\|]/)
#=> ["1=23.00", "6=22.12", "12=21.34", "112=20.34", "1=23.00",
# "6=22.12", "12=21.34", "1=23.00", "12=21.34", "1=23.00**"]
d = c.map(&:strip)
# same as c, possibly not needed or not wanted
e = d.group_by {|s| s}
# => {"1=23.00" =>["1=23.00", "1=23.00", "1=23.00"],
# "6=22.12" =>["6=22.12", "6=22.12"],
# "12=21.34" =>["12=21.34", "12=21.34", "12=21.34"],
# "112=20.34"=>["112=20.34"], "1=23.00**"=>["1=23.00**"]}
f = e.select {|_,v| v.size > 1 }
#=> {"1=23.00"=>["1=23.00", "1=23.00" , "1=23.00"],
# "6=22.12"=>["6=22.12", "6=22.12"],
# "12=21.34"=>["12=21.34", "12=21.34", "12=21.34"]}
a = f.keys
#=> ["1=23.00", "6=22.12", "12=21.34"]
a.all? {|s| s =~ REGEX}
#=> true
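The steps above can be collected into a single method. What follows is a minimal sketch rather than the answerer's code: the method name `repeated_fields_valid?` and the constant `ANCHORED` are hypothetical, `tally` needs Ruby 2.7 or later, and the anchored pattern is an assumption about what a valid field looks like. Anchoring matters because `s =~ REGEX` succeeds when the pattern matches any part of `s`, so a repeated `1=23.00**` would also pass an unanchored check.

```ruby
# Assumed field format: key without a leading zero, "=", then a decimal number.
# \A and \z force the whole field to match, not just a part of it.
ANCHORED = /\A[1-9][0-9]*=[0-9]+\.[0-9]+\z/

# Hypothetical helper: true when every field that occurs more than once
# in str matches pattern. Fields are separated by "|" or by a newline.
def repeated_fields_valid?(str, pattern = ANCHORED)
  fields = str.split(/[\n|]/).map(&:strip).reject(&:empty?)
  repeated = fields.tally.select { |_, count| count > 1 }.keys
  repeated.all? { |field| field.match?(pattern) }
end

str = <<~TEXT
  1=23.00|6=22.12|12=21.34|112=20.34
  1=23.00|6=22.12|12=21.34
  1=23.00|12=21.34
  1=23.00**
TEXT

repeated_fields_valid?(str)               # => true
repeated_fields_valid?("1=ab|1=ab|2=1.0") # => false

# Unanchored versus anchored on a field with trailing characters:
"1=23.00**" =~ /[1-9][0-9]*=[1-9]+\.[0-9]+/ # => 0 (truthy)
"1=23.00**".match?(ANCHORED)                # => false
```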
If you want to solve this with a regexp alone (not Ruby code), you should match the whole string, not individual substrings. I added a [|] separator and an end-of-line anchor to your regexp, and it should work the way you want:
([1-9][0-9]*[=][0-9\.]+[|]*)+$
Try it out.
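One caveat, offered as a sketch rather than a correction of the answer: the pattern above is anchored only at the end, and `[|]*` makes the separator optional, so a string such as `6=ass|1=23.00` still matches on its final field. A stricter variant anchors both ends and requires exactly one `|` between fields. It assumes each field is `key=number` with a mandatory decimal part, and the names `FIELD` and `LINE` are hypothetical.

```ruby
# Assumed shape of one field: key without a leading zero, "=", decimal number.
FIELD = /[1-9][0-9]*=[0-9]+\.[0-9]+/
# Whole string: one field, then zero or more "|field" groups, nothing else.
LINE = /\A#{FIELD}(?:\|#{FIELD})*\z/

"1=23.00|6=22.12|12=21.34".match?(LINE) # => true
"1=23.00".match?(LINE)                  # => true
"1=23.00|6=ass|=21.34".match?(LINE)     # => false
"6=ass|1=23.00".match?(LINE)            # => false

# The end-anchored pattern accepts the last case because it only has to
# find a valid field at the end of the string:
"6=ass|1=23.00".match?(/([1-9][0-9]*[=][0-9\.]+[|]*)+$/) # => true
```

Using `\A` and `\z` instead of `^` and `$` also keeps a multi-line string from passing just because one of its lines is valid.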