Regular expression to match repeated substrings

Question

I would like to validate a string made up of repeated substrings. Each substring has a particular structure, and so does the whole string: the substrings are separated by "|". For instance, the string can be:

1=23.00|6=22.12|12=21.34|112=20.34
1=23.00|6=22.12|12=21.34
1=23.00|12=21.34
1=23.00**

How can I check that all the repeated substrings match a regexp? I tried to check it with:

"1=23.00|6=22.12|12=21.34".match(/([1-9][0-9]*[=][0-9\.]+)+/)

But the check is truthy even when several substrings do not match the regexp:

"1=23.00|6=ass|=21.34".match(/([1-9][0-9]*[=][0-9\.]+)+/)
# => #<MatchData "1=23.00" 1:"1=23.00">

Answer 1

This returns true if any value (the part after "=") is duplicated, and false otherwise:

s = "1=23.00|6=22.12|12=21.34|112=20.34|3=23.00"
arr = s.split("|").map { |pair| pair.sub(/\A\d+=/, "") }  # strip the whole "key=" prefix, keep the value

arr != arr.uniq # => true
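
To see which values are duplicated rather than only whether any are, the same idea can be extended with group_by. This is a minimal sketch; the helper name duplicated_values is hypothetical, and it assumes each entry has the form key=value.

```ruby
# Hypothetical helper: returns the values (the text after "=") that occur
# more than once in a "|"-separated line.
def duplicated_values(line)
  values = line.split("|").map { |pair| pair.split("=", 2).last }
  values.group_by(&:itself)                    # value => [occurrences]
        .select { |_, hits| hits.size > 1 }    # keep values seen 2+ times
        .keys
end

s = "1=23.00|6=22.12|12=21.34|112=20.34|3=23.00"
p duplicated_values(s)   # prints ["23.00"]
```

An empty result means no value is repeated, which corresponds to arr == arr.uniq.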

Answer 2

The question is whether every repeated substring matches a regex. As I understand it, the substrings are separated by the character | or by $/, the latter being the end of a line (Ruby's input record separator, "\n" by default). We first need to obtain the repeated substrings:

a = str.split(/[#{$/}\|]/)
       .map(&:strip)
       .group_by {|s| s}
       .select {|_,v| v.size > 1 }
       .keys

Next we specify whatever regex you wish to use. I am assuming it is this, anchored with \A and \z so that the entire substring has to match:

REGEX = /\A[1-9][0-9]*=[0-9]+\.[0-9]+\z/

but it could be altered if you have other requirements.

As we wish to determine whether all repeated substrings match the regex, that is simply:

a.all? {|s| s =~ REGEX}

Here are the calculations:

str =<<_
1=23.00|6=22.12|12=21.34|112=20.34
1=23.00|6=22.12|12=21.34
1=23.00|12=21.34
1=23.00**
_
c = str.split(/[#{$/}\|]/)
  #=> ["1=23.00", "6=22.12", "12=21.34", "112=20.34", "1=23.00",
  #    "6=22.12", "12=21.34", "1=23.00", "12=21.34", "1=23.00**"] 
d = c.map(&:strip)
  # same as c, possibly not needed or not wanted
e = d.group_by {|s| s}
  # => {"1=23.00"  =>["1=23.00", "1=23.00", "1=23.00"],
  #     "6=22.12"  =>["6=22.12", "6=22.12"],
  #     "12=21.34" =>["12=21.34", "12=21.34", "12=21.34"],
  #     "112=20.34"=>["112=20.34"], "1=23.00**"=>["1=23.00**"]} 
f = e.select {|_,v| v.size > 1 }
  #=> {"1=23.00"=>["1=23.00",  "1=23.00" ,  "1=23.00"],
  #    "6=22.12"=>["6=22.12",  "6=22.12"],
  #   "12=21.34"=>["12=21.34", "12=21.34", "12=21.34"]} 
a = f.keys
  #=> ["1=23.00", "6=22.12", "12=21.34"] 
a.all? {|s| s =~ REGEX}
  #=> true
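
The steps above can be collected into two small methods. This is only a sketch: the names PAIR_PATTERN, repeated_substrings and repeated_all_valid? are hypothetical, the pattern is an assumed key=value format, and the separator is taken to be "|" or a newline.

```ruby
# Hypothetical pattern, anchored so the whole substring has to match.
PAIR_PATTERN = /\A[1-9][0-9]*=[0-9]+\.[0-9]+\z/

# Returns the substrings that appear more than once in text.
def repeated_substrings(text)
  text.split(/[\n|]/)
      .map(&:strip)
      .reject(&:empty?)
      .group_by(&:itself)                   # substring => [occurrences]
      .select { |_, hits| hits.size > 1 }
      .keys
end

# True when every repeated substring matches the pattern.
def repeated_all_valid?(text, pattern = PAIR_PATTERN)
  repeated_substrings(text).all? { |s| s.match?(pattern) }
end

text = "1=23.00|6=22.12|12=21.34\n1=23.00|12=21.34\n"
p repeated_substrings(text)            # prints ["1=23.00", "12=21.34"]
p repeated_all_valid?(text)            # prints true
p repeated_all_valid?("6=ass|6=ass")   # prints false
```

Note that all? returns true for an empty collection, so a string with no repeated substrings is reported as valid.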

Answer 3

If you want to solve this with the regexp alone (not with Ruby code), you should match the whole string, not substrings. I added the [|] symbol and start- and end-of-line anchors to your regexp, and it should work the way you want.

^([1-9][0-9]*[=][0-9\.]+[|]*)+$

Try it out.
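
For use from Ruby, a stricter whole-string check might look like the sketch below. It assumes each entry is key=value with a decimal value, and the names ENTRY, LINE_PATTERN and valid_line? are hypothetical. It uses \A and \z instead of ^ and $, because in Ruby ^ and $ match at every line boundary, not only at the ends of the string.

```ruby
# Hypothetical pattern for a single "key=value" entry; adjust to the real format.
ENTRY = /[1-9][0-9]*=[0-9]+\.[0-9]+/

# One entry followed by zero or more "|entry" groups, anchored at both ends,
# so a stray separator or a malformed entry anywhere rejects the string.
LINE_PATTERN = /\A#{ENTRY}(?:\|#{ENTRY})*\z/

def valid_line?(line)
  line.match?(LINE_PATTERN)
end

p valid_line?("1=23.00|6=22.12|12=21.34")   # prints true
p valid_line?("1=23.00|6=ass|=21.34")       # prints false
p valid_line?("1=23.00|")                   # prints false
```

Requiring the "|" between entries, rather than allowing any number of them after each entry, is what rules out trailing or doubled separators.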

3 answers

Answer 1
1 2014-02-14 08:04:26

Answer 2
1 (accepted) 2014-02-14 08:19:30

Answer 3
0 2014-02-14 07:57:09