How to use a regex to extract strings from a URL

Question

http://something.com/bOhxBeD,SyhyTGi,TMDDSIB,U72gx2J,kQTIRy9,7VXgGDw,eSxIcK6,S5oNlnn,WBHHsLk,BdMGd2d,U9kNlsF,cHVyc7Y,D83kaJ5,cLWgdSO,iWtCIF3,ount8L6

I am trying to extract the values bOhxBeD, SyhyTGi, and so on. This is what I came up with (yes, fairly simple): /([a-zA-Z0-9]{7})/. It seems to work with PCRE:

([a-zA-Z0-9]{7})

[Regular expression visualization: Debuggex Demo]

But in Ruby, I use it like this:

str.match(/([a-zA-Z0-9]{7})/)
#<MatchData "bOhxBeD" 1:"bOhxBeD">

It doesn't seem to work: only the first value comes back. Can anyone point out what's wrong with this regex? Thanks.

Answer 1

You need to add the word boundary \b in order to match exactly 7 alphanumeric characters.

\b[a-zA-Z0-9]{7}\b

Demo (note that it uses scan, which returns every match, rather than match):

irb(main):006:0> "http://something.com/bOhxBeD,SyhyTGi,TMDDSIB,U72gx2J,kQTIRy9,7VXgGDw,eSxIcK6,S5oNlnn,WBHHsLk,BdMGd2d,U9kNlsF,cHVyc7Y,D83kaJ5,cLWgdSO,iWtCIF3,ount8L6".scan(/\b([a-zA-Z0-9]{7})\b/)
=> [["bOhxBeD"], ["SyhyTGi"], ["TMDDSIB"], ["U72gx2J"], ["kQTIRy9"], ["7VXgGDw"], ["eSxIcK6"], ["S5oNlnn"], ["WBHHsLk"], ["BdMGd2d"], ["U9kNlsF"], ["cHVyc7Y"], ["D83kaJ5"], ["cLWgdSO"], ["iWtCIF3"], ["ount8L6"]]
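
Because the pattern in the irb session above contains a capture group, scan returns one single-element array per match. The following is a minimal sketch of two ways to get a flat array instead, assuming the URL is held in a variable named url (a name chosen here for illustration only):

```ruby
# `url` is an illustrative variable name; the value is the URL from the question.
url = "http://something.com/bOhxBeD,SyhyTGi,TMDDSIB,U72gx2J,kQTIRy9,7VXgGDw,eSxIcK6,S5oNlnn,WBHHsLk,BdMGd2d,U9kNlsF,cHVyc7Y,D83kaJ5,cLWgdSO,iWtCIF3,ount8L6"

# Without a capture group, scan returns the matched strings directly.
ids = url.scan(/\b[a-zA-Z0-9]{7}\b/)

# With a capture group, scan returns nested arrays; flatten undoes the nesting.
ids_from_groups = url.scan(/\b([a-zA-Z0-9]{7})\b/).flatten

puts ids.length               # number of IDs found
puts ids == ids_from_groups   # both approaches should agree
```

The word boundaries also keep "somethi" out of the result, since the 7th and 8th letters of "something" are both word characters and no boundary exists between them.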

Answer 2

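(?!.*?\/)[a-zA-Z0-9]{7}

It should be this. Otherwise it will also pick up 7-letter words from the rest of the link: "somethi" would be in the answer. But I guess that is not wanted.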
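
A short sketch of how this lookahead behaves with scan in Ruby, assuming a shortened version of the URL in a variable named url (both are illustrative choices). The negative lookahead (?!.*?\/) rejects any starting position that still has a slash somewhere after it, so only text after the last slash can match:

```ruby
# Shortened URL, used here only to keep the output readable.
url = "http://something.com/bOhxBeD,SyhyTGi,TMDDSIB"

# Without the lookahead, the first 7 letters of "something" are matched too.
plain = url.scan(/[a-zA-Z0-9]{7}/)

# With the lookahead, only matches after the last slash survive.
guarded = url.scan(/(?!.*?\/)[a-zA-Z0-9]{7}/)

p plain
p guarded
```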

Answer 3

match only picks up the first match.
You can try scan, which is the global version of match.
You can also use scan with a negated character class [^...] to pick out runs of characters that exclude specific characters:

str.scan(/[^\/\.\,]+/)[3..-1]   
#=> ["bOhxBeD", "SyhyTGi", "TMDDSIB", "U72gx2J", "kQTIRy9", "7VXgGDw", "eSxIcK6", "S5oNlnn", "WBHHsLk", "BdMGd2d", "U9kNlsF", "cHVyc7Y", "D83kaJ5", "cLWgdSO", "iWtCIF3", "ount8L6"]
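
To make the difference between match and scan concrete, here is a small sketch using a shortened version of the URL (shortened only for readability; the variable names are illustrative):

```ruby
str = "http://something.com/bOhxBeD,SyhyTGi,TMDDSIB"

first = str.match(/[^\/.,]+/)   # MatchData describing the first match only
all   = str.scan(/[^\/.,]+/)    # Array with every non-overlapping match

p first[0]     # the first run of characters that are not '/', '.' or ','
p all          # also contains the scheme and host pieces before the IDs
p all[3..-1]   # dropping the first three entries leaves only the IDs
```

The offset 3 works here because the scheme and host contribute exactly three pieces ("http:", "something", "com"); a URL with a different host or path depth would need a different offset.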

Update:
If you know that the strings between the commas are always 7 characters long, you can use this instead:

str.scan(/[^\/\.\,]{7}/)[1..-1]
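
The [1..-1] slice is needed because the fixed-width pattern also matches inside the host name. A brief sketch on a shortened URL (an illustrative value) shows what the slice discards:

```ruby
str = "http://something.com/bOhxBeD,SyhyTGi,TMDDSIB"

chunks = str.scan(/[^\/.,]{7}/)

p chunks          # first element comes from the host name "something"
p chunks[1..-1]   # the remaining elements are the IDs
```

With a different host name the number of leading matches could differ, so the offset 1 should be treated as specific to this URL.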

Answer 4

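This happens because your regex, used with match, matches only a single 7-character element and nothing more. A simple solution is to capture everything after the last slash and split it on commas. The capture group excludes slashes so that the match cannot start at the first slash of http://:

str.match(/\/([^\/]*)\z/)[1].split(',')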

Answer 5

You could use String#[] and String#split:

str[/.*\/(.*)/,1].split(',')
  #=> ["bOhxBeD", "SyhyTGi", "TMDDSIB", "U72gx2J", "kQTIRy9", "7VXgGDw",
  #    "eSxIcK6", "S5oNlnn", "WBHHsLk", "BdMGd2d", "U9kNlsF", "cHVyc7Y",
  #    "D83kaJ5", "cLWgdSO", "iWtCIF3", "ount8L6"]

.*\/ in the regex, "greedy" as it is, will consume characters up to and including the last forward slash in the string. Capture group #1, (.*), takes the remainder of the string and, because of the second argument 1, String#[] returns just that group. split(',') then breaks up the string to give you the desired array.

Another way:

str[str[/.*\//].size..-1].split(',')
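
The two expressions above can be checked against each other with a short sketch on a shortened URL (an illustrative value). The last line uses String#rpartition, a non-regex alternative that is not part of the original answer and is included here only for comparison:

```ruby
str = "http://something.com/bOhxBeD,SyhyTGi,TMDDSIB"

# Greedy .* backtracks to the last slash, so prefix is everything through it.
prefix = str[/.*\//]

via_capture = str[/.*\/(.*)/, 1].split(',')     # capture group 1 holds the tail
via_offset  = str[prefix.size..-1].split(',')   # slice off the prefix by length

# Not from the answer: rpartition splits on the last '/' without a regex.
via_rpartition = str.rpartition('/').last.split(',')

p prefix
p via_capture
p via_capture == via_offset && via_offset == via_rpartition
```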

Answers (score, status, posted):

Answer 1: 3, accepted, 2014-08-27 06:04:59
Answer 2: 2, 2014-08-27 05:49:17
Answer 3: 2, 2014-08-27 06:19:43
Answer 4: 1, 2014-08-27 05:58:37
Answer 5: 1, 2014-08-27 07:43:46
