How to regex the strings in a URL
http://something.com/bOhxBeD,SyhyTGi,TMDDSIB,U72gx2J,kQTIRy9,7VXgGDw,eSxIcK6,S5oNlnn,WBHHsLk,BdMGd2d,U9kNlsF,cHVyc7Y,D83kaJ5,cLWgdSO,iWtCIF3,ount8L6
I am trying to extract the values bOhxBeD, SyhyTGi, and so on. This is what I came up with (yes, fairly simple):
/([a-zA-Z0-9]{7})/
It seems to work with PCRE:
([a-zA-Z0-9]{7})
But in Ruby, where I use it like this:
str.match(/([a-zA-Z0-9]{7})/)
#<MatchData "bOhxBeD" 1:"bOhxBeD">
it doesn't seem to work: I only get the first value back. Can anyone point out what's wrong with this regex? Thanks.
You need to add the word boundary \b in order to match exactly 7 alphanumeric characters.
\b[a-zA-Z0-9]{7}\b
irb(main):006:0> "http://something.com/bOhxBeD,SyhyTGi,TMDDSIB,U72gx2J,kQTIRy9,7VXgGDw,eSxIcK6,S5oNlnn,WBHHsLk,BdMGd2d,U9kNlsF,cHVyc7Y,D83kaJ5,cLWgdSO,iWtCIF3,ount8L6".scan(/\b([a-zA-Z0-9]{7})\b/)
=> [["bOhxBeD"], ["SyhyTGi"], ["TMDDSIB"], ["U72gx2J"], ["kQTIRy9"], ["7VXgGDw"], ["eSxIcK6"], ["S5oNlnn"], ["WBHHsLk"], ["BdMGd2d"], ["U9kNlsF"], ["cHVyc7Y"], ["D83kaJ5"], ["cLWgdSO"], ["iWtCIF3"], ["ount8L6"]]
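To see what the word boundaries change, here is a minimal sketch comparing the pattern with and without \b on a shortened copy of the URL from the question. The variable names (url, loose, strict) are only for illustration.

```ruby
url = "http://something.com/bOhxBeD,SyhyTGi,TMDDSIB"

# Without \b, the pattern also matches the first 7 characters of any longer run.
loose = url.scan(/[a-zA-Z0-9]{7}/)

# With \b on both sides, only runs of exactly 7 alphanumeric characters match.
strict = url.scan(/\b[a-zA-Z0-9]{7}\b/)

p loose   # includes "somethi", cut out of "something"
p strict  # only the comma-separated IDs
```

Note that with no capture group in the pattern, scan returns a flat array of strings rather than the nested arrays shown in the irb output.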
(?!.*?\/)[a-zA-Z0-9]{7}
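It should be this one. Otherwise it will also pick up 7-letter words from the link itself: "somethi" would end up in the answer. But I guess that is not strictly required.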
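As a rough sketch of how that negative lookahead behaves: (?!.*?\/) only lets a match start at a position with no forward slash anywhere after it, so everything before the last slash is skipped. The shortened URL and the variable names are assumptions for illustration.

```ruby
url = "http://something.com/bOhxBeD,SyhyTGi,TMDDSIB"

# A match may only begin where no "/" appears later in the string,
# i.e. somewhere after the last slash of the URL.
ids = url.scan(/(?!.*?\/)[a-zA-Z0-9]{7}/)

p ids
```

This variant does not use \b, so an ID longer than 7 characters would still be cut into a 7-character piece.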
match only picks up the first match. You can try the global version of match, which is scan.
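A small sketch of the difference, using a shortened copy of the URL from the question (the variable names are only for illustration):

```ruby
str = "http://something.com/bOhxBeD,SyhyTGi,TMDDSIB"
re  = /\b[a-zA-Z0-9]{7}\b/

first = str.match(re)  # MatchData describing the first hit only
all   = str.scan(re)   # Array with every non-overlapping hit

# When the pattern contains a capture group, scan returns one
# sub-array per match, so flatten gives back a plain list.
grouped = str.scan(/\b([a-zA-Z0-9]{7})\b/)

puts first[0]
p all
p grouped.flatten
```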
You can use scan with a negated character class, [^...], to find runs of characters that do not contain specific characters:
str.scan(/[^\/\.\,]+/)[3..-1]
#=> ["bOhxBeD", "SyhyTGi", "TMDDSIB", "U72gx2J", "kQTIRy9", "7VXgGDw", "eSxIcK6", "S5oNlnn", "WBHHsLk", "BdMGd2d", "U9kNlsF", "cHVyc7Y", "D83kaJ5", "cLWgdSO", "iWtCIF3", "ount8L6"]
Update:
If you know that the strings between the commas are always 7 characters long, you can use this instead:
str.scan(/[^\/\.\,]{7}/)[1..-1]
This happens because your regex with match returns only a single 7-character element and nothing more. A simple solution could be:
str.match(/\/([^\/]*)\z/)[1].split(',')
You could use String#[] and String#split:
str[/.*\/(.*)/,1].split(',')
#=> ["bOhxBeD", "SyhyTGi", "TMDDSIB", "U72gx2J", "kQTIRy9", "7VXgGDw",
# "eSxIcK6", "S5oNlnn", "WBHHsLk", "BdMGd2d", "U9kNlsF", "cHVyc7Y",
# "D83kaJ5", "cLWgdSO", "iWtCIF3", "ount8L6"]
.*\/ in the regex, "greedy" as it is, will consume characters up to and including the last forward slash in the string. Capture group #1, (.*), sucks up the remainder of the string and, due to the presence of ,1, that group is what gets returned. split(',') then breaks up the string to give you the desired array.
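The pieces described above can be inspected one at a time. This is only a sketch on a shortened copy of the URL; the lazy variant (.*?) is added purely for contrast and is not part of the answer.

```ruby
str = "http://something.com/bOhxBeD,SyhyTGi,TMDDSIB"

prefix = str[/.*\//]         # greedy: everything through the last "/"
tail   = str[/.*\/(.*)/, 1]  # capture group 1: whatever follows that slash
lazy   = str[/.*?\//]        # lazy, for contrast: stops at the first "/"

p prefix
p tail
p lazy
p tail.split(',')
```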
Another way:
str[str[/.*\//].size..-1].split(',')