Extracting multiple patterns from a URL string

Question

I have a URL string:

http://localhost:3000/user/event?profile_id=2&profile_type=UserProfile

I want to extract "2" and "UserProfile", both of which can change.

I tried both match and scan, but neither returns any results:

url = "http://localhost:3000/user/event?profile_id=2&profile_type=UserProfile"
m = /http(s)?:\/\/(.)+\/user\/event?profile_id=(\d)&profile_type=(\w)/.match(url)
=> nil 

url.scan /http(s)?:\/\/(.)+\/user\/event?profile_id=(\d)&profile_type=(\w)/
=> []

Any idea what I might be doing wrong?

Answer 1

Don't use a pattern for this. Query parameters can appear in any order in a URL, so a pattern that expects them in fixed positions breaks as soon as the order changes.

Instead, use a tool designed for the purpose, like the built-in URI library:

require 'uri'

uri = URI.parse('http://localhost:3000/user/event?profile_id=2&profile_type=UserProfile')

Hash[URI::decode_www_form(uri.query)].values_at('profile_id', 'profile_type') 
# => ["2", "UserProfile"]

Doing it that way, you always receive the values in the order you asked for them, regardless of their order in the URL, which makes them easy to assign:

profile_id, profile_type = Hash[URI::decode_www_form(uri.query)].values_at('profile_id', 'profile_type')

Here are the intermediate steps so you can see what's happening:

uri.query # => "profile_id=2&profile_type=UserProfile"
URI::decode_www_form(uri.query) # => [["profile_id", "2"], ["profile_type", "UserProfile"]]
Hash[URI::decode_www_form(uri.query)] # => {"profile_id"=>"2", "profile_type"=>"UserProfile"}
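
To make the point about parameter order concrete, here is a minimal sketch that wraps the steps above in a helper and runs it against the same URL with the parameters swapped. The helper name extract_params and the reordered URL are made up for illustration; they are not part of the original answer.

```ruby
require 'uri'

# Hypothetical helper (not from the answer): returns the requested query
# parameters of a URL, in the order the keys are given.
def extract_params(url, *keys)
  query = URI.parse(url).query || ''   # query is nil when the URL has no query string
  URI.decode_www_form(query).to_h.values_at(*keys)
end

original  = 'http://localhost:3000/user/event?profile_id=2&profile_type=UserProfile'
# Assumed variant of the same URL with the parameters swapped.
reordered = 'http://localhost:3000/user/event?profile_type=UserProfile&profile_id=2'

id_a, type_a = extract_params(original,  'profile_id', 'profile_type')
id_b, type_b = extract_params(reordered, 'profile_id', 'profile_type')

p [id_a, type_a]  # => ["2", "UserProfile"]
p [id_b, type_b]  # => ["2", "UserProfile"]
```

A key that is absent from the query comes back as nil, so callers can check for missing parameters instead of handling a failed match.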

Answer 2

# \? matches the literal question mark; \d+ allows ids with more than one digit
match = url.match(/https?:\/\/.+?\/user\/event\?profile_id=(\d+)&profile_type=(\w+)/)
p match.captures[0] #=> "2"
p match.captures[1] #=> "UserProfile"

In your expression:

/http(s)?:\/\/(.)+\/user\/event?profile_id=(\d)&profile_type=(\w)/

Everything you put inside () in a regular expression is captured. There's no need to put the s in parentheses, because ? applies to the single preceding character. Likewise, there's no need for the (.), because + also applies to the preceding character. The ? after event has to be escaped as \? to match a literal question mark; left unescaped, it makes the t optional instead, which is why nothing matched. Finally, (\w) should be (\w+), which says "one or more word characters" ('UserProfile' is more than one character), and (\d) should be (\d+) if the id can have more than one digit.
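
As a rough illustration of those points, the sketch below runs the original pattern and a corrected one against the URL from the question. The named groups id and type and the variable names are choices made here for readability, not something the answer prescribes.

```ruby
url = 'http://localhost:3000/user/event?profile_id=2&profile_type=UserProfile'

# Pattern from the question: the bare `?` after "event" makes the "t" optional
# rather than matching a literal question mark, so nothing matches.
broken = /http(s)?:\/\/(.)+\/user\/event?profile_id=(\d)&profile_type=(\w)/
broken_result = broken.match(url)
p broken_result  # => nil

# Corrected sketch: escaped `\?`, no unneeded groups, `+` on both captures.
# %r{...} avoids having to escape every forward slash.
fixed = %r{https?://.+?/user/event\?profile_id=(?<id>\d+)&profile_type=(?<type>\w+)}
m = fixed.match(url)

profile_id   = m[:id]
profile_type = m[:type]
p profile_id    # => "2"
p profile_type  # => "UserProfile"
```

This pattern still assumes profile_id comes before profile_type, so it returns nil if the parameters arrive in the other order.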


Answer 1
2, accepted, 2014-10-03 20:15:30

Answer 2
1, 2014-10-03 19:49:10
