Matching dashes in a URL regex
I have used the following regex to get the URLs from text (e.g. "this is text http://url.com/blabla possibly some more text"):
'@(https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?)@'
This works for all URLs, but I just found out it doesn't work for shortened URLs such as the one in "blabla bla http://ff.im/-bEnA blabla", which becomes http://ff.im/ after the match.
I suspect it has to do with the dash - after the slash /.
Short answer: [\w/_\.] doesn't match -, so make it [-\w/_\.]
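As a quick check of the short answer, here is a minimal sketch with the hyphen added to the path character class. It assumes Python's re module rather than the delimiter-based syntax of the question, and the name first_url_fixed is hypothetical.

```python
import re

# Same pattern as in the question, except that the path character class
# is now [-\w/_\.] instead of [\w/_\.]. Putting the hyphen first keeps it
# literal instead of forming a range.
FIXED = re.compile(r"(https?://([-\w\.]+)+(:\d+)?(/([-\w/_\.]*(\?\S+)?)?)?)")


def first_url_fixed(text):
    """Return the first URL matched in text, or None if nothing matches."""
    match = FIXED.search(text)
    return match.group(1) if match else None


print(first_url_fixed("blabla bla http://ff.im/-bEnA blabla"))
# -> http://ff.im/-bEnA
```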
Long answer:
@ - delimiter
( - start of group
https?:// - http:// or https://
([-\w\.]+)+ - capture 1 or more hyphens, word characters or dots, 1 or more times (the host). This seems odd: I don't know what the second + is for, since the inner + already matches the whole run
(:\d+)? - optionally capture a : and some numbers (the port)
( - start of group
/ - leading slash
( - start of group
[\w/_\.]* - zero or more word characters, slashes, underscores or dots (the path + filename). You need to add a hyphen to this list, or just make it [^?\s]* - any char except ? or whitespace
(\?\S+)? - optionally capture a ? followed by anything except whitespace (the querystring)
)? - close group, make it optional
)? - close group, make it optional
) - close group
@
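The breakdown above can also be written as a commented pattern. The sketch below assumes Python's re.VERBOSE flag and takes two liberties that are not in the original regex: the path uses the broader [^?\s]* class suggested above, and the host's nested quantifier is reduced to a single +. The name find_urls and the example.com URL are made up for illustration.

```python
import re

URL_PATTERN = re.compile(
    r"""
    (                  # start of group: the whole URL
      https?://        # http:// or https://
      [-\w.]+          # host: hyphens, word characters or dots
      (?: :\d+ )?      # optionally a colon and some digits (the port)
      (?:              # start of optional group
        /              # leading slash
        [^?\s]*        # path + filename: anything except ? or whitespace
        (?: \?\S+ )?   # optionally a ? and the query string
      )?               # close group, make it optional
    )                  # close group
    """,
    re.VERBOSE,
)


def find_urls(text):
    """Return every URL found in text, in order of appearance."""
    return [match.group(1) for match in URL_PATTERN.finditer(text)]


print(find_urls("blabla bla http://ff.im/-bEnA blabla"))
# -> ['http://ff.im/-bEnA']
print(find_urls("see https://example.com:8080/a-b/c.html?x=1&y=2 now"))
# -> ['https://example.com:8080/a-b/c.html?x=1&y=2']
```

One trade-off of [^?\s]* is that it also accepts trailing punctuation such as a closing parenthesis or a comma directly after the URL, so the narrower [-\w/_\.]* class may be the safer choice when URLs appear inside running text.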