Regex - Match characters but don't include within results
I have got the following regex, which ALMOST works:
(?:^https?:\/\/)(?:www|[a-z]+)\.([^.]+)
I need the result to be the only result, or at least to sit at the same position in the match array.
So, for example, http://m.facebook.com/ matches perfectly: there is only one group.
However, if I change it to http://facebook.com/ then I get com/ in place of where facebook should be.
So I really need (?:www|[a-z]+) to be an optional check.
Edit:
What I expect is to match just facebook if ANY of the strings are as follows:
http://www.facebook.com
And obviously the https counterparts.
This is my regex now:
(?:^https?:\/\/)(?:www)?\.?([^.]+)
This is close; however, it matches the m when I try http://m.facebook.com
https://regex101.com/r/GDapY5/1
"So I need to have (?:www|[a-z]+) as an optional check really."
A ? after a group or token is generally used for "optional" bits: it means "match zero or one" of that thing, so your subpattern would be something like this:
(?:www|[a-z]+)?
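Note that the dot has to sit inside the optional group as well, otherwise the pattern still demands a leading label before the name. A minimal PHP sketch of that idea follows. The helper name captureDomainLabel is hypothetical, and the trailing \. after the capture is an addition that is not part of the question's pattern: it forces the engine to backtrack when the optional part has swallowed the name you actually want.

```php
<?php
// Hypothetical helper (not from the original post): returns the label
// captured in group 1, or null when the pattern does not match.
function captureDomainLabel(string $url): ?string
{
    // (?:(?:www|[a-z]+)\.)?  optional subdomain, including its dot
    // ([^./]+)               the label we want to capture
    // \.                     assumption: the label must be followed by a dot
    $pattern = '~^https?://(?:(?:www|[a-z]+)\.)?([^./]+)\.~';

    return preg_match($pattern, $url, $m) === 1 ? $m[1] : null;
}

var_dump(captureDomainLabel('http://m.facebook.com/'));
var_dump(captureDomainLabel('http://facebook.com/'));
var_dump(captureDomainLabel('https://www.facebook.com'));
```

All three calls should yield the same single group, which is what the question asks for. Hosts with more than one subdomain level are not handled by this sketch.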
If you're simply trying to get the second-level domain, I wouldn't bother with regex, because you'll be constantly adjusting it to handle the special cases you come across. Just split on dots and take the penultimate value:
$domain = array_reverse(explode('.', parse_url($str)['host']))[1];
Or:
$domain = array_reverse(explode('.', parse_url($str, PHP_URL_HOST)))[1];
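Wrapped up with a couple of guards, that idea might look like the sketch below. The function name secondLevelLabel is hypothetical, and the code assumes the name you want is always the penultimate label, which does not hold for suffixes such as co.uk.

```php
<?php
// Hypothetical helper: the penultimate dot-separated label of the URL's
// host, or null when there is no host or the host has a single label.
function secondLevelLabel(string $url): ?string
{
    // parse_url() gives null when the component is missing and false
    // on seriously malformed input, so accept only a non-empty string.
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null;
    }

    // "m.facebook.com" -> ["com", "facebook", "m"]; index 1 is the label.
    $labels = array_reverse(explode('.', $host));

    return $labels[1] ?? null;
}

var_dump(secondLevelLabel('http://m.facebook.com/'));
var_dump(secondLevelLabel('http://localhost/'));
```

A string without a scheme, such as facebook.com, is parsed as a path rather than a host, so the sketch returns null for it.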
Perhaps you could make the first m. part optional with (?:\w+\.)? and, instead of a capturing group, use \K to reset the starting point of the reported match.
Then match one or more word characters with \w+ and use a positive lookahead (?=\.) to assert that what follows is a dot.
For example:
^https?://(?:www)?(?:\w+\.)?\K\w+(?=\.)
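Run through PHP's preg_match, this could be used roughly as follows. It is a sketch only: the helper name matchDomainWithK is made up for illustration, and because of \K the wanted text ends up in $m[0], the whole match, rather than in a capture group.

```php
<?php
// Hypothetical helper: returns the whole reported match, or null when
// the pattern does not match.
function matchDomainWithK(string $url): ?string
{
    // \K discards everything matched so far from the reported match;
    // (?=\.) checks for a following dot without consuming it.
    $pattern = '~^https?://(?:www)?(?:\w+\.)?\K\w+(?=\.)~';

    return preg_match($pattern, $url, $m) === 1 ? $m[0] : null;
}

foreach (['http://m.facebook.com/', 'http://facebook.com/', 'https://www.facebook.com'] as $url) {
    var_dump(matchDomainWithK($url));
}
```

Since nothing is captured, the result array holds exactly one element, which keeps the value at the same position for every input.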
Edit: Or you could match m. or www. using an alternation:
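One possible shape for such an alternation, as a sketch rather than a definitive pattern, reuses the \K and lookahead idea. The helper name matchDomainWithAlternation and the choice of m and www as the only recognised prefixes are assumptions.

```php
<?php
// Hypothetical helper: strips only an "m." or "www." prefix, then
// reports the next label that is followed by a dot.
function matchDomainWithAlternation(string $url): ?string
{
    // (?:www\.|m\.)?  optional prefix, limited to the two alternatives
    // \K              restart the reported match here
    // \w+(?=\.)       the label, which must be followed by a dot
    $pattern = '~^https?://(?:www\.|m\.)?\K\w+(?=\.)~';

    return preg_match($pattern, $url, $m) === 1 ? $m[0] : null;
}

var_dump(matchDomainWithAlternation('http://m.facebook.com/'));
var_dump(matchDomainWithAlternation('https://www.facebook.com'));
var_dump(matchDomainWithAlternation('http://facebook.com/'));
```

The trade-off is that any other subdomain is treated as the name itself, so a host like mail.google.com would report mail.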