Regex: match characters, but don't include them in the result

Question

I have the following regex, which ALMOST works:

(?:^https?:\/\/)(?:www|[a-z]+)\.([^.]+)

I need the captured value to be the only result, or at least to always sit at the same position in the match array.

For example, http://m.facebook.com/ matches perfectly: there is only one group.

However, if I change it to http://facebook.com/ then I get com/ where facebook should be. So I really need (?:www|[a-z]+) to be an optional check.

Edit:

What I expect is to match just facebook if the string is any of the following:

http://www.facebook.com

http://facebook.com

http://m.facebook.com

And obviously the https counterparts.

This is my regex now:

(?:^https?:\/\/)(?:www)?\.?([^.]+)

This is close, but it matches the m when I try http://m.facebook.com

https://regex101.com/r/GDapY5/1

Answer 1

"So I need to have (?:www|[a-z]+) as an optional check really."

A ? at the end of a pattern is generally used for "optional" bits -- it means "match zero or one" of that thing, so your subpattern would be something like this:

(?:www|[a-z]+)?
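
To see what the optional group changes in practice, here is a small PHP sketch. It drops the ? into the question's pattern and prints capture group 1 for the three example URLs. The helper name firstGroup and the ~ delimiter are my own choices for illustration, not something from the question.

```php
<?php
// Hypothetical helper for illustration: returns capture group 1,
// or null when the pattern does not match.
function firstGroup(string $pattern, string $url): ?string
{
    return preg_match($pattern, $url, $m) === 1 ? $m[1] : null;
}

// The question's pattern with the subdomain group made optional via "?".
$pattern = '~^https?://(?:www|[a-z]+)?\.([^.]+)~';

foreach (['http://m.facebook.com/', 'http://www.facebook.com', 'http://facebook.com/'] as $url) {
    echo $url, ' => ', var_export(firstGroup($pattern, $url), true), PHP_EOL;
}
// http://m.facebook.com/ => 'facebook'
// http://www.facebook.com => 'facebook'
// http://facebook.com/ => 'com/'
```

The bare http://facebook.com/ still gives com/, because [a-z]+ is free to consume facebook itself before the dot. That is the kind of special case that keeps needing adjustment.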

If you're simply trying to get the second-level domain, I wouldn't bother with regex, because you'll be constantly adjusting it to handle special cases you come across. Just split the host on dots and take the penultimate value:

$domain = array_reverse(explode('.', parse_url($str)['host']))[1];

Or:

$domain = array_reverse(explode('.', parse_url($str, PHP_URL_HOST)))[1];
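
If the input might not be a clean URL, the one-liner can be wrapped so it fails quietly instead of raising a notice. This is only a sketch: the function name secondLevelDomain and the choice to return null on bad input are assumptions of mine, not part of the original answer.

```php
<?php
// Hypothetical wrapper around the split-on-dots idea above.
// Returns the penultimate host label, or null if there is no usable host.
function secondLevelDomain(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // no scheme/host, or an unparseable URL
    }

    $labels = explode('.', $host);
    $count = count($labels);

    // Need at least "name.tld" to have a penultimate label.
    return $count >= 2 ? $labels[$count - 2] : null;
}

foreach (['http://www.facebook.com', 'https://m.facebook.com/', 'http://facebook.com'] as $url) {
    echo $url, ' => ', var_export(secondLevelDomain($url), true), PHP_EOL;
}
```

Note that "penultimate label" is not the same as "registered name" for multi-part suffixes: a host such as example.co.uk would give co with this approach.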

Answer 2

Perhaps you could make the leading m. part optional with (?:\w+\.)?. Instead of a capturing group, you could use \K to reset the starting point of the reported match.

Then match one or more word characters with \w+ and use a positive lookahead (?=\.) to assert that what follows is a dot.

For example:

^https?://(?:www)?(?:\w+\.)?\K\w+(?=\.)

Edit: Or you could match m. or www. explicitly using an alternation:

^https?://(?:m\.|www\.)?\K\w+(?=\.)

Demo (PHP)
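
For reference, a minimal PHP sketch of the \K approach, using the alternation pattern. Because \K discards everything matched before it, the wanted text ends up in $m[0] and no capture group is needed. The function name domainLabel and the ~ delimiter are assumptions made for this example.

```php
<?php
// Hypothetical helper: returns the text matched after \K,
// or null if the URL does not match the pattern.
function domainLabel(string $url): ?string
{
    // Optional "m." or "www." prefix, then \K resets the match start,
    // then word characters that must be followed by a dot.
    $pattern = '~^https?://(?:m\.|www\.)?\K\w+(?=\.)~';

    return preg_match($pattern, $url, $m) === 1 ? $m[0] : null;
}

foreach (['http://www.facebook.com', 'http://facebook.com', 'https://m.facebook.com'] as $url) {
    echo $url, ' => ', var_export(domainLabel($url), true), PHP_EOL;
}
// http://www.facebook.com => 'facebook'
// http://facebook.com => 'facebook'
// https://m.facebook.com => 'facebook'
```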

Answer 1: score 2, 2018-04-13 15:12:33

Answer 2: score 1, accepted, 2018-04-13 15:37:33