Match specific words in a URL with regex

Question

I must admit I've never gotten used to regex, but I recently ran into a problem where the workaround would have been more of a pain than using one. I need to be able to match anything that follows this pattern at the beginning of a string: {any_url_safe_word} + ( "/http://" || "/https://" || "www." ) + {any word}. So the following should match:

cars/http://google.com#test
cars/https://google.com#test
cars/www.google.com#test

The following shouldn't match:

cars/httdp://google.com#test
cars/http:/google.com#test

What I've tried so far is: ^[\\w]{1,500}\\/[(http\\:\\/\\/)|(https:\\/\\/])|([www\\.])]{0,50} , but that matches cars/http from cars/httpd://google.com .

Answer 1

This regex could do it:

^[\w\d]+\/(?:https?:\/\/)?(?:www\.)?[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}

And if you want to capture everything that comes after it, you can just add (.*) to the end.
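
As a quick check, here is a minimal Python sketch that runs the pattern above, with (.*) appended, against the sample strings from the question. Python is my choice for illustration (the question does not name a language), and the names PATTERN and rest_after_url are made up for this example.

```python
import re

# The pattern from this answer, with (.*) appended to capture the remainder.
PATTERN = re.compile(
    r"^[\w\d]+\/(?:https?:\/\/)?(?:www\.)?[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(.*)"
)


def rest_after_url(text):
    """Return whatever follows the matched prefix, or None if nothing matches."""
    m = PATTERN.match(text)
    return m.group(1) if m else None


samples = [
    "cars/http://google.com#test",
    "cars/https://google.com#test",
    "cars/www.google.com#test",
    "cars/httdp://google.com#test",
    "cars/http:/google.com#test",
]

for s in samples:
    print(s, "->", rest_after_url(s))
```

The first three samples should yield #test as the captured remainder, and the last two should yield None.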

And since the more or less general list of URL-safe characters seems to be ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;= (Source), you may include that too, so you'll get (after simplification):

^[!#$&-.0-;=?-\[\]_a-z~]+\/(?:https?:\/\/)?(?:www\.)?[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}
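
Note that in both patterns in this answer the scheme group and the www. group are optional, so a string such as cars/google.com matches as well. If the prefix is meant to be mandatory, as the question's wording suggests, one possible adjustment is to merge the two groups into a single required alternation. The sketch below is my own variant in Python, not part of the original answer; STRICT and is_prefixed_url are hypothetical names.

```python
import re

# Hypothetical stricter variant: the prefix must be http://, https:// or www.
STRICT = re.compile(
    r"^[!#$&-.0-;=?-\[\]_a-z~]+/(?:https?://(?:www\.)?|www\.)"
    r"[a-zA-Z0-9.-]+\.[a-zA-Z]{2,3}"
)


def is_prefixed_url(text):
    """True if text starts with word + '/' + a required scheme or www. prefix."""
    return STRICT.match(text) is not None


for s in ["cars/http://google.com#test", "cars/www.google.com#test", "cars/google.com"]:
    print(s, "->", is_prefixed_url(s))
```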

Answer 2

<?php
$words = array(
    'cars/http://google.com#test',
    'cars/https://google.com#test',
    'cars/www.google.com#test',
    'cars/httdp://google.com#test',
    'cars/http:/google.com#test',
    'c a r s/http:/google.com#test'
    );

foreach($words as $value)
{
    /*
      ^             - start of the string
      \S+           - at least one non-space symbol
      \/            - slash
      (?: ... )     - group, so that | applies only to the two prefixes
      https?:\/\/   - http with optional s, then ://
      |             - or
      www\.         - www.
      .+            - at least one symbol
     */
    if (preg_match('/^\S+\/(?:https?:\/\/|www\.).+/', $value))
    {
        print $value. " good\n";
    }
    else
    {
        print $value. " bad\n";
    }
}

Prints:

cars/http://google.com#test good
cars/https://google.com#test good
cars/www.google.com#test good
cars/httdp://google.com#test bad
cars/http:/google.com#test bad
c a r s/http:/google.com#test bad
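
A point worth keeping in mind with patterns of this shape: | has the lowest precedence in a regex, so without a group around the alternatives the pattern splits into two independent branches, and the second branch is no longer anchored to the start of the string. The following Python sketch compares an ungrouped and a grouped form. Python is used only for illustration, and the sample string and the names UNGROUPED and GROUPED are made up.

```python
import re

# Ungrouped: parsed as (^\S+/https?://) OR (www\. followed by anything).
UNGROUPED = re.compile(r"^\S+/(https?://)|(www\.).+")

# Grouped: the alternation is confined to the prefix after the slash.
GROUPED = re.compile(r"^\S+/(?:https?://|www\.).+")

# Hypothetical input that should NOT count as a match.
sample = "not a match, see www.example.com"

print(bool(UNGROUPED.search(sample)))  # True: the bare www. branch matches mid-string
print(bool(GROUPED.search(sample)))    # False: no word/prefix at the start
```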

Answer 3

Check out the demo.

[a-z0-9-_.~]+/(https?://|www\\.)[a-z0-9]+\\.[a-z]{2,6}([/?#a-z0-9-_.~])*

Edit: took @CD001's comment into account. Be sure to use the i modifier if the match should be case-insensitive.
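
To illustrate the effect of the i modifier, here is a minimal Python sketch, where re.IGNORECASE plays that role. The pattern is written with single backslashes and with a-z in the TLD class, which is my reading of the answer rather than something it states. The names CASE_SENSITIVE and CASE_INSENSITIVE and the mixed-case sample are invented for the example.

```python
import re

# Assumed reading of the answer's pattern (single backslashes, [a-z] for the TLD).
SOURCE = r"[a-z0-9-_.~]+/(https?://|www\.)[a-z0-9]+\.[a-z]{2,6}([/?#a-z0-9-_.~])*"

CASE_SENSITIVE = re.compile(SOURCE)
CASE_INSENSITIVE = re.compile(SOURCE, re.IGNORECASE)

mixed = "Cars/HTTP://Google.COM#Test"

# re.match anchors at the start of the string, as the question requires.
print(CASE_SENSITIVE.match(mixed) is not None)    # False: 'C' is outside a-z
print(CASE_INSENSITIVE.match(mixed) is not None)  # True
```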

Answers: 3

Answer 1
3 2013-11-25 15:52:15

Answer 2
0 2013-11-25 15:52:05

Answer 3
0 2013-11-25 16:06:14
