
How to validate domain name in PHP?

Is it possible without using a regular expression?

For example, I want to check that a string is a valid domain:

domain-name
abcd
example

Are valid domains. These are invalid of course:

domaia@name
ab$%cd

And so on. So basically it should start with an alphanumeric character, then there may be more alnum characters plus also a hyphen. And it must end with an alnum character, too.

If it's not possible, could you suggest a regexp pattern to do this?

EDIT:

Why doesn't this work? Am I using preg_match incorrectly?

$domain = '@djkal';
$regexp = '/^[a-zA-Z0-9][a-zA-Z0-9\-\_]+[a-zA-Z0-9]$/';
if (false === preg_match($regexp, $domain)) {
    throw new Exception('Domain invalid');
}

<?php
function is_valid_domain_name($domain_name)
{
    return (preg_match("/^([a-z\d](-*[a-z\d])*)(\.([a-z\d](-*[a-z\d])*))*$/i", $domain_name) //valid chars check
            && preg_match("/^.{1,253}$/", $domain_name) //overall length check
            && preg_match("/^[^\.]{1,63}(\.[^\.]{1,63})*$/", $domain_name)   ); //length of each label
}
?>

Test cases:

is_valid_domain_name? [a]                       Y
is_valid_domain_name? [0]                       Y
is_valid_domain_name? [a.b]                     Y
is_valid_domain_name? [localhost]               Y
is_valid_domain_name? [google.com]              Y
is_valid_domain_name? [news.google.co.uk]       Y
is_valid_domain_name? [xn--fsqu00a.xn--0zwm56d] Y
is_valid_domain_name? [goo gle.com]             N
is_valid_domain_name? [google..com]             N
is_valid_domain_name? [google.com ]             N
is_valid_domain_name? [google-.com]             N
is_valid_domain_name? [.google.com]             N
is_valid_domain_name? [<script]                 N
is_valid_domain_name? [alert(]                  N
is_valid_domain_name? [.]                       N
is_valid_domain_name? [..]                      N
is_valid_domain_name? [ ]                       N
is_valid_domain_name? [-]                       N
is_valid_domain_name? []                        N

With this you will not only be checking if the domain has a valid format, but also if it is active / has an IP address assigned to it.

$domain = "stackoverflow.com";

if(filter_var(gethostbyname($domain), FILTER_VALIDATE_IP))
{
    return TRUE;
}

Note that this method requires the DNS entries to be active, so if you need a domain string to be validated without being in the DNS, use the regular expression method given by velcrow above.

Also, this function is not intended to validate a URL string; use FILTER_VALIDATE_URL for that. We do not use FILTER_VALIDATE_URL for a domain because a domain string is not a valid URL.
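To illustrate that last point, here is a quick sketch (example.com is just a placeholder host) showing that a bare domain fails FILTER_VALIDATE_URL, while the same name with a scheme in front passes:

```php
<?php
// A bare domain has no scheme, so FILTER_VALIDATE_URL rejects it.
$bare = filter_var('example.com', FILTER_VALIDATE_URL);

// With a scheme in front, the same name is accepted and returned unchanged.
$withScheme = filter_var('http://example.com', FILTER_VALIDATE_URL);

var_dump($bare);        // bool(false)
var_dump($withScheme);  // string(18) "http://example.com"
```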

PHP 7

// Validate a domain name
var_dump(filter_var('mandrill._domainkey.mailchimp.com', FILTER_VALIDATE_DOMAIN));
# string(33) "mandrill._domainkey.mailchimp.com"

// Validate a hostname (here, the underscore is invalid)
var_dump(filter_var('mandrill._domainkey.mailchimp.com', FILTER_VALIDATE_DOMAIN, FILTER_FLAG_HOSTNAME));
# bool(false)

It is not documented here: http://www.php.net/filter.filters.validate and a bug request for this is located here: https://bugs.php.net/bug.php?id=72013

Use checkdnsrr: http://php.net/manual/en/function.checkdnsrr.php

$domain = "stackoverflow.com";

checkdnsrr($domain , "A");

//returns true if has a dns A record, false otherwise

Firstly, you should clarify whether you mean:

  1. individual domain name labels
  2. entire domain names (i.e. multiple dot-separated labels)
  3. host names

The reason the distinction is necessary is that a label can technically include any characters, including the NUL, @ and '.' characters. DNS is 8-bit capable and it's perfectly possible to have a zone file containing an entry reading "an\\0odd\\.l@bel". It's not recommended of course, not least because people would have difficulty telling a dot inside a label from those separating labels, but it is legal.

However, URLs require a host name in them, and those are governed by RFCs 952 and 1123. Valid host names are a subset of domain names. Specifically only letters, digits and hyphen are allowed. Furthermore the first and last characters cannot be a hyphen. RFC 952 didn't permit a number for the first character, but RFC 1123 subsequently relaxed that.

Hence:

  • a - valid
  • 0 - valid
  • a- - invalid
  • ab - valid
  • xn--dasdkhfsd - valid (punycode encoding of an IDN)

Off the top of my head I don't think it's possible to invalidate the a- example with a single simple regexp. The best I can come up with to check a single host label is:

if (preg_match('/^[a-z\d][a-z\d-]{0,62}$/i', $label) &&
   !preg_match('/-$/', $label))
{
    # label is legal within a hostname
}
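For what it is worth, the trailing-hyphen case can also be folded into one pattern by making everything after the first character optional and requiring that optional part to end in a letter or digit. A minimal sketch, where is_valid_host_label is a hypothetical helper name and not part of the answer above:

```php
<?php
// Hypothetical helper: checks one host label (1-63 chars, letters, digits,
// hyphens, no leading or trailing hyphen) with a single pattern.
function is_valid_host_label(string $label): bool
{
    // First char alphanumeric; then, optionally, up to 61 alphanumeric or
    // hyphen chars followed by one final alphanumeric char.
    // \A and \z anchor at the very start and end of the string.
    return preg_match('/\A[a-z\d](?:[a-z\d-]{0,61}[a-z\d])?\z/i', $label) === 1;
}

var_dump(is_valid_host_label('a'));              // bool(true)
var_dump(is_valid_host_label('a-'));             // bool(false)
var_dump(is_valid_host_label('xn--dasdkhfsd'));  // bool(true)
```

The two-step check above and this pattern should agree on ordinary input; the single pattern simply moves the "no trailing hyphen" rule into the optional group.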

To further complicate matters, some domain name entries (typically SRV records) use labels prefixed with an underscore, e.g. _sip._udp.example.com. These are not host names, but are legal domain names.
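One way to reflect that distinction in code is a flag that decides whether a leading underscore is tolerated. A rough sketch, assuming labels are otherwise restricted to the host-name character set; the function name and the flag are made up for illustration:

```php
<?php
// Hypothetical helper: validates a dot-separated name label by label.
// With $allowUnderscore = true, labels such as "_sip" or "_udp" are accepted.
function is_valid_dns_name(string $name, bool $allowUnderscore = false): bool
{
    if ($name === '' || strlen($name) > 253) {
        return false;
    }
    $pattern = $allowUnderscore
        ? '/\A_?[a-z\d](?:[a-z\d-]{0,61}[a-z\d])?\z/i'
        : '/\A[a-z\d](?:[a-z\d-]{0,61}[a-z\d])?\z/i';

    foreach (explode('.', $name) as $label) {
        // The explicit length check keeps "_" plus 63 characters from slipping through.
        if (strlen($label) > 63 || preg_match($pattern, $label) !== 1) {
            return false;
        }
    }
    return true;
}

var_dump(is_valid_dns_name('_sip._udp.example.com'));        // bool(false)
var_dump(is_valid_dns_name('_sip._udp.example.com', true));  // bool(true)
```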

I think once you have isolated the domain name, say, using Erklan's idea:

$myUrl = "http://www.domain.com/link.php";
$myParsedURL = parse_url($myUrl);
$myDomainName= $myParsedURL['host'];

you could use:

// FILTER_VALIDATE_URL expects a scheme, so wrap the bare host before testing it
if( false === filter_var( 'http://' . $myDomainName . '/', FILTER_VALIDATE_URL ) ) {
// failed test

}

PHP 5's filter functions are meant for just such a purpose, I would have thought.

I realise it does not strictly answer your question, as it does not use a regex.

Here is another way without regex.

$myUrl = "http://www.domain.com/link.php";
$myParsedURL = parse_url($myUrl);
$myDomainName= $myParsedURL['host'];
$ipAddress = gethostbyname($myDomainName);
if($ipAddress == $myDomainName)
{
   echo "There is no url";
}
else
{
   echo "url found";
}

A regular expression is the most effective way of validating a domain. If you're dead set on not using a regular expression (which IMO is stupid), then you could split the domain into its parts:

  • www. / sub-domain
  • domain name
  • .extension

You would then have to check each character in some sort of a loop to see that it matches a valid domain.
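The split-and-loop idea could look roughly like this. It is only a sketch: is_valid_domain_no_regex is a made-up name, the length limits (253 overall, 63 per label) are taken from the other answers in this thread, and ctype_alnum is assumed to run in the default C locale so that only ASCII letters and digits count:

```php
<?php
// Hypothetical helper: regex-free check, label by label, character by character.
function is_valid_domain_no_regex(string $domain): bool
{
    $length = strlen($domain);
    if ($length === 0 || $length > 253) {
        return false;
    }
    foreach (explode('.', $domain) as $label) {
        $labelLength = strlen($label);
        if ($labelLength === 0 || $labelLength > 63) {
            return false;  // empty label ("a..b") or label too long
        }
        if ($label[0] === '-' || $label[$labelLength - 1] === '-') {
            return false;  // labels may not start or end with a hyphen
        }
        for ($i = 0; $i < $labelLength; $i++) {
            $char = $label[$i];
            if ($char !== '-' && !ctype_alnum($char)) {
                return false;  // anything but letters, digits and hyphens
            }
        }
    }
    return true;
}

var_dump(is_valid_domain_no_regex('www.domain-name.com'));  // bool(true)
var_dump(is_valid_domain_no_regex('ab$%cd'));               // bool(false)
```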

Like I said, it's much more effective to use a regular expression.

Your regular expression is fine, but you're not using preg_match right. It returns an int (0 or 1), not a boolean. Just write if(!preg_match($regex, $string)) { ... }
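To make the difference concrete, here is a small sketch using the pattern and input from the question. preg_match returns 1 on a match, 0 on no match, and false only when an error occurs, so a strict comparison against false never fires for an ordinary non-match:

```php
<?php
$domain = '@djkal';
$regexp = '/^[a-zA-Z0-9][a-zA-Z0-9\-\_]+[a-zA-Z0-9]$/';

$result = preg_match($regexp, $domain);

var_dump($result);            // int(0): no match, but not an error
var_dump(false === $result);  // bool(false): so the exception was never thrown
var_dump(!$result);           // bool(true): the loose check catches the non-match
```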

If you want to check whether a particular domain name or IP address exists or not, you can also use checkdnsrr.
Here is the doc: http://php.net/manual/en/function.checkdnsrr.php

If you don't want to use regular expressions, you can try this:

$str = 'domain-name';

if (ctype_alnum(str_replace('-', '', $str)) && $str[0] != '-' && $str[strlen($str) - 1] != '-') {
    echo "Valid domain\n";
} else {
    echo "Invalid domain\n";
}

But as said, regexps are the best tool for this.

A valid domain is for me something I'm able to register, or at least something that looks like I could register it. This is the reason why I like to separate this from "localhost"-names.

And finally I was interested in the main question, whether avoiding regex would be faster, and this is my result:

<?php
function filter_hostname($name, $domain_only=false) {
    // entire hostname has a maximum of 253 ASCII characters
    if (!($len = strlen($name)) || $len > 253
    // .example.org and localhost- are not allowed
    || $name[0] == '.' || $name[0] == '-' || $name[ $len - 1 ] == '.' || $name[ $len - 1 ] == '-'
    // a.de is the shortest possible domain name and needs one dot
    || ($domain_only && ($len < 4 || strpos($name, '.') === false))
    // several combinations are not allowed
    || strpos($name, '..') !== false
    || strpos($name, '.-') !== false
    || strpos($name, '-.') !== false
    // only letters, numbers, dot and hyphen are allowed
/*
    // a little bit slower
    || !ctype_alnum(str_replace(array('-', '.'), '', $name))
*/
    || preg_match('/[^a-z\d.-]/i', $name)
    ) {
        return false;
    }
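    // each label may contain up to 63 characters
    $offset = 0;
    while (($pos = strpos($name, '.', $offset)) !== false) {
        if ($pos - $offset > 63) {
            return false;
        }
        $offset = $pos + 1;
    }
    // the loop stops at the last dot, so check the final label as well
    if ($len - $offset > 63) {
        return false;
    }
    return $name;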
}
?>

Benchmark results compared with velcrow's function over 10000 iterations (the complete results contain many code variants; it was interesting to find the fastest):

filter_hostname($domain);// $domains: 0.43556308746338 $real_world: 0.33749794960022
is_valid_domain_name($domain);// $domains: 0.81832790374756 $real_world: 0.32248711585999

$real_world did not contain extremely long domain names, to produce better results. And now I can answer your question: with ctype_alnum() it would be possible to do it without regex, but as preg_match() was faster I would prefer that.
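The timings above come from the author's own runs. A comparable measurement could be set up roughly as follows; this is only a sketch, and the helper name benchmark, the sample list and the two closures are placeholders rather than the code that produced the numbers above:

```php
<?php
// Hypothetical timing helper: runs $fn over every sample $iterations times
// and returns the elapsed wall-clock time in seconds.
function benchmark(callable $fn, array $samples, int $iterations = 10000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        foreach ($samples as $sample) {
            $fn($sample);
        }
    }
    return microtime(true) - $start;
}

$samples = ['example.com', 'news.google.co.uk', 'goo gle.com', 'google..com'];

// Two ways of rejecting characters outside letters, digits, dot and hyphen.
$withRegex = function (string $name): bool {
    return !preg_match('/[^a-z\d.-]/i', $name);
};
$withCtype = function (string $name): bool {
    return ctype_alnum(str_replace(['-', '.'], '', $name));
};

printf("preg_match:  %.5f s\n", benchmark($withRegex, $samples));
printf("ctype_alnum: %.5f s\n", benchmark($withCtype, $samples));
```

Absolute numbers will vary with the PHP version and the machine, so only the relative difference on the same setup is meaningful.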

If you don't like the fact that "local.host" is a valid domain name, use this function instead, which validates against a public TLD list. Maybe someone will find the time to combine both.

The correct answer is that you don't ... you let a unit tested tool do the work for you:

// return '' if host invalid --
private function setHostname($host = '')
{
    $ret = (!empty($host)) ? $host : '';
    if(filter_var('http://'.$ret.'/', FILTER_VALIDATE_URL) === false) {
        $ret = '';
    }
    return $ret;
}

Further reading: https://www.w3schools.com/php/filter_validate_url.asp

I know that this is an old question, but it was the first answer on a Google search, so it seems relevant. I recently had this same problem. The solution in my case was to just use the Public Suffix List:

https://publicsuffix.org/learn/

The suggested language-specific libraries listed there should all allow for easy validation of not just domain format, but also top-level domain validity.

If you can run shell commands, the following is the best way to determine if a domain is registered.

This function returns false if the domain name isn't registered; otherwise it returns the domain name.

function get_domain_name($domain) { 
    //Step 1 - Return false if any shell sensitive chars or space/tab were found
    if(escapeshellcmd($domain)!=$domain || count(explode(".", $domain))<2 || preg_match("/[\s\t]/", $domain)) {
            return false;
    }

    //Step 2 - Get the root domain in-case of subdomain
    $domain = (count(explode(".", $domain))>2 ? strtolower(explode(".", $domain)[count(explode(".", $domain))-2].".".explode(".", $domain)[count(explode(".", $domain))-1]) : strtolower($domain));

    //Step 3 - Run shell command 'dig' to get SOA servers for the domain extension
    $ns = shell_exec(escapeshellcmd("dig +short SOA ".escapeshellarg(explode(".", $domain)[count(explode(".", $domain))-1]))); 

    //Step 4 - Return false if invalid extension (returns NULL), or take the first server address out of output
    if($ns===NULL) {
            return false;
    }
    $ns = (((preg_split('/\s+/', $ns)[0])[strlen(preg_split('/\s+/', $ns)[0])-1]==".") ? substr(preg_split('/\s+/', $ns)[0], 0, strlen(preg_split('/\s+/', $ns)[0])-1) : preg_split('/\s+/', $ns)[0]);

    //Step 5 - Run another dig using the obtained address for our domain, and return false if returned NULL else return the domain name. This assumes an authoritative NS is assigned when a domain is registered, can be improved to filter more accurately.
    $ans = shell_exec(escapeshellcmd("dig +noall +authority ".escapeshellarg("@".$ns)." ".escapeshellarg($domain))); 
    return (($ans===NULL) ? false : ((strpos($ans, $ns)>-1) ? false : $domain));
}

Pros

  1. Works on any domain, while PHP DNS functions may fail on some domains. (my .pro domain failed on PHP DNS)
  2. Works on fresh domains without any DNS (like A) records
  3. Unicode friendly

Cons

  1. Usage of shell execution, probably
<?php

if(is_valid_domain('https://www.google.com')==1){
  echo 'Valid';
}else{
   echo 'InValid';
}

 function is_valid_domain($url){

    $validation = FALSE;
    /*Parse URL*/    
    $urlparts = parse_url(filter_var($url, FILTER_SANITIZE_URL));

    /*Check host exist else path assign to host*/    
    if(!isset($urlparts['host'])){
        $urlparts['host'] = $urlparts['path'];
    }

    if($urlparts['host']!=''){
        /*Add scheme if not found*/
        if (!isset($urlparts['scheme'])){
            $urlparts['scheme'] = 'http';
        }

        /*Validation*/
        if(checkdnsrr($urlparts['host'], 'A') && in_array($urlparts['scheme'],array('http','https')) && ip2long($urlparts['host']) === FALSE){
            $urlparts['host'] = preg_replace('/^www\./', '', $urlparts['host']);
            $url = $urlparts['scheme'].'://'.$urlparts['host']. "/";

            if (filter_var($url, FILTER_VALIDATE_URL) !== false && @get_headers($url)) {
                $validation = TRUE;
            }
        }
    }

    return $validation;

}
?>

After reading all the issues with the added functions I decided I need something more accurate. Here's what I came up with that works for me.

If you need to specifically validate hostnames (they must start and end with an alphanumeric character and contain only alphanumerics and hyphens) this function should be enough.

function is_valid_domain($domain) {
    // Check for starting and ending hyphen(s)
    if(preg_match('/-\./', $domain) || substr($domain, 0, 1) == '-') {
        return false;
    }

    // Detect and convert international UTF-8 domain names to IDNA ASCII form
    if(mb_detect_encoding($domain) != "ASCII") {
        $idn_dom = idn_to_ascii($domain);
    } else {
        $idn_dom = $domain;
    }

    // Validate
    if(filter_var($idn_dom, FILTER_VALIDATE_DOMAIN, FILTER_FLAG_HOSTNAME) != false) {
        return true;
    }
    return false;
}

Note that this function will work on most LTR languages (I haven't tested all of them). It will not work on RTL languages.

is_valid_domain('a');                                                                       Y
is_valid_domain('a.b');                                                                     Y
is_valid_domain('localhost');                                                               Y
is_valid_domain('google.com');                                                              Y
is_valid_domain('news.google.co.uk');                                                       Y
is_valid_domain('xn--fsqu00a.xn--0zwm56d');                                                 Y
is_valid_domain('area51.com');                                                              Y
is_valid_domain('japanese.コム');                                                           Y
is_valid_domain('домейн.бг');                                                               Y
is_valid_domain('goo gle.com');                                                             N
is_valid_domain('google..com');                                                             N
is_valid_domain('google-.com');                                                             N
is_valid_domain('.google.com');                                                             N
is_valid_domain('<script');                                                                 N
is_valid_domain('alert(');                                                                  N
is_valid_domain('.');                                                                       N
is_valid_domain('..');                                                                      N
is_valid_domain(' ');                                                                       N
is_valid_domain('-');                                                                       N
is_valid_domain('');                                                                        N
is_valid_domain('-günter-.de');                                                             N
is_valid_domain('-günter.de');                                                              N
is_valid_domain('günter-.de');                                                              N
is_valid_domain('sadyasgduysgduysdgyuasdgusydgsyudgsuydgusydgsyudgsuydusdsdsdsaad.com');    N
is_valid_domain('2001:db8::7');                                                             N
is_valid_domain('876-555-4321');                                                            N
is_valid_domain('1-876-555-4321');                                                          N

Check the PHP function checkdnsrr

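function validate_email($email){

   // eregi() was removed in PHP 7; preg_match() with the i modifier replaces it
   $exp = "/^[a-z'0-9]+([._-][a-z'0-9]+)*@([a-z0-9]+([._-][a-z0-9]+))+$/i";

   if(preg_match($exp, $email)){

      // array_pop() takes its argument by reference, so store the parts first
      $parts = explode("@", $email);

      if(checkdnsrr(array_pop($parts), "MX")){
        return true;
      }else{
        return false;
      }

   }else{

      return false;

   }
}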

This is validation of a domain name in JavaScript:

<script>
function frmValidate() {
 var val=document.frmDomin.name.value;
 if (/^[a-zA-Z0-9][a-zA-Z0-9-]{1,61}[a-zA-Z0-9](?:\.[a-zA-Z]{2,})+$/.test(val)){
      alert("Valid Domain Name");
      return true;
 } else {
      alert("Enter Valid Domain Name");
      document.frmDomin.name.focus(); // val is a string; focus the form field instead
      return false;
 }
}
</script>

This is simple. Some PHP engines have a problem with split(). The code below will work.

<?php
$email = "vladimiroliva@ymail.com"; 
$domain = strtok($email, "@");
$domain = strtok("@");
if (@getmxrr($domain,$mxrecords)) 
   echo "This ". $domain." EXIST!"; 
else 
   echo "This ". $domain." does not exist!"; 
?>
