
PHP filter_var URL


To validate a URL path from user input, I'm using PHP's filter_var function. The input contains only the path (/path/path/script.php).

Before validating, I prepend the host to the path. While testing the input validation, I noticed what looks like strange behavior in the URL filter.

Code:

$url = "http://www.domain.nl/http://www.google.nl/modules/authorize/test/normal.php";
var_dump(filter_var($url, FILTER_VALIDATE_URL, FILTER_FLAG_HOST_REQUIRED)); //valid

Can someone explain why this is a valid URL? Thanks!

The short answer: PHP's FILTER_VALIDATE_URL checks the URL only against RFC 2396, and your URL, although odd, is valid according to that standard.

Long answer:

The filter you are using is documented as compliant with RFC 2396, so let's check that standard.

The regular expression the RFC lists for parsing a URL is:

^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
 12            3  4          5       6  7        8 9

Where:

scheme    = $2
authority = $4
path      = $5
query     = $7
fragment  = $9

As we can see, the ":" character is reserved only in the context of the scheme; from that point onwards ":" is fair game (the text of the standard supports this). For example, the http: scheme uses it freely to denote a port. A slash can also appear anywhere, and nothing prohibits a URL from having "//" somewhere in the middle. So "http://" in the middle of the path should be valid.

Let's look at your URL and try to match it against this regexp:

$url = "http://www.domain.nl/http://www.google.nl/modules/authorize/test/normal.php";
//Escaped a couple slashes to make things work, still the same regexp
$result_rfc = preg_match('/^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?/',$url);
echo '<p>'.$result_rfc.'</p>';

The test returns 1, so the URL fits the generic syntax. This is to be expected: as we have seen, the rules do not declare a URL with something like "http://" in the middle to be invalid. Keep in mind that this pattern is meant for splitting a URI into its components and is very permissive, so a match mainly shows that nothing in the generic syntax forbids the string. PHP simply mirrors this behaviour with FILTER_VALIDATE_URL.
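To see why the URL passes, it can help to print what each capture group actually holds. The following is only a sketch: the `$parts` array and the `~` delimiter are my own choices (the delimiter just avoids escaping the slashes), and the pattern is otherwise the same one quoted from the RFC.

```php
<?php
// Same example URL as in the question.
$url = "http://www.domain.nl/http://www.google.nl/modules/authorize/test/normal.php";

// RFC 2396 parsing pattern; '~' is used as the delimiter so "/" needs no escaping.
$pattern = '~^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?~';

preg_match($pattern, $url, $m);

// Trailing groups that did not take part in the match are absent from $m,
// so fall back to an empty string for them.
$parts = [
    'scheme'    => $m[2] ?? '',
    'authority' => $m[4] ?? '',
    'path'      => $m[5] ?? '',
    'query'     => $m[7] ?? '',
    'fragment'  => $m[9] ?? '',
];

print_r($parts);
```

The scheme comes out as "http", the authority as "www.domain.nl", and everything after that, including the second "http://", lands in the path, because the path group only excludes "?" and "#".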

If you want a more rigorous test, you will need to write the code yourself. For example, you can reject any URL in which "://" does not appear exactly once:

$url = "http://www.domain.nl/http://www.google.nl/modules/authorize/test/normal.php";
$result_rfc = preg_match('/^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?/',$url);
if (substr_count($url,'://') != 1) {
    $result_non_rfc = false;
} else {
    $result_non_rfc = $result_rfc;
}

You can also try adjusting the regular expression itself.
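One possible adjustment, as a rough sketch rather than a definitive validator: restrict the path group so that it cannot contain "://", and anchor the pattern at the end of the string. The function name `is_strict_url` is hypothetical, and the rule itself (no "://" inside the path) is an assumption about what "stricter" should mean here. The query string is deliberately left alone, so something like ?next=http://... would still pass. FILTER_FLAG_HOST_REQUIRED is omitted because it was deprecated in PHP 7.3 and removed in PHP 8.0.

```php
<?php
// Hypothetical helper: accept a URL only if PHP's own filter accepts it
// AND its path contains no further "://".
function is_strict_url(string $url): bool
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // RFC 2396 parsing pattern with two changes:
    //  - the path group consumes a character only if "://" does not start there
    //  - \z anchors the match at the very end of the string
    $pattern = '~^(([^:/?#]+):)?(//([^/?#]*))?((?:(?!://)[^?#])*)(\?([^#]*))?(#(.*))?\z~';

    return preg_match($pattern, $url) === 1;
}

$nested = "http://www.domain.nl/http://www.google.nl/modules/authorize/test/normal.php";
$plain  = "http://www.domain.nl/modules/authorize/test/normal.php";

var_dump(is_strict_url($nested)); // bool(false)
var_dump(is_strict_url($plain));  // bool(true)
```

The end anchor matters: without it the pattern can match just a prefix of the string, so the restriction on the path would never cause a rejection.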
