What is the most secure way to validate a URL in PHP?

Question

I'm working on a snippet and need to validate URLs so that I know I'm sending data to the correct URL. For this I am using the filter_var() function.

I started running into issues with this as soon as I began testing. This is my code:

<?php

function post($webLink)
{
    $url = filter_var($webLink, FILTER_SANITIZE_URL);

    if (filter_var($url, FILTER_VALIDATE_URL)) {
        echo 'Correct';
    } else {
        echo 'Please check your url.';
    }
}

post('h://www.google.com');
?>

A lot of invalid links were validated as correct URLs, including the one in the code above.

The links that passed validation are:

    ht1tp://www.google.com
    h://ww.google.com
    http://www.google.
    http://www.google.343

I refuse to believe that the function itself is validating these links as correct; I'd like to think that something is wrong with my if (filter_var($url, FILTER_VALIDATE_URL)) line. I need clarification on how to use this properly, please. Thanks.

Answer 1

First, only validate input. Never sanitize input. Do not sanitize until the data is ready to become output. This is a general rule for handling data across the board, and it is just as important for displaying URLs securely as it is for preventing XSS attacks, SQL injection, and the like.
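As a rough illustration of why sanitizing before validating is a problem, the sketch below compares the two orders on an input that contains a space. The helper names isValidUrl() and escapeForHtml() are made up for this example and are not part of PHP or the question's code.

```php
<?php
// Hypothetical helper: validates the input exactly as received.
function isValidUrl(string $input): bool
{
    return filter_var($input, FILTER_VALIDATE_URL) !== false;
}

// Hypothetical helper: escapes a value at the moment it is written into HTML.
function escapeForHtml(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

$raw = 'http://www.goo gle.com'; // contains a space

// Sanitizing first strips the space, so a different string gets validated.
$sanitized = filter_var($raw, FILTER_SANITIZE_URL);
var_dump(isValidUrl($sanitized)); // bool(true)

// Validating the raw input rejects it.
var_dump(isValidUrl($raw)); // bool(false)

// Escaping happens only when the value becomes output.
echo escapeForHtml($raw), PHP_EOL;
```

The sanitized string is no longer what the user sent, which is why the check on it says little about the original input.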

Second, FILTER_VALIDATE_URL validates URLs based on RFC 2396. That RFC does not require any specific scheme, though it does give several examples (e.g., HTTP:, GOPHER:, MAILTO:, etc.). The PHP manual on the validate filters explicitly states:

Beware a valid URL may not specify the HTTP protocol http:// so further validation may be required to determine the URL uses an expected protocol, e.g. ssh:// or mailto:.
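One possible way to do that further validation is to check the parsed scheme against a list of schemes you expect. This is only a sketch: the function name hasAllowedScheme() and the default list of http and https are assumptions for illustration.

```php
<?php
// Hypothetical helper: accepts a URL only if it passes FILTER_VALIDATE_URL
// and its scheme appears in an explicit list of expected schemes.
function hasAllowedScheme(string $url, array $allowedSchemes = ['http', 'https']): bool
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // parse_url() returns the scheme without the trailing "://".
    $scheme = parse_url($url, PHP_URL_SCHEME);

    return is_string($scheme)
        && in_array(strtolower($scheme), $allowedSchemes, true);
}

var_dump(hasAllowedScheme('http://www.google.com'));  // bool(true)
var_dump(hasAllowedScheme('h://www.google.com'));     // bool(false)
var_dump(hasAllowedScheme('ht1tp://www.google.com')); // bool(false)
```

Schemes such as h and ht1tp are syntactically fine under the RFC, so they are rejected here only because they are not in the expected list.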

Also, the RFC does not define the structure of domain names or expect any specific top-level domains, so the validate filter does not check those. Domain names are formally assigned by registrars following ICANN rules, but you are free to configure your own local DNS server to create any entries you want, including TLD-only entries. Thus any domain name may be valid, whether it passes the validation filter or not.

The most secure way to validate well-defined data is to whitelist it. If you really want to make sure that nobody is passing you "ht tp:com.google.xssHackHere", then you will need to do further checking on your own. If you want to validate domain names as well as the scheme, be aware that there are now several hundred valid TLDs, and not all of them are easily expressed in ASCII characters.
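A minimal sketch of such a whitelist follows, assuming the set of acceptable destinations is known in advance. The constants ALLOWED_SCHEMES and ALLOWED_HOSTS, the host names in them, and the function isAllowedUrl() are all placeholders for illustration.

```php
<?php
// Hypothetical allowlists; replace with the schemes and hosts you actually expect.
const ALLOWED_SCHEMES = ['https'];
const ALLOWED_HOSTS = ['www.google.com', 'api.example.com'];

// Hypothetical helper: true only for URLs whose scheme and host are both listed.
function isAllowedUrl(string $url): bool
{
    // Reject anything that is not even syntactically a URL.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return false;
    }

    // Exact, case-insensitive match on both scheme and host.
    return in_array(strtolower($parts['scheme']), ALLOWED_SCHEMES, true)
        && in_array(strtolower($parts['host']), ALLOWED_HOSTS, true);
}

var_dump(isAllowedUrl('https://www.google.com/search?q=php')); // bool(true)
var_dump(isAllowedUrl('http://www.google.com'));               // bool(false)
var_dump(isAllowedUrl('https://www.google.343'));              // bool(false)
var_dump(isAllowedUrl('ht tp:com.google.xssHackHere'));        // bool(false)
```

Matching the whole host exactly avoids having to reason about TLDs at all, at the cost of maintaining the list by hand.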

Answer 1 (4, accepted), 2015-10-16 16:06:23
