
Most efficient way to check a URL

I'm trying to check whether a user-submitted URL is valid before it goes into the database, which happens as soon as the user hits submit. So far, I have:

$string = $_POST['url'];
if (strpos($string, 'www.') && (strpos($string, '/')))
{
    echo 'Good';
}

The submitted URL should point to a page inside a directory, not the main site, e.g. http://www.address.com/page. How can I check for the slash after the domain, so that the slashes in http:// are not counted and the .com part is not treated as the page?

Sample input:

 Valid:
     http://www.facebook.com/pageName
     http://www.facebook.com/pageName/page.html
     http://www.facebook.com/pageName/page.*

Invalid:
     http://www.facebook.com
     facebook.com/pageName
     facebook.com

See the parse_url() function. This will give you the "/page" part of the URL in a separate string, which you can then analyze as desired.

if(!parse_url('http://www.address.com/page', PHP_URL_PATH)) {
    echo 'no path found';
}

See parse_url reference.
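As a minimal sketch of how the parse_url() idea could cover the sample inputs from the question: the helper name hasPagePath() is hypothetical, and the rule "scheme and host present, path longer than a bare slash" is an assumption about what the asker wants, not something the answer specifies.

```php
<?php
// Hypothetical helper: true only for absolute http(s) URLs with a non-root path.
function hasPagePath(string $url): bool
{
    $parts = parse_url($url);
    if ($parts === false) {
        return false; // seriously malformed URL
    }
    // Require a scheme and a host, so "facebook.com/pageName" is rejected.
    if (!isset($parts['scheme'], $parts['host'])) {
        return false;
    }
    if (!in_array(strtolower($parts['scheme']), ['http', 'https'], true)) {
        return false;
    }
    // Require a path beyond the bare "/" of the site root.
    $path = $parts['path'] ?? '';
    return trim($path, '/') !== '';
}

var_dump(hasPagePath('http://www.facebook.com/pageName')); // bool(true)
var_dump(hasPagePath('http://www.facebook.com'));          // bool(false)
var_dump(hasPagePath('facebook.com/pageName'));            // bool(false)
```

Without a scheme, parse_url() puts the whole string into the path component, which is why the explicit scheme and host check is needed.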

filter_var($url, FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED)

More information here:

http://ca.php.net/filter_var
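A short sketch of that call applied to some of the sample inputs from the question. The $samples array and the output format are illustrative only.

```php
<?php
// Sample inputs taken from the question.
$samples = [
    'http://www.facebook.com/pageName',
    'http://www.facebook.com',
    'facebook.com/pageName',
];

foreach ($samples as $url) {
    // filter_var() returns the URL string on success and false on failure.
    $ok = filter_var($url, FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED) !== false;
    echo $url, ' => ', $ok ? 'valid' : 'invalid', PHP_EOL;
}
```

Note that a root URL with a trailing slash, such as http://www.facebook.com/, still passes, because "/" counts as a path. If the site root must be rejected, an extra check on the path is needed.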

Maybe strrpos will help you. It locates the last occurrence of a substring within a string.
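One way this hint might be applied, as a rough sketch: compare the position of the last slash with the end of the "://" separator. The function name hasSlashAfterHost() is hypothetical, and this only checks slash positions, not whether the URL is otherwise well formed.

```php
<?php
// Hypothetical helper: true if a "/" appears after the "//" that follows the scheme.
function hasSlashAfterHost(string $url): bool
{
    $schemeEnd = strpos($url, '://');
    if ($schemeEnd === false) {
        return false; // no scheme, e.g. "facebook.com/pageName"
    }
    $lastSlash = strrpos($url, '/');
    // The second slash of "://" sits at $schemeEnd + 2; a later slash starts the path.
    return $lastSlash > $schemeEnd + 2;
}

var_dump(hasSlashAfterHost('http://www.facebook.com/pageName')); // bool(true)
var_dump(hasSlashAfterHost('http://www.facebook.com'));          // bool(false)
```

A root URL with a trailing slash, such as http://www.facebook.com/, would also return true here, so this check alone does not distinguish the main site from a page.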

To check the format of the URL you could use a regular expression:

preg_match [ http://php.net/manual/en/function.preg-match.php ] is a good start, but some knowledge of regular expressions is needed to make it work.

Additionally, if you actually want to check that it's a valid URL, you could check the URL value to see if it actually resolves to a web page:

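function check_404($url) {
    // get_headers() returns false on failure; @ suppresses the warning
    $headers = @get_headers($url);

    if ($headers === false) {
        return false; // request failed, so the URL cannot be confirmed
    }

    // $headers[0] is the status line, e.g. "HTTP/1.1 404 Not Found"
    return strpos($headers[0], ' 404 ') === false;
}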

Try using a regular expression to check that the URL has the correct structure. Here's more reading on this. You need to learn how PCRE works.

A simple example for what you want (disclaimer: not tested, incomplete).

function isValidUrl($url) {
    // http://, then a host with no slashes, then a slash and at least one more character
    return preg_match('#^http://[^/]+/.+#', $url) === 1;
}

From here: http://www.blog.highub.com/regular-expression/php-regex-regular-expression/php-regex-validating-a-url/

<?php
/**
 * Validate URL
 * Allows for port, path and query string validations
 * @param    string      $url       string containing url user input
 * @return   boolean     Returns TRUE/FALSE
 */
function validateURL($url)
{
    $pattern = '/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-F\d]{2,2})+(:([\d\w]|%[a-fA-F\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-F\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)?$/';
    // preg_match() returns 1 on a match, so compare to get a real boolean
    return preg_match($pattern, $url) === 1;
}

$result = validateURL('http://www.google.com');
print $result;
?>
