
Parsing a URL parameter with a regular expression

I'm having trouble retrieving a URL parameter from a string using regular expressions:

An example string could be

some text and http://google.com/?something=this&tag=yahoo.com and more text

and I would like to be able to extract yahoo.com from it.

The caveat is that I need to ensure the URL begins with http://google.com , rather than just searching for &tag=(.*)

preg_match("/google\\.com\\/.*&tag=(.*) $/", $subject, $matches)

I'm hoping this matches google.com followed by anything, followed by &tag= and its value, ending at a space. Ultimately, the goal is to parse out all of the tag= values from google.com URLs.

Is there a better way to accomplish this?

Update:

So I have this new regex: /google\\.com\\/.*(tag=.*)/ but I'm not sure how to make the match end at the space after the URL.

Get friendly with the parse_url() function!

$pieces = parse_url('some text http://google.com/?something=this&tag=yahoo.com and whatever');

// Decode the query string into an associative array.
parse_str($pieces['query'], $get);

// The input is free text, so cut each value at the first space.
array_walk($get, function (&$item) {
    $sp = strpos($item, ' ');
    if ($sp === false) return;
    $item = substr($item, 0, $sp);
});

var_dump($get); // woo!

Edit: thanks to Johnathan for the parse_str() function.
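
The snippet above does not check the host, which was the caveat in the question. One possible way to add that check is to pull each URL out of the text first and then run parse_url() and parse_str() on it. This is only a sketch: the function name extractGoogleTags is hypothetical, and it assumes the host must be exactly google.com (so www.google.com is skipped) and that URLs are delimited by whitespace.

```php
<?php
// Hypothetical helper: collect tag= values from google.com URLs found in free text.
function extractGoogleTags(string $text): array
{
    $tags = [];
    // Grab every run of non-whitespace that starts with http:// or https://.
    preg_match_all('#https?://\S+#i', $text, $urls);
    foreach ($urls[0] as $url) {
        $parts = parse_url($url);
        // Skip malformed URLs and anything not hosted on google.com (assumption: exact host match).
        if ($parts === false || strtolower($parts['host'] ?? '') !== 'google.com') {
            continue;
        }
        // Decode the query string into an associative array.
        parse_str($parts['query'] ?? '', $params);
        if (isset($params['tag']) && is_string($params['tag'])) {
            $tags[] = $params['tag'];
        }
    }
    return $tags;
}

$text = 'some text and http://google.com/?something=this&tag=yahoo.com and more text';
print_r(extractGoogleTags($text));
```

Because the URL is cut at whitespace, punctuation that directly follows a URL in the text (a trailing comma, for example) would end up in the last parameter's value and may need trimming.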

If you want to get the value of tag then the following regex will do the job:

$string = 'some text and http://google.com/?something=this&tag=yahoo.com
and more text
http://google.com/?something=this&tag=yahoo2.com&param=test
';
preg_match_all('#http://google.com\S+&tag=([^\s&]+)#', $string, $m);
print_r($m[1]);

Output

Array
(
    [0] => yahoo.com
    [1] => yahoo2.com
)

Explanation

  • http://google.com : match http://google.com
  • \S+ : match non-whitespace one or more times
  • &tag= : match &tag=
  • ([^\s&]+) : match anything except whitespace and & one or more times and group it

If you want, you can also add s? after http to account for https , or add the i modifier to match case-insensitively.
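
A sketch of those two tweaks applied to the pattern, under a few assumptions of my own that are not in the answer above: the dot in google.com is escaped so it only matches a literal dot, and [?&] is used in front of tag= so that tag may also be the first query parameter. The sample string is adapted so the second URL exercises both https and the i modifier.

```php
<?php
// https? accepts http and https; the i modifier makes the match case-insensitive.
// \S*? lazily skips any earlier parameters; [?&] lets tag be first or later.
$pattern = '#https?://google\.com/\S*?[?&]tag=([^\s&]+)#i';

$string = "some text and http://google.com/?something=this&tag=yahoo.com\n"
        . "and more text\n"
        . "HTTPS://google.com/?tag=yahoo2.com&param=test\n";

preg_match_all($pattern, $string, $m);
print_r($m[1]); // contains yahoo.com and yahoo2.com
```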
