PHP Regular expression: Get all urls with question mark

Question

I have this regular expression:

preg_match_all("/<a\s.*?href\s*=\s*['|\"](.*?)(?=#|\"|')/si", $data, $matches);

It finds all URLs and works fine, but how can I modify it to find ONLY URLs that contain a question mark?

Example:

<a href="http://site.com/index.php">0</a><a href="http://site.com/index.php?id=1">1</a><a href="http://site.com/calc/index.php?id=1&scheme=Venus">2</a><a href="http://site.com/catalogue/data.php">3</a>

For this input, preg_match_all should return:

http://site.com/index.php?id=1

http://site.com/calc/index.php?id=1&scheme=Venus

Answer 1

preg_match_all("@<a\s*href\s*=[\'\"]([^\'\"]+\?[^\'\"]+)[\'\"]@si", $data, $matches);

Try this.

Answer 2

Don't try to make everything happen in one regex. Use your existing method, and then separately check the URL that you get back to see if it has a question mark in it.
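A minimal sketch of that two-step approach, assuming the sample markup from the question is in $data. The $withQuery name is made up for illustration, and the pattern is the question's own:

```php
<?php
// Sample input taken from the question.
$data = '<a href="http://site.com/index.php">0</a>'
      . '<a href="http://site.com/index.php?id=1">1</a>'
      . '<a href="http://site.com/calc/index.php?id=1&scheme=Venus">2</a>'
      . '<a href="http://site.com/catalogue/data.php">3</a>';

// Step 1: the question's original pattern collects every href value.
preg_match_all("/<a\s.*?href\s*=\s*['|\"](.*?)(?=#|\"|')/si", $data, $matches);

// Step 2: keep only the URLs that contain a question mark.
// array_values() reindexes the result from 0.
$withQuery = array_values(array_filter(
    $matches[1],
    function ($url) {
        return strpos($url, '?') !== false;
    }
));

print_r($withQuery);
```

For the sample input this should leave the two URLs the question asks for. Keeping the filter separate means the pattern itself does not have to change.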

That said, don't use regular expressions to parse HTML. You cannot reliably parse HTML with regular expressions, and you will face sorrow and frustration down the road. As soon as the HTML changes from your expectations, your code will be broken. See http://htmlparsing.com/php for examples of how to properly parse HTML with PHP modules that have already been written, tested and debugged.
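As a rough sketch of the parser route, here is one way it could look with PHP's bundled DOM extension. This is only one of several possible modules and not necessarily the one the linked page recommends. The $urls name is an assumption, and the sample markup is the question's:

```php
<?php
// Sample input taken from the question.
$data = '<a href="http://site.com/index.php">0</a>'
      . '<a href="http://site.com/index.php?id=1">1</a>'
      . '<a href="http://site.com/calc/index.php?id=1&scheme=Venus">2</a>'
      . '<a href="http://site.com/catalogue/data.php">3</a>';

// Real-world markup is often imperfect, so collect libxml warnings
// internally instead of letting them surface as PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($data);
libxml_clear_errors();

$urls = [];
foreach ($doc->getElementsByTagName('a') as $link) {
    $href = $link->getAttribute('href');
    // Keep only links whose href contains a question mark.
    if (strpos($href, '?') !== false) {
        $urls[] = $href;
    }
}

print_r($urls);
```

The parser deals with attribute order, quoting style and whitespace, so the only logic left in your code is the question-mark check.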

Answer 3

Andy Lester's answer tells you the right thing to do.

Here's your regex though:

<a\s.*?href\s*=\s*['|\"](.*?\?.*?)(?=#|\"|')

as seen here:

http://rubular.com/r/LHi11VMMR9
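One caveat worth checking before relying on this pattern: when several links sit on one line (or the s flag is set), the lazy .*? before \? can run past the closing quote of a link that has no question mark and into the next link. The sketch below compares it with a hypothetical variant that uses negated character classes. That variant is an assumption for illustration, not part of the original answer:

```php
<?php
// Sample input taken from the question (all links on one line).
$data = '<a href="http://site.com/index.php">0</a>'
      . '<a href="http://site.com/index.php?id=1">1</a>'
      . '<a href="http://site.com/calc/index.php?id=1&scheme=Venus">2</a>'
      . '<a href="http://site.com/catalogue/data.php">3</a>';

// Pattern from this answer: the lazy .*? keeps expanding until it finds
// a question mark, even if that means crossing a closing quote.
preg_match_all("/<a\s.*?href\s*=\s*['|\"](.*?\?.*?)(?=#|\"|')/si", $data, $lazy);

// Hypothetical variant: [^'\"#] cannot cross a quote or a #, so the
// question mark has to be inside the same attribute value.
preg_match_all("/<a\s[^>]*?href\s*=\s*['\"]([^'\"#]*\?[^'\"#]*)/si", $data, $strict);

print_r($lazy[1]);   // first capture spills over into the second link
print_r($strict[1]); // only the two URLs that contain a question mark
```

On the sample input, the first capture of the lazy pattern starts at the first link and ends inside the second one, while the stricter variant returns just the two expected URLs.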
