preg_match to match src=, background= and url(..)
I want to find a regular expression that finds (in a given piece of HTML) images referenced in any of the following forms:
src=""
src=''
background=""
background=''
url("")
url('')
url()
So far I have come up with:
preg_match_all("/src=((\"|'|)?(.*\.(png|gif|jpg))(\"|'|))/Ui", $strHTML, $arrMatches);
preg_match_all("/background=((\"|'|)?(.*\.(png|gif|jpg))(\"|'|))/Ui", $strHTML, $arrMatches);
preg_match_all("/url\((\"|'|)?((.*\.(png|gif|jpg))(\"|'|))\)/Ui", $strHTML, $arrMatches);
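But these are not complete yet, because they do not include the prefix (src / background / url). Also, from a security point of view, I think they can be improved further to prevent someone from entering src="http://somesite.com/someurl.exe?ext=jpg"
Any help in the right direction would be appreciated.
Edit:
I think I've got it, although the code can surely be improved, or even combined and/or optimized :)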
/* match CSS url() links */
preg_match_all("/(url\((\"|'|)(.*\.(png|gif|jpg|jpeg))(\"|'|)\))/Ui", $strHTML, $arrMatches);
Array
(
[0] => Array
(
[0] => url('test1.gif')
[1] => url(test2.gif)
[2] => url("test3.gif")
)
[1] => Array
(
[0] => url('test1.gif')
[1] => url(test2.gif)
[2] => url("test3.gif")
)
[2] => Array
(
[0] => '
[1] =>
[2] => "
)
[3] => Array
(
[0] => test1.gif
[1] => test2.gif
[2] => test3.gif
)
[4] => Array
(
[0] => gif
[1] => gif
[2] => gif
)
[5] => Array
(
[0] => '
[1] =>
[2] => "
)
)
/* match img links */
preg_match_all("/(src=(\"|'|)(.*\.(png|gif|jpg|jpeg))(\"|'|))/Ui", $strHTML, $arrMatches);
/* match background links */
preg_match_all("/(background=(\"|'|)(.*\.(png|gif|jpg|jpeg))(\"|'|))/Ui", $strHTML, $arrMatches);
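The separate calls above could probably be folded into one pattern. What follows is only a sketch of that idea, not a drop-in replacement: the function name findImageRefs and the sample HTML are made up for illustration. It captures the prefix in a named group, makes the closing quote match the opening one through a backreference, and returns one entry per match via PREG_SET_ORDER.

```php
<?php
/**
 * Collect image references from src=, background= and url() in one pass.
 * Returns a list of ['prefix' => ..., 'image' => ...] pairs.
 * (findImageRefs is a hypothetical helper name.)
 */
function findImageRefs(string $html): array
{
    $pattern = '~
        (?<prefix> \b(?:src|background)\s*=\s* | \burl\(\s* )  # src=, background= or url(
        (?<quote> ["\']? )                                      # optional opening quote
        (?<image> [^"\'()\s>]+? \. (?:png|gif|jpe?g) )          # value ending in an image extension
        \k<quote>                                               # the same quote, or nothing, closes it
        (?= [\s>)/] | \z )                                      # the value has to end here
    ~xi';

    $refs = [];
    if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $refs[] = ['prefix' => $m['prefix'], 'image' => $m['image']];
        }
    }
    return $refs;
}

// Made-up sample input covering the quoting styles from the question.
$strHTML = <<<'HTML'
<img src="a.png"> <img src='b.gif'> <img src=c.jpg>
<td background="d.jpeg"></td>
<div style="background-image: url('e.png')"></div>
<div style="background-image: url(f.gif)"></div>
HTML;

foreach (findImageRefs($strHTML) as $ref) {
    echo $ref['prefix'], ' ', $ref['image'], "\n";
}
```

Because the quote group is written as ["']? inside the group rather than as an optional group, it always takes part in the match, so the backreference also works for unquoted values. A value such as a.png?v=2 is deliberately not matched by this sketch, since the extension has to be the last thing before the closing quote.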
If you are sure about those attribute names (src, url and background)...
$arr = array(
'url("http://somesite.com/someurl.exe?src=jpg")',
'url(http://somesite.com/someurl.exe?src=jpg)',
'src="http://somesite.com/someurl.exe?src=jpg"',
'src="http://somesite.com/someurl.exe?ext=jpg"',
'background="http://somesite.com/someurl.exe?src=jpg"'
);
foreach ($arr as $str) {
preg_match_all('/(?<=src=|background=|url\()(\'|")?(?<image>.*?)(?=\1|\))/i',$str,$matches);
echo $str;
foreach($matches['image'] as $img) {
echo "\nimage: <b>$img</b>\n";
}
echo "\n";
}
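The pattern above returns whatever sits between the delimiters, so a URL like someurl.exe?ext=jpg still comes back as an "image". One possible way to deal with the concern raised in the question is to check the extension of the URL's path only, ignoring the query string. The sketch below does that with parse_url() and pathinfo(); the helper name hasImageExtension, the allowed-extension list, and the second sample URL are assumptions for illustration.

```php
<?php
/**
 * Return true only if the path component of $url ends in an allowed
 * image extension. Query string and fragment are ignored on purpose.
 * (hasImageExtension is a hypothetical helper name.)
 */
function hasImageExtension(string $url, array $allowed = ['png', 'gif', 'jpg', 'jpeg']): bool
{
    // parse_url() returns null when there is no path and false on badly malformed input.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        return false;
    }
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return in_array($ext, $allowed, true);
}

$urls = [
    'http://somesite.com/someurl.exe?ext=jpg',  // from the question
    'http://somesite.com/images/test1.gif',     // made-up example
];
foreach ($urls as $url) {
    echo $url, ' => ', hasImageExtension($url) ? 'accepted' : 'rejected', "\n";
}
```

This only filters by file name. A server can return any content under a .jpg path, so the check narrows what gets through but does not prove that the target is an image.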