Replace all URLs in a string not matching a URL pattern in PHP
I'm using the following code to filter out URLs from a block of HTML text in PHP.
preg_replace('#<a(?![^>]+?href="?http://keepthisdomain.com/foo/bar"?).*?>(.*?)</a>#i', '\1', $text);
It's intended to replace all links whose URL does not match the specified URL pattern. However, I also want to keep every tag that has the attribute rel="shadowbox[a]" set.
How can I modify this preg_replace to do that?
You are better off not using regex at all and using a parser instead, for the reasons set forth in this answer.
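As a rough illustration of the parser route, here is a minimal sketch using PHP's bundled DOM extension. The function name unwrapUnwantedLinks and its parameters are hypothetical, and it assumes that "matching" means the href starts with the kept URL or the rel attribute is exactly the kept value, which is one reasonable reading of the question rather than the only one.

```php
<?php
// Hypothetical helper (not from the original post): unwrap every <a> unless
// its href starts with $keepHref or its rel attribute equals $keepRel.
function unwrapUnwantedLinks(string $html, string $keepHref, string $keepRel): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // tolerate sloppy real-world markup
    // Wrap the fragment in a div so its contents can be extracted afterwards;
    // the XML encoding hint asks libxml to read the input as UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();

    // Copy the live node list first, because the tree is modified in the loop.
    $links = iterator_to_array($doc->getElementsByTagName('a'));
    foreach ($links as $a) {
        $href = $a->getAttribute('href');
        $rel  = $a->getAttribute('rel');
        if (strpos($href, $keepHref) === 0 || $rel === $keepRel) {
            continue; // leave this link alone
        }
        // Move the link's children in front of it, then drop the empty <a>.
        while ($a->firstChild) {
            $a->parentNode->insertBefore($a->firstChild, $a);
        }
        $a->parentNode->removeChild($a);
    }

    // The wrapper is the first div in document order; serialize its children.
    $wrap = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo unwrapUnwantedLinks(
    '<a href="http://rejectthis.com/foo/bar">foo</a>',
    'http://keepthisdomain.com/foo/bar',
    'shadowbox[a]'
), "\n";
```

Unlike the regex, this does not depend on attribute order or quoting style, at the cost of the serializer possibly normalizing the surrounding markup.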
That said, you can do it with regex, but it's tricky:
preg_replace('#<a(?![^>]+?\bhref="?http://keepthisdomain\.com/foo/bar"?|[^>]+\brel="shadowbox\[a\]").*?>(.*?)</a>#i', '\1', $text);
Details on the regex:
<a(?![^>]+?\bhref="?http://keepthisdomain\.com/foo/bar"?|[^>]+\brel="shadowbox\[a\]").*?>(.*?)</a>
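To make the pieces easier to follow, the same pattern can be spelled out with PCRE's extended (x) flag, which ignores whitespace and allows comments. This is only an annotated restatement; the ~ delimiter is a substitution made here because # begins a comment in extended mode.

```php
<?php
// Annotated version of the pattern; behavior is intended to be identical.
$pattern = '~
    <a                  # start of an anchor tag
    (?!                 # negative lookahead: skip this tag if either branch matches
        [^>]+?\bhref="?http://keepthisdomain\.com/foo/bar"?   # href pointing at the kept URL
      |
        [^>]+\brel="shadowbox\[a\]"                           # rel="shadowbox[a]" anywhere in the tag
    )
    .*?>                # remainder of the opening tag
    (.*?)               # group 1: the link contents
    </a>                # closing tag
~ix';

// A link that matches neither branch is reduced to its contents.
$result = preg_replace($pattern, '\1', '<a href="http://rejectthis.com/foo/bar">foo</a>');
echo $result, "\n";
```

The lookahead is evaluated right after `<a`, so each branch has to scan forward with `[^>]+` to stay inside the same opening tag.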
Out of the following four tags, only the third would be replaced:
<a href="http://keepthisdomain.com/foo/bar">foo</a> // left alone
<a href="http://keepthisdomain.com/foo/bar" rel="shadowbox[a]">foo</a> // left alone
<a href="http://rejectthis.com/foo/bar">foo</a> // REPLACED
<a href="http://rejectthis.com/foo/bar" rel="shadowbox[a]">foo</a> // left alone
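A quick way to check the behavior is to run the pattern over those four tags. The script below is a small sketch; the $tags and $results variable names are just for illustration.

```php
<?php
$pattern = '#<a(?![^>]+?\bhref="?http://keepthisdomain\.com/foo/bar"?|[^>]+\brel="shadowbox\[a\]").*?>(.*?)</a>#i';

$tags = [
    '<a href="http://keepthisdomain.com/foo/bar">foo</a>',
    '<a href="http://keepthisdomain.com/foo/bar" rel="shadowbox[a]">foo</a>',
    '<a href="http://rejectthis.com/foo/bar">foo</a>',
    '<a href="http://rejectthis.com/foo/bar" rel="shadowbox[a]">foo</a>',
];

$results = [];
foreach ($tags as $tag) {
    // Each tag is processed on its own so the outcomes are easy to compare.
    $results[] = preg_replace($pattern, '\1', $tag);
    echo end($results), "\n";
}
// Only the third line of output is reduced to: foo
```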
Edited with a minor tweak to make it match a literal "." in ".com", using "\.".