
preg_replace_callback regex: match all URLs but skip images

I have this regex that matches all URLs, and it works well:

$regex = '@((https?://)([-\w]+\.[-\w\.]+)+\w(:\d+)?(/([-\w/_\.\,]*(\?\S+)?)?)*)@';
return preg_replace_callback($regex, 'replacing', $content);

I need to avoid matching URLs that appear inside src="***" and inside <a href="">***</a>. I want to keep that text as it is and replace only the other URLs.

I tried adding a negation to my regex:

$regex ='@((?!src="|?!>)(https?://)([-\w]+\.[-\w\.]+)+\w(:\d+)?(/([-\w/_\.\,]*(\?\S+)?)?)*)@';

The first negation is for a URL that is preceded by src=", and the second is for a URL that sits between <a href=""> and </a>.

Any ideas on how to make this work?

A good starting place is lib_autolink, which handles the <a> case and could easily be adapted for the <img> case. It is non-trivial, and perhaps impossible, to do this in a single regexp unless you can guarantee that the HTML is perfectly valid (no stray quotation marks in text, etc.).
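If the markup can be assumed reasonably well-formed, one common workaround is to let the regex match the things to skip first and hand them back unchanged. The sketch below is not lib_autolink and not the regex from the question: the function name replace_urls_outside_tags, the $replaceUrl callback, and the simplified URL pattern are all assumptions made for illustration. It skips whole <a>...</a> elements and any single tag (which covers src="..." on <img>), and replaces URLs everywhere else.

```php
<?php
/**
 * Replace URLs in $html while leaving HTML tags and <a>...</a> elements untouched.
 * $replaceUrl is a hypothetical callback that receives the matched URL as a string.
 */
function replace_urls_outside_tags(string $html, callable $replaceUrl): string
{
    // Simplified stand-in URL pattern (an assumption; swap in your own if it
    // never runs across '<' or '>').
    $url = 'https?://[^\s<>"\']+';

    // Alternatives are tried left to right at each position:
    //   skip: a whole <a ...>...</a> element, or any single tag such as <img src="...">
    //   url : a URL found anywhere else
    $regex = '@(?<skip><a\b[^>]*>.*?</a>|<[^>]+>)|(?<url>' . $url . ')@is';

    $result = preg_replace_callback(
        $regex,
        function (array $m) use ($replaceUrl): string {
            // The skip branch matched: return the text exactly as it was.
            if (($m['url'] ?? '') === '') {
                return $m[0];
            }
            return $replaceUrl($m['url']);
        },
        $html
    );

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}

$content = '<p>Visit http://example.com/page now. '
         . '<img src="http://example.com/pic.png"> '
         . '<a href="http://example.com/a">http://example.com/a</a></p>';

$result = replace_urls_outside_tags($content, function (string $url): string {
    return '<a href="' . $url . '">' . $url . '</a>';
});

echo $result, "\n";
```

With this input, only the bare http://example.com/page should be wrapped in a link; the <img> tag and the existing <a> element pass through unchanged. As the answer warns, this depends on the HTML being valid: a stray < or > inside an attribute value or in the text would confuse the skip branch.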
