Using regex in preg_replace to match an HTML href anchor tag

I'm trying to use preg_replace to replace

&lt;a href=&quot;WWW.ANYURL.COM&quot;&gt;DISPLAY_TEXT&lt;/a&gt;

with

<a href="WWW.ANYURL.COM">DISPLAY_TEXT</a>

Here is my code:

$string = htmlentities(mysql_real_escape_string($string1)); 
$newString = preg_replace('#&lt;a\ href=&quot;([^&]*)&quot;&gt;([^&]*)&lt;/a&gt;#','<a href="$1">$2</a>',$string);

If I run limited tests such as:

$newString = preg_replace('#&lt;a\ href#','TEST',$string);

then

&lt;a href=&quot;WWW.ANYURL.COM&quot;&gt;DISPLAY_TEXT&lt;/a&gt;

becomes

TEST=&quot;WWW.ANYURL.COM&quot;&gt;DISPLAY_TEXT&lt;/a&gt;

But if I try to get it to also match the "=", it acts as if it couldn't find a match, i.e.

$newString = preg_replace('#&lt;a\ href=#','TEST',$string);

returns the original unchanged:

&lt;a href=&quot;WWW.ANYURL.COM&quot;&gt;DISPLAY_TEXT&lt;/a&gt;

I've been going at this for a couple of hours; any help would be greatly appreciated.

EDIT: code in context

$title = clean_input($_POST['title']);
$story = clean_input($_POST['story']);

function clean_input($string)
{
    if (get_magic_quotes_gpc()) {
        $string = stripslashes($string);
    }
    $string = htmlentities(mysql_real_escape_string($string));
    $findValues = array("&lt;b&gt;", "&lt;/b&gt;");
    $newValues = array("<b>", "</b>");
    $newString = str_replace($findValues, $newValues, $string);
    $newString2 = preg_replace('#&lt;a\ href=&quot;([^&]*)&quot;&gt;([^&]*)&lt;/a&gt;#', '<a href="$1">$2</a>', $newString);
    return $newString2;
}

Sample $story = Lorem ipsum dolor sit amet, consectetur adipiscing elit. <a href="www.google.com">Google</a> Vivamus quis sem felis. Morbi vitae neque ac neque blandit malesuada lobortis sit amet justo. Donec convallis, nibh ut lacinia tempor, neque felis scelerisque nibh, at feugiat lectus erat in nulla. In et euismod nunc. <pernicious code></code> Pellentesque vitae ante orci, vitae ultrices neque. <a href="www.yahoo.com">Yahoo</a> In non nulla sapien, vestibulum faucibus metus. Fusce egestas viverra arcu, <b>ac</b> sagittis leo facilisis in. Nulla facilisi.

I want only a few tags, such as anchors with href and bold, to be allowed through as markup.

You don't need to manually replace anything. If this is your whole input string, use html_entity_decode() to turn the escapes back into < and >.


Again, your regex works as intended with the sample text.

Your problem is the premature mysql_real_escape_string() call. It adds backslashes in front of the " double quotes in your HTML, and that's why the back-conversion fails: after htmlentities() the text contains \&quot; instead of &quot;, and your regex is not prepared to find that.
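The effect can be reproduced with a small sketch. It assumes addslashes() as a stand-in for mysql_real_escape_string(), since the latter needs the old mysql extension (removed in PHP 7) and an open connection; both put a backslash in front of double quotes. The sample input is the anchor from the question.

```php
<?php
// Sample anchor from the question.
$input = '<a href="WWW.ANYURL.COM">DISPLAY_TEXT</a>';

// The pattern from the question, unchanged.
$pattern = '#&lt;a\ href=&quot;([^&]*)&quot;&gt;([^&]*)&lt;/a&gt;#';

// Assumption: addslashes() stands in for mysql_real_escape_string().
$escapedFirst = htmlentities(addslashes($input));

// Only entity-encode, with no database escaping beforehand.
$entitiesOnly = htmlentities($input);

// Print both strings to see the stray backslashes in the first one.
echo $escapedFirst, "\n";
echo $entitiesOnly, "\n";

// preg_match() returns 1 when the pattern matches and 0 when it does not.
var_dump(preg_match($pattern, $escapedFirst));
var_dump(preg_match($pattern, $entitiesOnly));
```

The first string carries a backslash between href= and &quot;, so the pattern only matches the second one.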

Avoid that. Get rid of the ugly clean_input() hack and magic_quotes, as advised by the manual. You must do the database escaping right before inserting into the database, not earlier. (Or better yet, use the easier PDO with prepared statements.)
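A minimal sketch of the prepared-statement route, under loud assumptions: it uses an in-memory SQLite database (which needs the pdo_sqlite extension) and a hypothetical stories table, because the question shows neither the real database nor its schema. The point is only that the raw input goes into the query through placeholders, with no manual escaping beforehand.

```php
<?php
// Assumption: in-memory SQLite and a made-up "stories" table, for illustration only.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE stories (title TEXT, story TEXT)');

// Raw user input; in the question these values come from $_POST.
$title = 'My "title"';
$story = 'Lorem ipsum <a href="www.google.com">Google</a>';

// Placeholders keep the data separate from the SQL text.
$stmt = $pdo->prepare('INSERT INTO stories (title, story) VALUES (:title, :story)');
$stmt->execute([':title' => $title, ':story' => $story]);

// Read the row back and compare it with what was submitted.
$row = $pdo->query('SELECT title, story FROM stories')->fetch(PDO::FETCH_ASSOC);
var_dump($row['story'] === $story);
```

With this arrangement the HTML handling (htmlentities() plus any tag whitelisting) can be done separately at output time, on text that contains no added backslashes.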

Also avoid the $newString123 variable duplicates; just overwrite the variable you already have when rewriting strings.
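Put together, the cleanup could look roughly like the sketch below. The function name allow_basic_tags() is hypothetical; the replacement patterns are the ones from the question, and the database escaping is deliberately left out because it belongs at insert time.

```php
<?php
// Hypothetical helper: escapes everything, then re-enables <b> and simple anchors.
function allow_basic_tags(string $text): string
{
    // Escape all markup first so unknown tags stay inert.
    $text = htmlentities($text, ENT_QUOTES, 'UTF-8');

    // Re-enable <b> and </b>, overwriting the same variable.
    $text = str_replace(['&lt;b&gt;', '&lt;/b&gt;'], ['<b>', '</b>'], $text);

    // Re-enable anchors of the exact form <a href="...">...</a>.
    $text = preg_replace(
        '#&lt;a\ href=&quot;([^&]*)&quot;&gt;([^&]*)&lt;/a&gt;#',
        '<a href="$1">$2</a>',
        $text
    );

    return $text;
}

// Shortened sample based on the $story text in the question.
$sample = '<b>ac</b> <a href="www.google.com">Google</a> <pernicious code></code>';
$clean = allow_basic_tags($sample);
echo $clean, "\n";
```

For this sample the bold tag and the anchor come back as markup, while <pernicious code></code> stays entity-encoded. Note that the sketch does not check what the href points to, so a stricter pattern for the URL may be worth considering.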

You could also do it like this:

$str = "&lt;a href=&quot;WWW.ANYURL.COM&quot;&gt;DISPLAY_TEXT&lt;/a&gt;";
echo "Your html code is thus: " . htmlspecialchars_decode($str);
