
(preg_replace) Regex to replace all &amp; in <a href="">

I somehow can't get this to work: I have a simple string, for example:

<p>Foo &amp; Bar</p> // <-- this should still be &amp;
<a href="http://test.com/?php=true&amp;test=test&amp;p=p"> // <- This string should only be affected and be changed to &
<div> Yes &uuml; No</div> // <-- This should still be &uuml;

<a href="http://mycoolpage.com/?page=1&amp;fun=true&amp;foo=bar&amp;yes=no">

Now I want to replace every &amp; inside the href attributes with a plain & using preg_replace. I tried to write a regex for this, but I can't get it to work.

This is how far I've come: it finds only the last &amp;, also matches the whole string before it, and fails to find the others. What am I doing wrong?

(?>=href\\=\\").*?(&amp;)(?=\\")

Edit: It is not possible to use html_entity_decode or htmlspecialchars_decode, as there is other code that would be affected.

Without knowing the PHP regex API in depth, the natural way I see is to match the string against the pattern repeatedly until there are no more matches; once the last &amp; has been replaced, nothing matches and the loop ends.

$str = "<p>Foo &amp; Bar</p> // <-- this should still be &amp;
    <a href=\"http://mycoolpage.com/?page=1&amp;fun=true&amp;foo=bar&amp;yes=no\">";
// [^\"]*? keeps the match inside one href value; .*? could run past the closing quote
$pattern = "/(href=\"[^\"]*?)(&amp;)/";

// Each pass replaces the first remaining &amp; inside an href;
// the loop stops once no href contains &amp; any more.
while (preg_match($pattern, $str, $matches, PREG_OFFSET_CAPTURE)) {
    $index = $matches[2][1]; // byte offset of the matched &amp;
    $str = substr_replace($str, "&", $index, strlen("&amp;"));
}

var_dump($str);

result:

<p>Foo &amp; Bar</p> // <-- this should still be &amp; <a href="http://mycoolpage.com/?page=1&fun=true&foo=bar&yes=no">
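
For comparison, the same result can probably be reached with a single preg_replace call and no loop. This is only a sketch, assuming the href values are always double-quoted; the variable names and the sample input are illustrative. \G resumes at the end of the previous match, and \K drops everything consumed so far from the match, so only the &amp; itself is replaced.

```php
<?php
// Illustrative input: only the &amp; inside href="..." should change.
$html = '<p>Foo &amp; Bar</p>' . "\n"
      . '<a href="http://test.com/?php=true&amp;test=test&amp;p=p">' . "\n"
      . '<div> Yes &uuml; No</div>';

// \bhref="    start inside a double-quoted href value, or
// \G(?!\A)    resume where the previous match ended (never at the string start)
// [^"]*?      stay inside the attribute value
// \K          discard what was matched so far, leaving only &amp; to replace
$pattern = '~(?:\bhref="|\G(?!\A))[^"]*?\K&amp;~';

$result = preg_replace($pattern, '&', $html);
echo $result, "\n";
```

Single-quoted href values would need a second alternative in the pattern; they are not handled here.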

This comment by Wiktor Stribiżew worked for me:

Or a harder way: http://ideone.com/ADku3b

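<?php
$s = '<a href="http://myurl.com/?page=1&amp;fun=true&amp;foo=bar&amp;yes=no">';
// Group 1: the <a tag up to and including href=
// Group 2: the attribute value (quoted, with its quotes, or an unquoted run)
// Group 3: the quote character; group 4: the rest of the tag up to >
echo preg_replace_callback('~(<a\b[^>]*href=)(([\'"]).*?\3|\S+)([^>]*>)~', function ($m) {
  // Decode entities in the href value only; the rest of the tag is left as-is
  return $m[1] . html_entity_decode($m[2]) . $m[4];
}, $s);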
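
Note that html_entity_decode decodes every entity in the attribute value, so something like &uuml; or &quot; inside the href would be converted as well. If only &amp; should change, one possible variant swaps in str_replace inside the callback. This is a sketch, assuming quoted href values; decode_href_amps is a hypothetical helper name and the sample string is made up for illustration.

```php
<?php
// Hypothetical helper: replace &amp; with & only inside quoted href values of <a> tags.
function decode_href_amps(string $html): string
{
    return preg_replace_callback(
        '~(<a\b[^>]*?\bhref=)(["\'])(.*?)\2~s',
        function (array $m): string {
            // $m[1] = tag up to href=, $m[2] = quote character, $m[3] = attribute value
            return $m[1] . $m[2] . str_replace('&amp;', '&', $m[3]) . $m[2];
        },
        $html
    ) ?? $html; // fall back to the input if the regex engine reports an error
}

// Illustrative input: other entities and text outside the href stay untouched.
$s = '<a href="http://myurl.com/?page=1&amp;fun=true&amp;name=M&uuml;ller">Foo &amp; Bar</a>';
echo decode_href_amps($s), "\n";
```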
