Regex to replace HTML tags between curly braces

I need to remove the HTML tags placed between curly braces. My code is below. Any help would be appreciated.

$string  = '<table><tr>Hello{<strong><br/>name<br/></strong>}</tr></table>';
echo preg_replace("/\{<.*?>\}/","",$string);

Required output is

<table><tr>Hello name</tr></table>

@Diksha Try this:

$string = '<table><tr><td>Hello {<strong><br/>name<br/></strong>}</td></tr></table>';

// Match each {<...>} block, strip its tags, then drop the curly braces.
echo preg_replace_callback("/\{<.*?>\}/", function ($m) {
    return preg_replace('/\{|\}/', "", strip_tags($m[0]));
}, $string);

You cannot do this with a simple regex alone, but you can use a regex to find the curly-brace blocks, as follows:

function process_paranthesis($match) {
  return strip_tags($match[1]);
}

$string  = '<table><tr>Hello { <strong>name</strong>}</tr></table>';
echo preg_replace_callback("/\{([^\}]*)\}/", "process_paranthesis",$string);

The regex was changed to find every {...} block, and we use preg_replace_callback(), which calls a function to compute the replacement string for each match. The callback's $match parameter holds the details of the match: $match[0] contains the whole matched text and $match[1] contains the text captured by the first pair of parentheses in the pattern. Inside the callback, strip_tags() removes all HTML tags. It is a built-in function and should be used instead of reinventing the wheel.
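
To make the difference between $match[0] and $match[1] concrete, here is a minimal sketch that runs the same pattern through preg_match(), which fills its match array the same way the callback receives it. The variable names $whole, $inner and $clean are only illustrative.

```php
<?php
// Same sample input as in the answer above.
$string = '<table><tr>Hello { <strong>name</strong>}</tr></table>';

// preg_match() stores the first match in $match.
preg_match('/\{([^\}]*)\}/', $string, $match);

$whole = $match[0];           // '{ <strong>name</strong>}'  (braces included)
$inner = $match[1];           // ' <strong>name</strong>'    (captured group only)
$clean = strip_tags($inner);  // ' name'

var_dump($whole, $inner, $clean);
```

Because the callback returns only the cleaned $match[1], the curly braces disappear from the result without any extra string handling.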

The regex is constructed as follows:

  1. A match starts with a { and ends with a }; both are escaped, so we write \{ ... \}.
  2. We want to process everything except the surrounding { and }, so we put parentheses inside: \{( ... )\}. The content within the curly braces is then available as $match[1], with no need to remove the braces using other string functions.
  3. We want to allow all characters between the { and } except the } itself; [^\}] matches any character but }, and to allow any number of them we add *, resulting in: [^\}]*
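
The three steps above can be assembled into a small reusable helper. This is only a sketch: the function name strip_tags_in_braces() and the sample input are made up for illustration and are not part of the original answer.

```php
<?php
// Hypothetical helper: removes HTML tags inside every {...} block
// and drops the surrounding curly braces.
function strip_tags_in_braces(string $html): string
{
    // \{ ... \}   literal curly braces            (step 1)
    // ( ... )     capture the content as $m[1]    (step 2)
    // [^\}]*      any run of characters except }  (step 3)
    return preg_replace_callback(
        '/\{([^\}]*)\}/',
        function (array $m): string {
            return strip_tags($m[1]);
        },
        $html
    );
}

echo strip_tags_in_braces('<p>{<b>one</b>} and {<i>two</i>}</p>');
// prints: <p>one and two</p>
```

Tags outside the curly braces (the <p> here) are left untouched, because the callback only ever sees the text inside each block.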

NOTE: .* is greedy. If we used .* instead of [^\}]*, we would get wrong results whenever the string contains more than one block of curly braces. The match would start at the first { and end at the last } in the string, swallowing all blocks and everything between them. For the input "Text {in first} something between {and second one} . And some more." the greedy pattern produces a single match, "{in first} something between {and second one}", whereas we want two separate matches: "{in first}" and "{and second one}".
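
The difference is easy to check with preg_match_all(). The following sketch uses a sample sentence like the one in the note; the variable names are only illustrative.

```php
<?php
$text = 'Text {in first} something between {and second one}. And some more.';

// Greedy: .* runs to the last } on the line, so both blocks merge into one match.
preg_match_all('/\{(.*)\}/', $text, $greedy);

// Negated class: [^\}]* stops at the first }, so each block matches separately.
preg_match_all('/\{([^\}]*)\}/', $text, $negated);

print_r($greedy[1]);   // one element:  'in first} something between {and second one'
print_r($negated[1]);  // two elements: 'in first', 'and second one'
```

A lazy quantifier, as in \{(.*?)\}, also yields two matches for this input. Note that . does not match a newline unless the s modifier is set, while [^\}] does.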
