Is there a (recursive) PCRE regexp in PHP to denest tags?

Question

I have an initial string with miscellaneous text between tags, and the string can contain nested tags. I wish to "de-nest" the string according to the following rules: 1) The final string differs from the initial one only by added or deleted tags. 2) In the final string, every piece of text is enclosed by the nearest pair of tags that enclosed it in the original string. If several pairs are equally near, the result is unspecified, but 3) no piece of text is attributed new tags in the final string.

Thus,

[a]text1[/a]text2[b]text3[c]text4[/c]text5[/b]
[e]text6[f]text7[/e]text8[/f]

should become

[a]text1[/a]text2[b]text3[/b][c]text4[/c][b]text5[/b]
[e]text6[/e]...[f]text8[/f]

where ... might be any of text7, [e]text7[/e] or [f]text7[/f].

Is there a regexp (for example, a recursive PCRE regexp in PHP) that does this?

Answer 1

Method

Execute 3 replacements:

1. Search for a closing tag followed by text and another closing tag ==> insert an opening tag for the second, in front of the text. Example:
```
[/b]text[/c] ==> [/b][c]text[/c]
```
2. Search for an opening tag whose text is followed by any tag other than its own closing tag ==> insert the missing closing tag. Examples:
```
[a]text[b] ==> [a]text[/a][b]
[a]text[/b] ==> [a]text[/a][/b]
```
3. (A fix to 2.) Search for 2 consecutive closing tags, such as the pair left behind by the second example of step 2 ==> remove the second. Example:
```
[a]text[/a][/b] ==> [a]text[/a]
```
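The three steps above can be traced one at a time. The sketch below is an illustration, not part of the original answer: it applies the answer's three pattern/replacement pairs in sequence to the second sample line and keeps every intermediate string. The `$steps` and `$trace` names are assumptions made for this example.

```php
<?php
// Pattern/replacement pairs taken from the answer, one per step.
$steps = array(
    // 1. closing tag, text, another closing tag: reopen the second tag before the text
    array('#(\[/\w++])([^[]++\[/(\w++)])#', '\1[\3]\2'),
    // 2. opening tag plus text not followed by its own closing tag: append that closing tag
    array('#\[(\w++)][^[]*+(?!\[/\1)#', '\0[/\1]'),
    // 3. two consecutive closing tags: keep only the first
    array('#(\[/(\w++)])\[/\w++]#', '\1'),
);

$string = "[e]text6[f]text7[/e]text8[/f]";
$trace = array($string); // $trace[0] is the input, $trace[n] the string after step n

foreach ($steps as $step) {
    list($pattern, $replacement) = $step;
    $string = preg_replace($pattern, $replacement, $string);
    $trace[] = $string;
}

echo implode("\n", $trace), "\n";
```

For this input, step 1 gives `[e]text6[f]text7[/e][f]text8[/f]`, step 2 gives `[e]text6[/e][f]text7[/f][/e][f]text8[/f]`, and step 3 removes the stray `[/e]` to give `[e]text6[/e][f]text7[/f][f]text8[/f]`.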

Code

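// One pattern per step; preg_replace applies them in order to the whole string.
$patterns = array ('#(\[/\w++])([^[]++\[/(\w++)])#',  // 1. [/x]text[/y]
                   '#\[(\w++)][^[]*+(?!\[/\1)#',      // 2. [x]text not followed by [/x
                   '#(\[/(\w++)])\[/\w++]#');         // 3. [/x][/y]
$replace = array ('\1[\3]\2',  // reopen the second tag before the text
                  '\0[/\1]',   // append the missing closing tag
                  '\1');       // keep only the first closing tag

$string = "[a]text1[/a]text2[b]text3[c]text4[/c]text5[/b]\n[e]text6[f]text7[/e]text8[/f]";

$result = preg_replace($patterns, $replace, $string);

echo $result;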

Output

[a]text1[/a]text2[b]text3[/b][c]text4[/c][b]text5[/b]
[e]text6[/e][f]text7[/f][f]text8[/f]
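
One way to sanity-check this output against rule 1 of the question (only tags added or deleted) and against the goal of having no nested tags is sketched below. `removeBracketTags` and `isFlat` are hypothetical helpers introduced for this check, not part of the answer, and the sketch assumes tags always have the form `[name]` or `[/name]`.

```php
<?php
// Hypothetical helper: remove every [tag] and [/tag] so only the text remains.
function removeBracketTags($s) {
    return preg_replace('#\[/?\w+]#', '', $s);
}

// Hypothetical helper: true if tags strictly alternate between an opening tag
// and its own closing tag, i.e. nothing is nested, stray, or left open.
function isFlat($s) {
    preg_match_all('#\[(/?)(\w+)]#', $s, $tags, PREG_SET_ORDER);
    $open = null; // name of the currently open tag, if any
    foreach ($tags as $t) {
        if ($t[1] === '') {                      // opening tag
            if ($open !== null) return false;    // another tag is still open
            $open = $t[2];
        } else {                                 // closing tag
            if ($open !== $t[2]) return false;   // stray or mismatched close
            $open = null;
        }
    }
    return $open === null;
}

$input  = "[a]text1[/a]text2[b]text3[c]text4[/c]text5[/b]\n[e]text6[f]text7[/e]text8[/f]";
$output = "[a]text1[/a]text2[b]text3[/b][c]text4[/c][b]text5[/b]\n[e]text6[/e][f]text7[/f][f]text8[/f]";

var_dump(removeBracketTags($input) === removeBracketTags($output)); // same text?
var_dump(isFlat($input));
var_dump(isFlat($output));
```

With the strings from this answer, the three calls print `bool(true)`, `bool(false)` and `bool(true)`. Only the sample from the question is checked here, so this does not establish how the patterns behave on other inputs, such as deeper nesting.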

