I have an initial string containing miscellaneous text between tags, and the tags may be nested. I wish to "de-nest" the string according to the following rules:
1) The final string differs from the initial one only by the addition or deletion of tags.
2) In the final string, every piece of text is enclosed by the nearest pair of tags that enclosed it in the original string. If several pairs are equally near, the result is unspecified.
3) Even in that unspecified case, no piece of text gets attributed new tags in the final string.
Thus,
[a]text1[/a]text2[b]text3[c]text4[/c]text5[/b]
[e]text6[f]text7[/e]text8[/f]
should become
[a]text1[/a]text2[b]text3[/b][c]text4[/c][b]text5[/b]
[e]text6[/e]...[f]text8[/f]
where ... might be any of text7, [e]text7[/e] or [f]text7[/f].
Is there a regexp (for example, a recursive PCRE regexp in PHP) that does this?
Method
Execute 3 replacements, in this order:
1. Search for a closing tag followed by text and then another closing tag ==> insert an opening tag for the second one, right after the first. Example:
[/b]text[/c] ==> [/b][c]text[/c]
2. Search for an opening tag and its text followed by a tag which is not the closing tag corresponding to the opening tag just found ==> insert that closing tag. Examples:
[a]text[b] ==> [a]text[/a][b]
[a]text[/b] ==> [a]text[/a][/b]
3. (A fix to 2.) Search for 2 consecutive closing tags ==> remove the second. Example:
[a]text[/a][/b] ==> [a]text[/a]
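Each step can be tried on its own before combining them. The sketch below runs one preg_replace call per example. The variable names $step1 to $step3 are only illustrative. Two details are assumptions of this sketch rather than part of the method as stated. First, the lookahead in step 2 ends with "]", so that a tag named a is not taken for a prefix of ab. Second, the first step-2 fragment is completed with "more[/b]", because a bare trailing [b] would itself be seen as an opening tag without its closing tag.

```php
<?php
// One pattern per step of the method.
$step1 = '#(\[/\w++])([^[]++\[/(\w++)])#'; // closing tag, text, closing tag
$step2 = '#\[(\w++)][^[]*+(?!\[/\1])#';    // opening tag + text, not followed by its own closing tag
$step3 = '#(\[/(\w++)])\[/\w++]#';         // two adjacent closing tags

// Step 1: reopen the second tag in front of the text.
echo preg_replace($step1, '\1[\3]\2', '[/b]text[/c]'), "\n";      // [/b][c]text[/c]

// Step 2: append the missing closing tag.
echo preg_replace($step2, '\0[/\1]', '[a]text[b]more[/b]'), "\n"; // [a]text[/a][b]more[/b]
echo preg_replace($step2, '\0[/\1]', '[a]text[/b]'), "\n";        // [a]text[/a][/b]

// Step 3: drop the stray second closing tag left behind by step 2.
echo preg_replace($step3, '\1', '[a]text[/a][/b]'), "\n";         // [a]text[/a]
```

The possessive quantifiers (++ and *+) matter in step 2: without them the engine could backtrack into the text or the tag name until the negative lookahead succeeds, and a closing tag would be inserted in the middle of the text.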
Code
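// One pattern per step of the method; preg_replace applies them in order over the whole string.
$patterns = array(
    '#(\[/\w++])([^[]++\[/(\w++)])#', // 1: closing tag, text, closing tag
    '#\[(\w++)][^[]*+(?!\[/\1])#',    // 2: opening tag + text not followed by its own closing tag ("]" keeps a from matching ab)
    '#(\[/(\w++)])\[/\w++]#',         // 3: two adjacent closing tags
);
$replace = array(
    '\1[\3]\2', // 1: reopen the second tag in front of the text
    '\0[/\1]',  // 2: append the missing closing tag
    '\1',       // 3: keep only the first closing tag
);
$string = "[a]text1[/a]text2[b]text3[c]text4[/c]text5[/b]\n[e]text6[f]text7[/e]text8[/f]";
$result = preg_replace($patterns, $replace, $string);
echo $result;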
Output
[a]text1[/a]text2[b]text3[/b][c]text4[/c][b]text5[/b]
[e]text6[/e][f]text7[/f][f]text8[/f]
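The sample nests only two levels deep. With a third level, for instance [a]x[b]y[c]z[/c]w[/b]v[/a], a single run of the three replacements appears to leave v[/a] without its opening tag, because the first pattern has already consumed [/b] in the previous match and cannot reuse it. One possible workaround, sketched below, is to repeat the replacements until the string stops changing. The function name denest and the cap of 50 passes are hypothetical choices for illustration, the lookahead in the second pattern is terminated with "]" so that a is not confused with ab, and convergence has not been checked for arbitrary input.

```php
<?php
// Hypothetical helper, not part of the original answer: repeats the three
// replacements until a pass changes nothing (or the pass cap is reached).
function denest(string $input, int $maxPasses = 50): string
{
    $patterns = [
        '#(\[/\w++])([^[]++\[/(\w++)])#', // closing tag, text, closing tag
        '#\[(\w++)][^[]*+(?!\[/\1])#',    // opening tag + text without its closing tag
        '#(\[/(\w++)])\[/\w++]#',         // two adjacent closing tags
    ];
    $replace = ['\1[\3]\2', '\0[/\1]', '\1'];

    $current = $input;
    for ($i = 0; $i < $maxPasses; $i++) {
        $next = preg_replace($patterns, $replace, $current);
        if ($next === null || $next === $current) {
            break; // regex error, or nothing left to change
        }
        $current = $next;
    }
    return $current;
}

echo denest('[a]x[b]y[c]z[/c]w[/b]v[/a]'), "\n";
// [a]x[/a][b]y[/b][c]z[/c][b]w[/b][a]v[/a]
```

On the two-level sample string the second pass changes nothing, so the helper returns the same output as the single preg_replace call.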