I am using TinyMCE and it is converting all my attribute single quotes to double quotes on cleanup.
This is what I am putting into the editor.
<tr _excel-dimensions='{"row":{"rowHeight":50}}'>
<td _excel-styles='{"font":{"size":20,"color":{"rgb":"333333"},"bold":true},"fill":{"fillType":"solid","startColor":"F0F0F0"},"alignment":{"horizontal":"center"}}' colspan='6'>Affiliate Accounts</td>
</tr>
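And this is what the editor produces after saving it:
<tr _excel-dimensions="{&quot;row&quot;:{&quot;rowHeight&quot;:50}}">
<td _excel-styles="{&quot;font&quot;:{&quot;size&quot;:20,&quot;color&quot;:{&quot;rgb&quot;:&quot;333333&quot;},&quot;bold&quot;:true},&quot;fill&quot;:{&quot;fillType&quot;:&quot;solid&quot;,&quot;startColor&quot;:&quot;F0F0F0&quot;},&quot;alignment&quot;:{&quot;horizontal&quot;:&quot;center&quot;}}" colspan="6">Accounts</td>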
</tr>
There doesn't seem to be a way to override this setting in TinyMCE, so I am turning to a regex in PHP when saving the data to the database. This is what I have so far, but it doesn't seem to capture all the double quotes.
$content = preg_replace_callback('/<(.*)(\")(.*)(\")(.*)>/miU', function($matches) {
return "<" . $matches[1] . "'" . html_entity_decode($matches[3]) . "'" . $matches[5] . ">";
}, $content);
It replaces the quotes around the JSON-encoded string, but not the ones in colspan="6".
Thanks in advance for the help.
As I said in the comment, parsing HTML with regex is not a good idea; it is better to use a dedicated library such as PHP Simple HTML DOM Parser. However, it is possible to construct a regex that works on well-formed HTML.
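For comparison, here is a minimal sketch of the parser route. It swaps in PHP's built-in DOMDocument (not the Simple HTML DOM Parser named above) and assumes the DOM extension is enabled and that whatever consumes the stored HTML can read attributes through a parser. The sample markup and the variable names are illustrative only. The point it shows is that a parser hands back the decoded attribute value regardless of which quote style was saved:

```php
<?php
// Sample markup in the double-quoted, entity-encoded form the editor saves.
$html = '<table><tr _excel-dimensions="{&quot;row&quot;:{&quot;rowHeight&quot;:50}}">'
      . '<td colspan="6">Affiliate Accounts</td></tr></table>';

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$row = $doc->getElementsByTagName('tr')->item(0);

// getAttribute() returns the value with entities already decoded,
// so it can be passed straight to json_decode().
$dimensions = json_decode($row->getAttribute('_excel-dimensions'), true);
$rowHeight  = $dimensions['row']['rowHeight'];

echo $rowHeight, "\n";
```

If the stored markup really must use single quotes, this does not help on its own, because DOMDocument writes attributes back with double quotes when serializing.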
Our goal is to find all double-quoted strings inside a tag. First, let's forget about the requirement that the double-quoted string must be inside a tag. Then we can use this:
$content = preg_replace_callback('/"(.*?)"/',
function($matches) {
// Swap the surrounding quotes and turn &quot; back into " inside the value.
return "'" . html_entity_decode($matches[1]) . "'";
},
$content);
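To see why this first version is not enough, here is a small sketch that runs it on a made-up fragment with quoted text outside the tag. The function name swapAllDoubleQuotes and the sample string are hypothetical, chosen only for this illustration:

```php
<?php
// Step 1 only: replaces every double-quoted string, inside a tag or not.
function swapAllDoubleQuotes(string $html): string
{
    return preg_replace_callback('/"(.*?)"/', function ($m) {
        return "'" . html_entity_decode($m[1]) . "'";
    }, $html);
}

$sample = '<td colspan="6">He said "hi"</td>';
$result = swapAllDoubleQuotes($sample);

// The attribute is converted, but so is the quoted text in the cell body.
echo $result, "\n";   // <td colspan='6'>He said 'hi'</td>
```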
Now we need to add the check that the double-quoted string is inside a tag. To do this we construct a lookahead expression which checks the text between our double-quoted string and the end of the text:
Since the string is inside a tag, there must be a closing > there. It means that there must be some sequence of non-<, non-> characters followed by >. The corresponding regex is [^<>]*>
Then there can be any number of groups of characters, each containing a single pair of < and >. The regex for a group of characters containing a single tag is [^<]*<[^>]*>. We need to repeat this group any number of times: (?:[^<]*<[^>]*>)*
Finally, there can be only non-<, non-> characters till the end of the text: [^<>]*$
The resulting lookahead expression looks a bit terrifying, but does the work: (?=[^<>]*>(?:[^<]*<[^>]*>)*[^<>]*$).
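The lookahead can be tried on its own before combining it with the quote pattern. The sketch below attaches it to a bare double quote and records where it matches. The variable names and the sample string are made up for illustration:

```php
<?php
// The lookahead from the explanation above, kept as a reusable fragment.
$insideTag = '(?=[^<>]*>(?:[^<]*<[^>]*>)*[^<>]*$)';

$sample = '<td colspan="6">He said "hi"</td>';

// Find every double quote that the lookahead accepts as being inside a tag.
preg_match_all('/"' . $insideTag . '/', $sample, $m, PREG_OFFSET_CAPTURE);

// Each entry of $m[0] is [matched text, byte offset]; keep only the offsets.
$offsets = array_column($m[0], 1);

// Only the two quotes around 6 qualify; the quotes around hi are rejected
// because the next angle bracket after them is < rather than >.
print_r($offsets);   // offsets 12 and 14
```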
Finally, we incorporate this lookahead check into the original regex:
$content = preg_replace_callback('/"(?=[^<>]*>(?:[^<]*<[^>]*>)*[^<>]*$)(.*?)"/',
function($matches) {
// Only quotes that pass the inside-a-tag lookahead reach this callback.
return "'" . html_entity_decode($matches[1]) . "'";
},
$content);
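As a usage sketch, the same pattern can be wrapped in a helper and run on markup shaped like the editor output from the question. The helper name singleQuoteAttributes and the shortened sample are assumptions made for this example:

```php
<?php
// Convert double-quoted attribute values to single-quoted ones,
// leaving quoted text outside of tags untouched.
function singleQuoteAttributes(string $html): string
{
    $pattern = '/"(?=[^<>]*>(?:[^<]*<[^>]*>)*[^<>]*$)(.*?)"/';

    return preg_replace_callback($pattern, function ($m) {
        // Decode &quot; so the JSON inside the attribute is readable again.
        return "'" . html_entity_decode($m[1]) . "'";
    }, $html);
}

$saved = '<tr _excel-dimensions="{&quot;row&quot;:{&quot;rowHeight&quot;:50}}">' . "\n"
       . '<td colspan="6">Say "hi"</td>' . "\n"
       . '</tr>';

$fixed = singleQuoteAttributes($saved);
echo $fixed, "\n";
```

One caveat to keep in mind: if an attribute value itself contains an apostrophe (literal or as an entity that html_entity_decode turns into one), wrapping it in single quotes would produce broken markup, so such values would need extra escaping.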
You can check it here: Regex101 demo