Currently I am replacing all quotes inside a text with special quotes. How can I change my regex so that only quotes in the text itself are replaced, and not the ones used inside HTML tags?
$text = preg_replace('/"(?=\w)/', "»", $text);
$text = preg_replace('/(?<=\w)"/', "«", $text);
I am not very experienced with regular expressions. The difficulty is that opening quotes must be replaced with a different symbol than closing quotes.
If you do need more information, say so.
Any help is appreciated!
EDIT
Test Case
<p>This is a "wonderful long text". At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>
The expected output should be:
<p>This is a »wonderful long text«. At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>
Right now it is like this:
<p>This is a »wonderful long text«. At least it should be. Here we have a <a href=»http://wwww.site-to-nowhere.com« target=»_blank«>link</a>.</p>
EDIT 2
Thanks to Kamehameha's answer, I've added the following code to my script:
$text = preg_replace("/\"([^<>]*?)\"(?=[^>]+?<)/", "»\1«", $text);
What worked great in the regex tester does not insert the quoted text here. Did I do anything wrong?
This regex works for the given strings.
Search for - "([^<>]*?)"(?=[^>]*?<)
Replace with - »\1«
Testing it -
INPUT -
<p>This is a "wonderful long text". "Another wonderful ong text" At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>
OUTPUT -
<p>This is a »wonderful long text«. »Another wonderful ong text« At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>
EDIT 1-
Executing this in PHP -
$str = '<p>This is a "wonderful long text". "Another wonderful ong text" At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>';
var_dump(preg_replace('/"([^<>]*?)"(?=[^>]*?<)/', '»\1«', $str));
Its output -
/** OUTPUT **/
string '<p>This is a »wonderful long text«. »Another wonderful ong text« At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>' (length=196)
EDIT 2-
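You have called the preg_replace function properly, but in the replacement string you have used \1 inside double quotes (""). In a double-quoted PHP string, \1 is parsed as an octal escape sequence, so the replacement contains the character with code 1 instead of the backreference, and the captured text is never inserted.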
To make it more clear, try this and see what happens -
echo '»\1«';
echo "»\1«";
The \1 in the second line should not be visible.
So the solution would be one of these -
preg_replace('/"([^<>]*?)"(?=[^>]*?<)/', '»\1«', $str)
preg_replace("/\"([^<>]*?)\"(?=[^>]*?<)/", "»\\1«", $str)
preg_replace("/\"([^<>]*?)\"(?=[^>]*?<)/", "»$1«", $str)
For more detail, read the description of the replacement parameter in the PHP manual page for preg_replace.
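The quoting difference can be checked directly. The following is a minimal sketch, assuming a short hypothetical sample string ($str below is made up for illustration); it compares the two string literals and then confirms that the three corrected calls give the same result, while the double-quoted "\1" version inserts a control character instead of the captured text.

```php
<?php
// Single quotes: \1 stays a backslash followed by "1", which PCRE reads as a backreference.
$single = '»\1«';
// Double quotes: PHP parses \1 as an octal escape, i.e. the character with code 1.
$double = "»\1«";
var_dump($double === "»" . chr(1) . "«"); // the backreference is gone before preg_replace sees it

// Hypothetical sample input, not taken from the question.
$str = '<p>A "quoted" word and a <a href="#">link</a>.</p>';

// The three corrected variants from above.
$a = preg_replace('/"([^<>]*?)"(?=[^>]*?<)/', '»\1«', $str);
$b = preg_replace("/\"([^<>]*?)\"(?=[^>]*?<)/", "»\\1«", $str);
$c = preg_replace("/\"([^<>]*?)\"(?=[^>]*?<)/", "»$1«", $str);

// The problematic variant: "\1" in double quotes.
$broken = preg_replace("/\"([^<>]*?)\"(?=[^>]*?<)/", "»\1«", $str);

var_dump($a === $b && $b === $c);
var_dump($a);
```

"$1" is safe in double quotes because PHP does not treat $1 as a variable (variable names cannot start with a digit).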
EDIT 3-
A regex that covers text which might not be enclosed within tags-
\"([^<>]*?)\"(?=(?:[^>]*?(?:<|$)))
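As a rough check of the extended pattern, here is a small sketch with two hypothetical inputs (both strings are made up for illustration): one without any tags, and one where the quoted text follows the last tag. It also shows that the earlier pattern, which requires a following "<", leaves the tag-free string untouched.

```php
<?php
// Extended pattern: the lookahead accepts either a following "<" or the end of the string.
$pattern = '/"([^<>]*?)"(?=(?:[^>]*?(?:<|$)))/';
// Earlier pattern, which needs a "<" somewhere after the closing quote.
$earlier = '/"([^<>]*?)"(?=[^>]*?<)/';

// Hypothetical inputs for illustration.
$plain = 'He said "hello" and left.';
$mixed = '<a href="#top">Up</a> and a "quoted" tail';

$plainOut   = preg_replace($pattern, '»$1«', $plain);
$mixedOut   = preg_replace($pattern, '»$1«', $mixed);
$earlierOut = preg_replace($earlier, '»$1«', $plain);

var_dump($plainOut, $mixedOut, $earlierOut);
```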
Could also use a negative lookahead:
(?![^<]*>)"([^"]+)"
Replace with: »\1«
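A short sketch of the negative-lookahead variant in PHP, run against a shortened version of the test case from the question. The lookahead rejects any opening quote that is followed by a ">" before the next "<", which is what a quote inside a tag looks like. This assumes the visible text itself contains no literal ">" character.

```php
<?php
// (?![^<]*>) fails when a ">" appears before the next "<", i.e. when we are inside a tag.
$pattern = '/(?![^<]*>)"([^"]+)"/';

$text = '<p>This is a "wonderful long text". Here we have a '
      . '<a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>';

$out = preg_replace($pattern, '»$1«', $text);

var_dump($out);
```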
For the record, there is a simple PHP solution that was not mentioned and that efficiently skips over all the <a...</a> tags.
Search: <a.*?<\/a>(*SKIP)(*F)|"([^"]*)"
Replace: »\1«
In the demo, look at the substitutions at the bottom.
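In PHP this could look like the sketch below. It uses ~ as the delimiter so the slash in </a> needs no escaping. Note that the pattern as given only protects <a>...</a> elements; quotes in attributes of other tags would still be rewritten. The second pattern, which skips every tag, is my own variation and not part of the answer above; the $other input is hypothetical.

```php
<?php
// First alternative matches a whole <a>...</a> element, then (*SKIP)(*F) discards it
// and resumes scanning after it; the second alternative matches the remaining quotes.
$pattern = '~<a.*?</a>(*SKIP)(*F)|"([^"]*)"~';

$text = '<p>This is a "wonderful long text". Here we have a '
      . '<a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>';
$out = preg_replace($pattern, '»$1«', $text);

// Assumed variation (not from the answer): skip any tag, not just <a>...</a>.
$anyTag = '~<[^>]*>(*SKIP)(*F)|"([^"]*)"~';
$other  = '<p class="x">"hi"</p>'; // hypothetical input
$otherOut = preg_replace($anyTag, '»$1«', $other);

var_dump($out, $otherOut);
```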
Reference
How to match (or replace) a pattern except in situations s1, s2, s3...
Use this regex:
(?<=^|>)[^><]+?(?=<|$)
This matches the text segments that lie outside HTML tags.
Then run your quote-replacing regexes on each matched segment.
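One way to combine the two steps is preg_replace_callback. This is a sketch under the assumption that the question's original two replacements are applied to each matched text segment; the callback is illustrative and not part of the answer.

```php
<?php
$text = '<p>This is a "wonderful long text". At least it should be. Here we have a '
      . '<a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>';

$out = preg_replace_callback(
    // Text that starts at the beginning or right after ">" and runs up to "<" or the end.
    '/(?<=^|>)[^><]+?(?=<|$)/',
    function (array $m): string {
        // Opening quote: a quote directly followed by a word character.
        $segment = preg_replace('/"(?=\w)/', '»', $m[0]);
        // Closing quote: a quote directly preceded by a word character.
        return preg_replace('/(?<=\w)"/', '«', $segment);
    },
    $text
);

var_dump($out);
```

Because the quote patterns only ever see text between tags, the attribute values in the <a> tag are never touched.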