
Replace all quotes that are not in html-tags

Currently I am replacing all quotes inside a text with special quotes. How can I change my regex so that only quotes inside the text are replaced, and not the ones used in HTML tags?

$text = preg_replace('/"(?=\w)/', "&raquo;", $text);
$text = preg_replace('/(?<=\w)"/', "&laquo;", $text);

I am not very experienced with regular expressions. The problem is that I need to replace opening quotes with a different symbol than closing quotes.

If you need more information, just ask.

Any help is appreciated!

EDIT

Test Case

<p>This is a "wonderful long text". At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>

The expected output should be:

<p>This is a &raquo;wonderful long text&laquo;. At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>

Right now it is like this:

<p>This is a &raquo;wonderful long text&laquo;. At least it should be. Here we have a <a href=&raquo;http://wwww.site-to-nowhere.com&laquo; target=&raquo;_blank&laquo;>link</a>.</p>

EDIT 2

Thanks to Kamehameha's answer, I've added the following code to my script:

$text = preg_replace("/\"([^<>]*?)\"(?=[^>]+?<)/", "&raquo;\1&laquo;", $text);

It worked great in the regex tester, but in my script it does not replace anything. Did I do something wrong?

This regex works for the given strings.

Search for   - "([^<>]*?)"(?=[^>]*?<)
Replace with - &raquo;\1&laquo;

Demo here
Testing it -

INPUT - 
<p>This is a "wonderful long text". "Another wonderful ong text" At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>

OUTPUT - 
<p>This is a &raquo;wonderful long text&laquo;. &raquo;Another wonderful ong text&laquo; At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>

EDIT 1-
Executing this in PHP -

$str = '<p>This is a "wonderful long text". "Another wonderful ong text" At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>';
var_dump(preg_replace('/"([^<>]*?)"(?=[^>]*?<)/', '&raquo;\1&laquo;', $str));

Its output -

/** OUTPUT **/
string '<p>This is a &raquo;wonderful long text&laquo;. &raquo;Another wonderful ong text&laquo; At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>' (length=198)

EDIT 2-
You have called preg_replace correctly, but in the replacement string you have used \1 inside double quotes (""). In a double-quoted PHP string, \1 is parsed as an octal escape sequence (the character with code 1), so the regex engine never sees a backreference and the captured text is not inserted.
To make this clearer, try the following and see what happens -

echo '&raquo;\1&laquo;';
echo "&raquo;\1&laquo;";

In the output of the second line, the \1 should not be visible.
So the solution would be one of these -

preg_replace('/"([^<>]*?)"(?=[^>]*?<)/', '&raquo;\1&laquo;', $str)
preg_replace("/\"([^<>]*?)\"(?=[^>]*?<)/", "&raquo;\\1&laquo;", $str)
preg_replace("/\"([^<>]*?)\"(?=[^>]*?<)/", "&raquo;$1&laquo;", $str)

Read the Replacement section in this page for more clarity.
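
To see the difference between the two quoting styles directly, here is a minimal sketch (the variable names are only for illustration). It compares the byte length of the replacement string in both forms; the double-quoted one is a byte shorter because \1 has already been turned into a single control character before preg_replace ever sees it.

```php
<?php
// Double quotes: PHP parses "\1" as an octal escape, i.e. one byte with code 1.
$double = "&raquo;\1&laquo;";
// Single quotes: '\1' stays a backslash followed by the digit 1 (a backreference for PCRE).
$single = '&raquo;\1&laquo;';

var_dump(strlen($double));                                // int(15)
var_dump(strlen($single));                                // int(16)
var_dump($double === "&raquo;" . chr(1) . "&laquo;");     // bool(true)
var_dump("&raquo;\\1&laquo;" === $single);                // bool(true)
```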

EDIT 3-
A regex that also covers text that is not enclosed within tags -

\"([^<>]*?)\"(?=(?:[^>]*?(?:<|$)))

Demo here
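
As a rough illustration of what the (?:<|$) alternative changes, the sketch below runs both lookahead variants on a made-up input whose last quoted phrase is followed by the end of the string instead of a tag. The input string and variable names are assumptions for the example, not taken from the question.

```php
<?php
// Hypothetical input: one quoted phrase before a tag, one at the very end of the string.
$str = 'He said "hello" and <b class="x">left</b> with a "smile"';

// Lookahead from EDIT 2: requires a "<" somewhere after the closing quote.
$tagOnly = preg_replace('/"([^<>]*?)"(?=[^>]*?<)/', '&raquo;\1&laquo;', $str);

// Lookahead from EDIT 3: accepts either a "<" or the end of the string.
$tagOrEnd = preg_replace('/\"([^<>]*?)\"(?=(?:[^>]*?(?:<|$)))/', '&raquo;\1&laquo;', $str);

echo $tagOnly, "\n";
// He said &raquo;hello&laquo; and <b class="x">left</b> with a "smile"
echo $tagOrEnd, "\n";
// He said &raquo;hello&laquo; and <b class="x">left</b> with a &raquo;smile&laquo;
```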

Could also use a negative lookahead:

(?![^<]*>)"([^"]+)"

Replace with: &raquo;\1&laquo;
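
A minimal sketch of how this could look in PHP, using the test case from the question. The negative lookahead rejects any quote from which a ">" can be reached without first passing a "<", which is the situation inside a tag.

```php
<?php
$html = '<p>This is a "wonderful long text". At least it should be. '
      . 'Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>';

// Single-quoted PHP strings keep \1 intact for the regex engine.
$result = preg_replace('/(?![^<]*>)"([^"]+)"/', '&raquo;\1&laquo;', $html);

echo $result, "\n";
// <p>This is a &raquo;wonderful long text&laquo;. At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>
```

Note that this relies on quotes in the text being balanced; an unpaired quote in the text could pair up with a later one.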

For the record, there is a simple PHP solution that was not mentioned and that efficiently skips over all the <a...</a> tags.

Search: <a.*?<\/a>(*SKIP)(*F)|"([^"]*)"

Replace: &raquo;\1&laquo;

In the demo, look at the Substitutions pane at the bottom.

Reference

How to match (or replace) a pattern except in situations s1, s2, s3...
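
For completeness, a short sketch of the (*SKIP)(*F) approach in PHP, again on the test case from the question. It uses ~ as the delimiter so the slash in </a> needs no escaping; that choice is mine, not part of the answer above.

```php
<?php
$html = '<p>This is a "wonderful long text". At least it should be. '
      . 'Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>';

// Left branch: match a whole <a ...>...</a> element, then discard it with (*SKIP)(*F).
// Right branch: match a quoted phrase anywhere else and capture its content.
$pattern = '~<a.*?</a>(*SKIP)(*F)|"([^"]*)"~';
$result  = preg_replace($pattern, '&raquo;\1&laquo;', $html);

echo $result, "\n";
// <p>This is a &raquo;wonderful long text&laquo;. At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>
```

Keep in mind that this pattern only protects <a> elements. Quotes in the attributes of other tags would still be replaced, so the left branch would need to be widened for general HTML.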

Use this regex:

(?<=^|>)[^><]+?(?=<|$)

This matches the text segments that lie outside HTML tags.

Then run your quote-replacement regex on each matched segment.
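
One way to combine the two steps is preg_replace_callback: the outer pattern picks out each text segment between tags, and the callback applies the quote replacement to that segment only. This is a sketch under that assumption, and the inner pattern "([^"]*)" is my own simple choice rather than something given in this answer.

```php
<?php
$html = '<p>This is a "wonderful long text". At least it should be. '
      . 'Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>';

$result = preg_replace_callback(
    // Outer pattern: a run of non-tag characters that starts right after ">" (or at the
    // start of the string) and ends right before "<" (or at the end of the string).
    '/(?<=^|>)[^><]+?(?=<|$)/',
    function (array $m): string {
        // Inner replacement: runs only on the matched text segment, never on tag attributes.
        return preg_replace('/"([^"]*)"/', '&raquo;\1&laquo;', $m[0]);
    },
    $html
);

echo $result, "\n";
// <p>This is a &raquo;wonderful long text&laquo;. At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>
```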
