Wrapping words in a sentence using regex

Question

I'm converting sentences like:

Phasellus turpis, elit. Tempor et lobortis? Venenatis: sed enim!

to:

_________ ______, ____. ______ __ ________? _________: ___ ____!

using:

utf8_encode(preg_replace("/[^.,:;!?¿¡ ]/", "_", utf8_decode($ss->phrase) ))

But I'm facing a problem: Google is indexing all those empty words as keywords. I'd like to convert the original strings to something invisible to Google, like:

<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span> <span>&nbsp;&nbsp;&nbsp;&nbsp;</span>, ....

using:

.parent span { text-decoration:underline; }

that is, wrapping each word in a span tag, replacing each of its characters with &nbsp;, and leaving the special characters .,:;!?¿¡ and spaces untouched.

Is it possible to solve this with a regex? I currently solve it with a not very efficient loop that scans every character of the string, but I have to process many sentences per page.

Answer 1

Use preg_replace_callback and have the callback build the appropriate replacement. Something along these lines (untested):

function replacer($match) {
    return "<span>".str_repeat("&nbsp;",strlen($match[1]))."</span>";
}

// Note the addition of the () and the + near the end of the regex
$masked = utf8_encode(preg_replace_callback("/([^.,:;!?¿¡ ]+)/", "replacer", utf8_decode($ss->phrase)));
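If the phrases are UTF-8, a variant that skips the utf8_decode/utf8_encode round trip (both functions are deprecated as of PHP 8.2) might look like the sketch below. It is not tested against the asker's data. It assumes the mbstring extension is loaded and that the source file itself is saved as UTF-8, and mask_words is a hypothetical helper name, not something from the question.

```php
<?php
// Hypothetical helper; the name and signature are assumptions for illustration.
function mask_words(string $phrase): string
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8,
    // so "¿" and "¡" in the class are single characters, not byte pairs.
    $result = preg_replace_callback(
        '/[^.,:;!?¿¡ ]+/u',
        function (array $match): string {
            // mb_strlen counts characters rather than bytes,
            // so an accented letter yields exactly one &nbsp;.
            $length = mb_strlen($match[0], 'UTF-8');
            return '<span>' . str_repeat('&nbsp;', $length) . '</span>';
        },
        $phrase
    );

    // preg_replace_callback returns null on error (e.g. invalid UTF-8 input);
    // fall back to an empty string rather than leaking the unmasked phrase.
    return $result ?? '';
}

echo mask_words('et, ¿sí?');
// prints: <span>&nbsp;&nbsp;</span>, ¿<span>&nbsp;&nbsp;</span>?
```

The whole sentence is still handled in a single regex pass, so there is no per-character loop in PHP code.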

Answer 2

$yourphrase = preg_replace('/([^\W]+)/si', '<span>$1</span>', $yourphrase);

This will wrap all the "_"-words in spans.

IMHO you need a two-step procedure here: first convert the letters to underscores (which apparently works already), then wrap the "_"-words in spans (with my regex).
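A rough sketch of that two-step procedure, assuming UTF-8 input. mask_then_wrap is a hypothetical name, and \w is used as a shorter equivalent of [^\W]; it matches the underscore, which is why the second step picks up the masked words. Note that the underscores remain in the markup, so on its own this may not address the indexing concern from the question.

```php
<?php
// Hypothetical helper illustrating the two steps; not from the original answer.
function mask_then_wrap(string $phrase): string
{
    // Step 1: replace every character that is not punctuation or a space
    // with an underscore (the /u modifier treats the input as UTF-8).
    $masked = preg_replace('/[^.,:;!?¿¡ ]/u', '_', $phrase) ?? '';

    // Step 2: wrap each run of word characters (now only underscores) in a span.
    return preg_replace('/(\w+)/', '<span>$1</span>', $masked) ?? '';
}

echo mask_then_wrap('et, sed!');
// prints: <span>__</span>, <span>___</span>!
```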

2 answers

solution1
1 ACCEPTED 2012-08-28 03:23:30

solution2
0 2012-08-28 03:04:59