Separate a string with a regex

Question

When I try to separate this string:

<b>Pristatymo laikas: </b>08-17h (visoje Lietuvoje)<br /><b>Dovanų kuponai:</b> <br />Panaudotas 200.00 Lt. dovanų kuponas, kurio kodas: xxxxx<br /><b>Mokėtina suma:</b> 12.00 Lt. <br />

with the regex pattern:

<b>(.*)</b>

I get this match:

<b>Pristatymo laikas: </b>08-17h (visoje Lietuvoje)<br /><b>Dovanų kuponai:</b> <br />Panaudotas 200.00 Lt. dovanų kuponas, kurio kodas: xxxxx<br /><b>Mokėtina suma:</b>

But I want to get each <b> tag separately, like:

<b>Pristatymo laikas: </b>
<b>Dovanų kuponai:</b>
<b>Mokėtina suma:</b>

How do I write the correct pattern?

Answer 1

Use .*? instead:

<b>(.*?)</b>

The ? after * makes the quantifier non-greedy (lazy): it matches as little as possible and therefore stops at the first </b> it encounters.
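To see the difference side by side, here is a minimal PHP sketch that runs both patterns against the string from the question. The `~` delimiter, the `u` (UTF-8) modifier, and the variable names `$greedy` and `$lazy` are choices made for this illustration, not something the question specifies.

```php
<?php
// Sample input copied from the question.
$html = '<b>Pristatymo laikas: </b>08-17h (visoje Lietuvoje)<br /><b>Dovanų kuponai:</b> <br />Panaudotas 200.00 Lt. dovanų kuponas, kurio kodas: xxxxx<br /><b>Mokėtina suma:</b> 12.00 Lt. <br />';

// Greedy: .* consumes the rest of the line, then backtracks to the LAST </b>.
preg_match_all('~<b>(.*)</b>~u', $html, $greedy);

// Lazy: .*? grows one character at a time and stops at the FIRST </b>.
preg_match_all('~<b>(.*?)</b>~u', $html, $lazy);

echo count($greedy[0]), "\n"; // 1 (one long match)
echo count($lazy[0]), "\n";   // 3 (one match per <b> element)

// $lazy[0] holds the full matches, $lazy[1] the captured text between the tags.
foreach ($lazy[0] as $tag) {
    echo $tag, "\n";
}
```

If only the label text is needed, without the surrounding tags, `$lazy[1]` holds the contents of the capture group.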


Answer 2

You need to follow .* with ? for a non-greedy match.

<b>(.*?)</b>

Although a simple regular expression works here, it is better to use an HTML parser for this.

$html = '<b>Pristatymo laikas: </b>08-17h (visoje Lietuvoje)<br />
<b>Dovanų kuponai:</b> <br />Panaudotas 200.00 Lt. dovanų kuponas, kurio kodas:
xxxxx<br /><b>Mokėtina suma:</b> 12.00 Lt. <br />';

$doc = new DOMDocument();
// loadHTML() assumes ISO-8859-1 unless told otherwise; prepending an XML
// encoding declaration makes it read the input as UTF-8.
$doc->loadHTML('<?xml encoding="UTF-8">' . $html);

$xpath = new DOMXPath($doc);

// Print each <b> element as serialized HTML.
foreach ($xpath->query('//b') as $tag) {
    echo $tag->ownerDocument->saveHTML($tag) . "\n";
}

Output:

<b>Pristatymo laikas: </b>
<b>Dovanų kuponai:</b>
<b>Mokėtina suma:</b>

2 answers

solution1
3 ACCEPTED 2014-06-26 13:07:42

solution2
1 2014-06-26 13:16:43
