I have a piece of HTML code like the following:
<td width="24%"><b>Something</b></td>
<td width="1%"></td>
<td width="46%" align="center">
<p><b>
needed
value</b></p>
</td>
<td width="28%" align="center">
</td>
</tr>
What is a good regex pattern to extract the first text node (the text inside, not the tags) after the word Something?
I mean I want to extract
needed
value
and nothing more.
I can't figure out a working regex pattern in PHP.
EDIT: I am not parsing a whole HTML document, only a few lines of it, so I want to do this with a regex and without an HTML parser.
Ignoring potential issues parsing HTML with regex, the following pattern should match your example code:
Something(?:(?:<[^>]+>)|\s)*([\w\s*]+)
This matches Something, followed by any run of HTML tags or whitespace, and then captures the very next block of text: one or more word characters (\w), including any whitespace between them.
You can use this with PHP's preg_match() function like this:
if (preg_match('/Something(?:(?:<[^>]+>)|\s)*([\w\s*]+)/', $inputString, $match)) {
$matchedValue = $match[1];
// do whatever you need
}
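To try the pattern above against the markup from the question, here is a minimal sketch. The helper name firstTextAfter() and the trimmed-down $html sample are my own for illustration, not part of any library; the helper also escapes the label with preg_quote() and trims the capture, since the matched text can carry trailing whitespace.

```php
<?php
// Hypothetical helper (name and signature are assumptions for this example):
// returns the first text node that follows $label, or null if nothing matches.
function firstTextAfter(string $label, string $html): ?string
{
    // Same pattern as above; the label is escaped so it is matched literally.
    $pattern = '/' . preg_quote($label, '/') . '(?:(?:<[^>]+>)|\s)*([\w\s*]+)/';

    if (preg_match($pattern, $html, $match) !== 1) {
        return null;
    }

    // The capture may end with a newline or indentation, so trim it.
    return trim($match[1]);
}

// Sample markup taken from the question.
$html = <<<'HTML'
<td width="24%"><b>Something</b></td>
<td width="1%"></td>
<td width="46%" align="center">
<p><b>
needed
value</b></p>
</td>
HTML;

var_dump(firstTextAfter('Something', $html));
```

With this sample the helper should return the two words needed and value, still separated by the original line break. Keep in mind that the capture stops at the first character that is not a word character, whitespace or *, so text containing punctuation such as a hyphen or a period would be cut short.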
Regex Explained:
Something # has to start with 'Something'
(?: # non-capturing group
(?: # non-capturing group
<[^>]+> # any HTML tags, <...>
)
| \s # OR whitespace
)* # this group can match 0+ times
( # capturing group 1, available as $match[1]
[\w\s*]+ # word characters and whitespace (the * inside the class is a literal asterisk)
)
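The commented breakdown above can also be kept in the code itself. A sketch, assuming PCRE's x (extended) modifier, which makes the engine ignore whitespace and #-comments inside the pattern; the variable names and the shortened sample string are mine. I use ~ as the delimiter here, so it must not appear anywhere in the pattern, comments included.

```php
<?php
// The x modifier lets the pattern span several lines with inline comments.
$commentedPattern = <<<'REGEX'
~
Something            # literal text that must come first
(?:                  # non-capturing group, repeated 0+ times
    <[^>]+>          #   an HTML tag, <...>
  | \s               #   OR a single whitespace character
)*
(                    # capturing group 1: the text we want
    [\w\s*]+         #   word characters, whitespace, or a literal asterisk
)
~x
REGEX;

// Shortened sample for illustration only.
$sample = "<td><b>Something</b></td>\n<td><p><b>\nneeded\nvalue</b></p></td>";

if (preg_match($commentedPattern, $sample, $m) === 1) {
    echo trim($m[1]), PHP_EOL;
}
```

This behaves the same as the one-line pattern; the only difference is readability.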