How to get HTML content with preg_match when there are line breaks?
How do I account for line breaks in my regex when running preg_match against HTML source?
PHP:
preg_match('/Корица:<\/b><\/td><td>(.*)<\/td>/im', $table[0], $korica);
HTML (this is also the content of $table[0]):
<tr>
<td><b>Година на издаване:</b></td>
<td itemprop="datePublished">2009</td>
</tr>
<tr>
<td><b>Корица:</b></td>
<td>Мека</td>
</tr>
<tr>
<td><b>Език:</b></td>
<td itemprop="inLanguage">Български</td>
</tr>
<tr>
<td><b>Средна оценка:</b></td>
<td> Продуктът няма оценка </td>
</tr>
If I use preg_match_all, I get all the HTML after Корица. What I want is to get only the value Мека from the HTML.
Change the (.*) part of the regex to the non-greedy (.*?), or even better, to ([^<]*), which matches any run of characters other than <.
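A minimal sketch of the ([^<]*) suggestion, assuming the cells in $table[0] are separated by line breaks as in the HTML shown in the question. The $html sample string, the added \s* between the tags, and the u modifier are assumptions made for illustration; they are not part of the original pattern.

```php
<?php
// Sample rows modeled on the question's HTML; the line breaks between cells are intentional.
$html = "<tr>\n<td><b>Корица:</b></td>\n<td>Мека</td>\n</tr>\n"
      . "<tr>\n<td><b>Език:</b></td>\n<td itemprop=\"inLanguage\">Български</td>\n</tr>";

// \s* tolerates whitespace (including newlines) between the two cells.
// [^<]* stops at the next tag, so the capture cannot run past the cell.
// The u modifier treats pattern and subject as UTF-8.
$pattern = '/Корица:<\/b><\/td>\s*<td>([^<]*)<\/td>/u';

if (preg_match($pattern, $html, $korica)) {
    echo trim($korica[1]), PHP_EOL; // prints "Мека"
}
```

Because [^<] also matches newline characters, this form does not need the s modifier, which (.*?) would require if the value itself could span several lines.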
If the value (here "Мека") is always alphanumeric, then something like this might work:
preg_match('/Корица:<\/b><\/td><td>([a-zA-Z0-9]*)<\/td>/im', $table[0], $korica);
[a-zA-Z0-9]* matches only alphanumeric characters. You might need to allow for spaces too, in which case use [a-zA-Z0-9 ]* (note the space before the closing ]).
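One caveat: [a-zA-Z0-9] covers only ASCII letters and digits, while the cell value in the question's HTML is Cyrillic. A hedged sketch of a Unicode-aware variant, assuming the input is UTF-8; the $cell sample string and both variable names are hypothetical:

```php
<?php
// Single-line sample modeled on the question's row.
$cell = '<td><b>Корица:</b></td><td>Мека</td>';

// ASCII-only class: cannot consume the Cyrillic value, so the match fails.
$ascii = '/Корица:<\/b><\/td><td>([a-zA-Z0-9 ]*)<\/td>/i';

// \p{L} matches any Unicode letter and \p{N} any Unicode digit.
// The u modifier is needed so the subject is read as UTF-8.
$unicode = '/Корица:<\/b><\/td><td>([\p{L}\p{N} ]*)<\/td>/iu';

var_dump(preg_match($ascii, $cell));        // int(0)
var_dump(preg_match($unicode, $cell, $m));  // int(1)
echo $m[1], PHP_EOL;                        // prints "Мека"
```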