
How to get HTML content with preg_match when there are line breaks?

How do I account for line breaks in my regex when running preg_match on HTML source?

PHP:

preg_match('/Корица:<\/b><\/td><td>(.*)<\/td>/im', $table[0], $korica);

HTML (this is also what is in $table[0]):

<tr>
  <td><b>Година на издаване:</b></td>
  <td itemprop="datePublished">2009</td>
</tr>
<tr>
  <td><b>Корица:</b></td>
  <td>Мека</td>
</tr>
<tr>
  <td><b>Език:</b></td>
  <td itemprop="inLanguage">Български</td>
</tr>
<tr>
  <td><b>Средна оценка:</b></td>
  <td>                  Продуктът няма оценка                  </td>
</tr>

If I use preg_match_all, I get all the HTML after Корица. What I want is to get only the value Мека from the HTML.

Change the (.*) part of the regex to the non-greedy (.*?), or even better, to ([^<]*), which matches any run of characters other than <.
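A minimal sketch of that suggestion follows. It assumes $html is a stand-in for $table[0] from the question. The \s* between the two cells and the u modifier are additions not mentioned in the answer: the first tolerates the line break and indentation between </td> and <td> in the sample markup, the second tells PCRE to treat the Cyrillic pattern and subject as UTF-8.

```php
<?php
// $html is a hypothetical stand-in for $table[0] from the question.
$html = <<<HTML
<tr>
  <td><b>Корица:</b></td>
  <td>Мека</td>
</tr>
<tr>
  <td><b>Език:</b></td>
  <td itemprop="inLanguage">Български</td>
</tr>
HTML;

// \s*      tolerates the newline and indentation between the two cells.
// ([^<]*)  stops at the next "<", so the capture cannot run past </td>.
// u        treats pattern and subject as UTF-8 (assumed input encoding).
$found = preg_match('/Корица:<\/b><\/td>\s*<td>([^<]*)<\/td>/u', $html, $korica);

if ($found === 1) {
    // trim() drops any padding inside the cell.
    echo trim($korica[1]), PHP_EOL; // prints "Мека"
}
```

Because [^<] also matches newlines, this form does not need the s modifier, whereas (.*?) would need it if the cell content itself spanned several lines.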

If "Meka" is always alphanumeric, then something like this might work:

    preg_match('/Корица:<\/b><\/td><td>([a-zA-Z0-9]*)<\/td>/im', $table[0], $korica);

[a-zA-Z0-9]* should match only alphanumeric characters. You might have to allow for a space too, in which case you should use [a-zA-Z0-9 ]* (notice the space before the closing ]).
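One caveat: [a-zA-Z0-9] covers only ASCII letters and digits, while the value in the sample markup is Cyrillic. The sketch below compares the ASCII class with a Unicode-aware one. It assumes UTF-8 input, uses a hypothetical $html string in place of $table[0], and adds \s* between the cells to absorb the line break; \p{L} and \p{N} are PCRE's Unicode letter and number properties and need the u modifier.

```php
<?php
// Hypothetical sample standing in for $table[0] from the question.
$html = "<td><b>Корица:</b></td>\n            <td>Мека</td>";

// ASCII-only class: cannot consume the Cyrillic letters, so there is no match.
$ascii = preg_match(
    '/Корица:<\/b><\/td>\s*<td>([a-zA-Z0-9 ]*)<\/td>/u',
    $html,
    $asciiMatch
);

// \p{L} = any Unicode letter, \p{N} = any Unicode number, plus a literal space.
$unicode = preg_match(
    '/Корица:<\/b><\/td>\s*<td>([\p{L}\p{N} ]*)<\/td>/u',
    $html,
    $unicodeMatch
);

echo 'ASCII class matched: ', $ascii, PHP_EOL;     // 0
echo 'Unicode class matched: ', $unicode, PHP_EOL; // 1
if ($unicode === 1) {
    echo $unicodeMatch[1], PHP_EOL;                // Мека
}
```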
