Scraping with PHP Simple HTML DOM Parser

Question

I want to use PHP Simple HTML DOM Parser to scrape a website. The source markup is irregular, like this:

      <font face="Arial" color="#ff0000">
      <p>Parameters</p>
      </font><font face="Arial" size="2" color="#ff0000">
      <p>Param1</p>
      </font><font face="Arial" size="2" color="#0000ff">
      <p>Details. (Lob., </font><i><font face="Arial"
      size="2" color="#ff0000">Co v</font><font face="Arial" size="2"
      color="#0000ff">.)</p>

Instead of putting "Details. (Lob., Co v.)" directly inside <p>, the page splits it across <font> and <i> tags. When I use this code:

foreach($html->find('p') as $p) 
{
  echo $p->plaintext.'<br>';
}

I only get "Details. (Lob.,": the output stops at the first <i> or <font> tag. How can I extract the whole line "Details. (Lob., Co v.)"?

Thank you for your answer.

Answer 1

You can use the strip_tags() function to remove the unnecessary tags. After removing them, you can use the DOM parser.

The strip_tags() function strips HTML, XML, and PHP tags from a string.

string strip_tags ( string $str [, string $allowable_tags ] )
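To make the optional second parameter concrete, here is a minimal sketch. The input string and the variable names are made up for illustration; only strip_tags() itself comes from the answer.

```php
<?php
// Illustrative input; not taken from the question.
$snippet = '<p>Hello <b>big</b> <i>world</i></p>';

// With no second argument, every tag is removed.
$textOnly = strip_tags($snippet);

// With '<p>' listed as allowable, the <p> wrapper survives
// and the inline <b> and <i> tags are dropped.
$paragraphOnly = strip_tags($snippet, '<p>');

echo $textOnly, "\n";
echo $paragraphOnly, "\n";
```

Allowing only <p> is what lets the paragraph boundaries survive while the <font> and <i> wrappers disappear.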

You can read more about the strip_tags() function on php.net.

Example:

$html = '<font face="Arial" color="#ff0000">
    <p>Parameters</p>
    </font><font face="Arial" size="2" color="#ff0000">
    <p>Param1</p>
    </font><font face="Arial" size="2" color="#0000ff">
    <p>Details. (Lob., </font><i><font face="Arial"
    size="2" color="#ff0000">Co v</font><font face="Arial" size="2"
    color="#0000ff">.)</p>';

// Keep only the <p> tags; every other tag is stripped.
$html = strip_tags($html, '<p>');
echo $html;

Result:

  <p>Parameters</p>

  <p>Param1</p>

  <p>Details. (Lob., Co v.)</p>
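Once the markup is reduced to plain <p> elements, the paragraphs can be read with a DOM parser. The sketch below is one way to do that, not the only one. It assumes PHP's built-in DOMDocument as a stand-in for Simple HTML DOM so that it runs without extra libraries; the names $clean, $doc, and $lines are illustrative.

```php
<?php
// Same fragment as in the question.
$html = '<font face="Arial" color="#ff0000">
    <p>Parameters</p>
    </font><font face="Arial" size="2" color="#ff0000">
    <p>Param1</p>
    </font><font face="Arial" size="2" color="#0000ff">
    <p>Details. (Lob., </font><i><font face="Arial"
    size="2" color="#ff0000">Co v</font><font face="Arial" size="2"
    color="#0000ff">.)</p>';

// Keep only the <p> tags, as in the answer.
$clean = strip_tags($html, '<p>');

// Assumption: the built-in DOMDocument replaces Simple HTML DOM here.
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($clean);
libxml_clear_errors();

$lines = [];
foreach ($doc->getElementsByTagName('p') as $p) {
    // Collapse any whitespace runs into single spaces.
    $lines[] = trim(preg_replace('/\s+/', ' ', $p->textContent));
}

echo implode("<br>\n", $lines), "\n";
```

With Simple HTML DOM itself, the cleaned string could instead be passed to the library's string loader and iterated with the same find('p') loop from the question.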

Answer 1 (accepted, 2017-01-23 21:25:19)
