How do I extract the content of a specific div from a web page?

Question

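I want to load the content of a specific div with class='box' from a web page, and I am using Simple HTML DOM for this. However, I can't write a clean pattern for preg_match. Here is my PHP code: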

<?php
   $url = "http://www.example.com/pages/";
   $page_all = file_get_contents($url); 

   preg_match(...?);


   echo "<pre>";
   print_r($div_array[0]);
   echo "</pre>";
?>

Please help me write the correct pattern for preg_match.

Answer 1

Using Simple HTML DOM:

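// Requires the Simple HTML DOM library; adjust the path to your copy
require_once 'simple_html_dom.php';

$html = new simple_html_dom();

// Load from a string (the apostrophe must be escaped inside single quotes)
$html->load('<html><body><p>Hello World!</p><p>We\'re here</p></body></html>');

// Or load from a file or URL (this replaces the document loaded above)
$html->load_file('http://net.tutsplus.com/');

// Get the first div with class "box"; without the index, find() returns an array
$element = $html->find('div[class=box]', 0);

if ($element) {
    // Append to the element's inner HTML
    $element->innertext .= 'Something';

    // Output the element's HTML, including the appended text
    echo $element->outertext;
}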

Answer 2

You should check out: http://simplehtmldom.sourceforge.net/

An example:

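// file_get_html() fetches the page and returns a parsed DOM object,
// so no separate "new simple_html_dom()" is needed
$html = file_get_html('http://www.example.com/pages/');

// find() returns an array of every div whose class attribute is "box"
$ret = $html->find('div[class=box]');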

Don't waste time on regex; there are plenty of tools that can do this job.
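One such tool ships with PHP itself: the DOM extension (DOMDocument and DOMXPath). The following is a minimal sketch, not the method from either answer. It assumes the DOM extension is enabled, uses a hypothetical helper name (extract_divs_by_class), and parses an inline sample string instead of fetching http://www.example.com/pages/. In real use you would pass the result of file_get_contents($url) and only simple class names without quote characters.

```php
<?php
// Hypothetical sample markup standing in for the fetched page.
$page = '<html><body>'
      . '<div class="box"><p>Hello World!</p></div>'
      . '<div class="other">Ignored</div>'
      . '</body></html>';

// Hypothetical helper: return the inner HTML of every div whose
// class list contains $class.
function extract_divs_by_class(string $page, string $class): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings
    // internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($page);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Match the class as a whole word so "box" does not also match "box1".
    // Assumes $class contains no quote characters.
    $query = sprintf(
        '//div[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    $results = [];
    foreach ($xpath->query($query) as $div) {
        // Concatenate the serialized children to get the inner HTML.
        $inner = '';
        foreach ($div->childNodes as $child) {
            $inner .= $doc->saveHTML($child);
        }
        $results[] = $inner;
    }

    return $results;
}

$div_array = extract_divs_by_class($page, 'box');

echo "<pre>";
print_r($div_array[0]);
echo "</pre>";
```

The final three lines mirror the output code from the question, so $div_array can be dropped into the original script in place of the preg_match call. Note that loadHTML() assumes ISO-8859-1 unless the page declares its encoding, so pages in other encodings may need extra handling.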

Solution 1
2 Accepted 2012-02-06 06:42:08

Solution 2
1 2012-02-06 06:32:50