Parsing HTML: cURL and regular expressions

Question

How can I get the text "Text example max" from:

<td valign="top" align="left">

    <a href="/server?tree=xabaf"
    class="normal"> Text example max </a>

</td>

using a regular expression?

include('simple_html_dom.php');
$ch = curl_init('http://www.site.com?id=325235');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$sss = curl_exec($ch);
curl_close($ch);

preg_match_all('#class="normal"?</a>$#', $sss, $arr);

Answer 1

Solution using a regex:

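$text = "<a href='/server?tree=xabaf' class='normal'> Text example max </a>";
// Group 1 captures the href value, group 2 the link text (non-greedy, may span lines).
$regex_pattern = '/<a\s+href=["\']([^"\']*)["\'][^>]*>(.*?)<\/a>/s';
preg_match_all($regex_pattern, $text, $matches);
echo trim($matches[2][0]);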

Solution using PHP's DOM:

$text = "<a href='/server?tree=xabaf' class='normal'> Text example max </a>";
$dom = new DOMDocument;
$dom->loadHTML($text);
$links = $dom->getElementsByTagName('a');
foreach ($links as $link){
    echo $link->textContent;
}

Use the DOM rather than a regular expression.
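To pick out only the link with class="normal" from the full table cell in the question, rather than every <a> element, the DOM approach can be combined with an XPath query. This is a minimal sketch: the helper name linkTextByClass is hypothetical, and it assumes the class attribute is exactly the given value (no additional classes).

```php
<?php
// Hypothetical helper: trimmed text of every <a> whose class attribute equals $class.
function linkTextByClass(string $html, string $class): array
{
    $dom = new DOMDocument();
    // Scraped pages are rarely valid HTML; collect libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $texts = [];
    // $class is interpolated into the query, so it must not contain double quotes.
    foreach ($xpath->query('//a[@class="' . $class . '"]') as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

// The fragment from the question.
$html = '<td valign="top" align="left">
    <a href="/server?tree=xabaf"
    class="normal"> Text example max </a>
</td>';

print_r(linkTextByClass($html, 'normal'));
```

The line break between the href and class attributes, which makes a single-line regex fragile, does not matter to the DOM parser.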

Answer 2

Since there is no other text in the fragment, applying strip_tags() is enough.

$str ='<td valign="top" align="left">

    <a href="/server?tree=xabaf"
    class="normal"> Text example max </a>

</td>';

$str = trim(strip_tags($str));
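Note that strip_tags() keeps all text in the string, so this shortcut only holds while the link is the only text present. A small sketch with a hypothetical second cell (not part of the question) illustrates the limitation:

```php
<?php
// Hypothetical fragment: a second <td> is added to show that strip_tags() keeps every text node.
$row = '<td><a href="/server?tree=xabaf" class="normal"> Text example max </a></td><td>other</td>';

$plain = trim(strip_tags($row));
echo $plain; // the link text and the second cell's text run together
```

If the real page contains more than the one cell, a DOM-based lookup is likely the safer choice.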

Answer 3

You can try this...

include('simple_html_dom.php');

$url = 'http://www.site.com?id=325235';

$curl = curl_init(); 
curl_setopt($curl, CURLOPT_URL, $url);  
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);  
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 10);  
$str = curl_exec($curl);  
curl_close($curl);

$html = str_get_html($str);

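// The target is an <a> element, not a <div>; index 0 returns the first match instead of an array.
$content = $html->find('a[class=normal]', 0);
echo trim($content->innertext);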
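The cURL calls above do not check whether the request succeeded, so a failed fetch would pass false to the parser. A hedged sketch of the same fetch with basic error checks follows; the function name fetchPage and its $timeout parameter are hypothetical, and the URL is the placeholder from the question.

```php
<?php
// Hypothetical helper: returns the response body, or null if the request failed.
function fetchPage(string $url, int $timeout = 10): ?string
{
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, $timeout);

    $body = curl_exec($curl);
    $status = curl_getinfo($curl, CURLINFO_HTTP_CODE);

    // curl_exec() returns false on transport errors; HTTP errors still return a body.
    $failed = ($body === false) || $status >= 400;
    if ($failed) {
        error_log('Fetch failed: ' . (curl_error($curl) ?: "HTTP $status"));
    }
    return $failed ? null : $body;
}

// Usage (placeholder URL from the question):
// $str = fetchPage('http://www.site.com?id=325235');
// if ($str !== null) { /* parse $str with str_get_html() or DOMDocument */ }
```

Returning null on failure lets the caller skip parsing instead of feeding an empty or false value to str_get_html().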

3 answers

Answer 1: score 1, accepted, 2012-06-15 12:26:25

Answer 2: score 0, 2012-06-15 12:30:03

Answer 3: score 0, 2012-06-15 12:30:28
