HTML DOM Document parsing
I am new to DOMDocument. I have this HTML:
<tr class="calendar_row" data-eventid="39657">
<td class="alt1 eventDate smallfont" align="center">Sun<div class="eventday_multiple">Dec 9</div></td>
<td class="alt1 smallfont" align="center">3:34am</td>
<td class="alt1 smallfont" align="center">USD</td>
</tr>
<tr class="calendar_row" data-eventid="39658">
<td class="alt1 eventDate smallfont" align="center">Sun<div class="eventday_multiple">Dec 10</div></td>
<td class="alt1 smallfont" align="center">5:14am</td>
<td class="alt1 smallfont" align="center">EUR</td>
</tr>
I am trying to first get the contents of the tr elements using this code:
$ret = array();
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
//$doc->saveHTMLFile('textbox.php');
$text = $doc->getElementsByTagName('tr');
foreach ($text as $tag) {
    $ret[] = $doc->saveHTML($tag);
    echo $doc->saveHTML($tag);
}
I don't know why the value being echoed is the whole document and not the contents of the tr elements.
Second, I would also like to get the values between the td tags, such as 5:14am, EUR, etc., but I have no idea how to do that.
Pardon the noob question.
Best regards
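$doc = new DOMDocument();
$doc->loadHTML($html);
$tables = $doc->getElementsByTagName('table');
$table = $tables->item(0); // take the first table in the DOM
// The child nodes of a <table> are rows (or a <tbody>), never <td> elements,
// so search its descendants instead of iterating over childNodes.
foreach ($table->getElementsByTagName('td') as $td) {
    echo $td->nodeValue, "\n";
}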
Passing an element to saveHTML() generates the element's outerHTML, not its innerHTML, so you get its tag, its attributes, and all its content. Of course, you need to be running PHP >= 5.3.6, because the node parameter of saveHTML() was only added in that version.
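If only the inner markup is wanted, one possible approach is to serialize each child node of the element and concatenate the results. The sketch below assumes PHP >= 5.3.6; `innerHTML()` is a hypothetical helper name, not a DOMDocument method, and the sample row is wrapped in a `<table>` for illustration.

```php
<?php
// Hypothetical helper (not part of the DOM extension): returns the markup
// inside $node by serializing each of its child nodes in turn.
function innerHTML(DOMNode $node)
{
    $html = '';
    foreach ($node->childNodes as $child) {
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

// Sample input taken from the question, wrapped in a <table> (assumption).
$html = '<table><tr class="calendar_row" data-eventid="39657">'
      . '<td class="alt1 smallfont" align="center">3:34am</td>'
      . '<td class="alt1 smallfont" align="center">USD</td>'
      . '</tr></table>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);

$tr = $doc->getElementsByTagName('tr')->item(0);
echo $doc->saveHTML($tr), "\n"; // outer markup: includes the <tr ...> tag itself
echo innerHTML($tr), "\n";      // inner markup: only the <td> cells
```

The helper relies on `ownerDocument`, so it is meant for elements inside a document rather than for the DOMDocument object itself.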
The values between the td tags can be obtained with $td->firstChild->nodeValue; or just $td->textContent; where $td is the <td> in question.
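Putting the two parts together, here is a minimal sketch that collects the text of every cell, grouped by row. It assumes the rows from the question sit inside a `<table>` element; the `$rows` and `$cells` names are illustrative. Note that `textContent` includes the text of nested elements, while `firstChild->nodeValue` only returns the first child node's value.

```php
<?php
// Sample rows from the question, wrapped in a <table> (assumption).
$html = <<<HTML
<table>
<tr class="calendar_row" data-eventid="39657">
<td class="alt1 eventDate smallfont" align="center">Sun<div class="eventday_multiple">Dec 9</div></td>
<td class="alt1 smallfont" align="center">3:34am</td>
<td class="alt1 smallfont" align="center">USD</td>
</tr>
<tr class="calendar_row" data-eventid="39658">
<td class="alt1 eventDate smallfont" align="center">Sun<div class="eventday_multiple">Dec 10</div></td>
<td class="alt1 smallfont" align="center">5:14am</td>
<td class="alt1 smallfont" align="center">EUR</td>
</tr>
</table>
HTML;

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);

$rows = array();
foreach ($doc->getElementsByTagName('tr') as $tr) {
    $cells = array();
    foreach ($tr->getElementsByTagName('td') as $td) {
        // textContent concatenates all descendant text, e.g. "Sun" + "Dec 9"
        $cells[] = trim($td->textContent);
    }
    // Key each row by its data-eventid attribute
    $rows[$tr->getAttribute('data-eventid')] = $cells;
}
print_r($rows);

// firstChild is the leading text node of the first cell, so only "Sun" is returned
$firstTd = $doc->getElementsByTagName('td')->item(0);
echo $firstTd->firstChild->nodeValue, "\n";
```

If the date parts should stay separate, reading `firstChild->nodeValue` and the nested `<div>` individually is probably the simpler choice for that one cell.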