
Extract HTML from a URL with DOM

I've already searched for this, but most of the topics use Java, and I need to use DOM in PHP. I want to extract this element from example.com:

<div id="download" class="large-12  medium-12  columns hide-for-small-only">
  <a href="javascript:void(0)" link="https://mediamusic.com/media/mp3/mp3-256/Mas.mp3" target="_blank" class="mp3_download_link">
  <i class="fa fa-cloud-download">Download Now</i>
  </a>
</div>

How can I get the element with the mp3_download_link class from this code using DOM in PHP? As I said, I have already searched for this, but I am really confused.

You can use a library to parse the DOM, for example: https://github.com/tburry/pquery

Usage:

$dom = pQuery::parseStr($html);
$class = $dom->query('#download a')->attr('class');

You can try file_get_html, from the PHP Simple HTML DOM Parser library, to parse the HTML:

$html = file_get_html('http://demo.com');

Then use the following to get all the attributes of the anchor tag:

foreach($html->find('div[id=download] a') as $a){
  var_dump($a->attr);
}

Let's assume you have this DOM as a string. You can then use PHP's built-in DOM extension to get the link you need. Here is an example:

$domstring = '<div id="download" class="large-12  medium-12  columns hide-for-small-only">
  <a href="javascript:void(0)" link="https://mediamusic.com/media/mp3/mp3-256/Mas.mp3" target="_blank" class="mp3_download_link">
    <i class="fa fa-cloud-download">Download Now</i>
  </a>
</div>';
$links = array();
$dom = new DOMDocument('1.0', 'utf-8');
$dom->loadHTML($domstring);//here $domstring is a string containing html you posted in your question
$node_list = $dom->getElementsByTagName('a');

foreach ($node_list as $node) {
  $links[] = $node->getAttribute('link');
}

print_r(array_shift($links));
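
The loop above collects the link attribute of every anchor in the document. If only the anchor with the mp3_download_link class inside the #download div is wanted, DOMXPath from the same built-in DOM extension can narrow the selection. The following is a minimal sketch under that assumption; the function name extractDownloadLink is hypothetical and not part of any library.

```php
<?php
/**
 * Return the custom `link` attribute of the first
 * <a class="mp3_download_link"> inside <div id="download">,
 * or null when no such element exists.
 * (Hypothetical helper, written for illustration only.)
 */
function extractDownloadLink(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    // Real-world pages are rarely valid HTML; collect parser
    // warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Match the class as a whole word, so an element with several
    // classes still matches but "mp3_download_link2" does not.
    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query(
        '//div[@id="download"]//a[contains(concat(" ", normalize-space(@class), " "), " mp3_download_link ")]'
    );
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    $anchor = $nodes->item(0);
    if ($anchor instanceof DOMElement && $anchor->hasAttribute('link')) {
        return $anchor->getAttribute('link');
    }
    return null;
}

$html = '<div id="download" class="large-12  medium-12  columns hide-for-small-only">
  <a href="javascript:void(0)" link="https://mediamusic.com/media/mp3/mp3-256/Mas.mp3" target="_blank" class="mp3_download_link">
    <i class="fa fa-cloud-download">Download Now</i>
  </a>
</div>';

echo extractDownloadLink($html), PHP_EOL;
// prints https://mediamusic.com/media/mp3/mp3-256/Mas.mp3
```

To work from a live page instead of a string, the HTML could be fetched first with file_get_contents($url), assuming allow_url_fopen is enabled on the server, and then passed to the same function.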
