
How to use PHP to retrieve a particular element and attribute from a remote HTML page?

I need to fetch a remote HTML page with PHP and extract a specific element and one of its attributes. How can I do that?

For instance, suppose the element and attribute to be retrieved have this format:

<a href="/dir/someid/" class="ccc">

Any help would be greatly appreciated.

This is the approach I am using so far:


<?php
$file = fopen("http://www.example.com/", "r");
if (!$file) {
    echo "<p>Unable to open remote file.</p>\n";
    exit;
}
while (!feof($file)) {
    $line = fgets($file, 1024);
    /* This only works if the title and its tags are on one line */
    if (preg_match("@<title>(.*)</title>@i", $line, $out)) {
        $title = $out[1];
        break;
    }
}
fclose($file);
?>

Solution:

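        <?php
        // Fetch the remote page; file_get_contents() returns false on failure.
        $homepage = file_get_contents("https://www.somedomain.com");
        if ($homepage === false) {
            exit("Unable to fetch remote page.\n");
        }

        $doc = new DOMDocument;
        $doc->preserveWhiteSpace = false;
        // Collect libxml warnings about malformed HTML instead of silencing them with @.
        libxml_use_internal_errors(true);
        $doc->loadHTML($homepage);
        libxml_clear_errors();

        $xpath = new DOMXPath($doc);
        // Select every <div class="some-class">, then read the first <a> inside each one.
        $results = $xpath->query("//div[@class='some-class']");

        foreach ($results as $contextNode) {
            $text = $xpath->evaluate("string(./a[1])", $contextNode);
            $href = $xpath->evaluate("string(./a[1]/@href)", $contextNode);
            echo $text, " => ", $href, "\n";
        }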
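
The same DOMDocument and DOMXPath approach can be pointed directly at the anchor from the question, <a href="/dir/someid/" class="ccc">. The following is only a minimal sketch: the function name extractLinksByClass is hypothetical, the inline sample markup stands in for the remote page, and it assumes the class name contains no quotes or spaces.

```php
<?php
// Hypothetical helper (not from the original post): returns the href of
// every <a> element whose class attribute contains the given class name.
// Assumes $class contains no quote characters or whitespace.
function extractLinksByClass(string $html, string $class): array
{
    $doc = new DOMDocument();

    // Real-world HTML is often malformed; collect libxml warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Match the class as a whole token, so class="ccc other" also qualifies
    // while class="cccx" does not.
    $query = sprintf(
        "//a[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
        $class
    );

    $hrefs = [];
    foreach ($xpath->query($query) as $anchor) {
        $hrefs[] = $anchor->getAttribute('href');
    }
    return $hrefs;
}

// Inline sample markup standing in for the fetched page.
$html = '<html><body><div>'
      . '<a href="/dir/someid/" class="ccc">Some link</a>'
      . '<a href="/other/" class="ddd">Other link</a>'
      . '</div></body></html>';

$hrefs = extractLinksByClass($html, 'ccc');
print_r($hrefs);
```

To run it against a live page, the $html string would be replaced by the result of file_get_contents() on the target URL, after checking that the call did not return false.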
