
How do I get all HTTP links from a page fetched with cURL?

I wrote a PHP program that uses cURL to fetch the content of any site and display it in the browser. The program also saves the fetched data to a file, and after saving it should find all the HTTP links inside the body tag of the saved file. My code displays the fetched site in the browser, but it does not find all the HTTP links.

Please help me solve this problem.

PHP code:

<!DOCTYPE html>
<html>
    <head>
        <title>Display links using Curl</title>
    </head>
    <body>
        <?php
            $GetData = curl_init();
            $url = "http://www.ucertify.com/";
            curl_setopt($GetData, CURLOPT_URL, $url);
            curl_setopt($GetData, CURLOPT_RETURNTRANSFER, 1);
            $data = curl_exec($GetData);
            curl_close($GetData);
            $file=fopen("content.txt","w");
            fputs($file,$data);
            fclose($file);
            echo $data;
            function links() {
                // Parse the copy saved above instead of fetching the page a second time.
                $file_content = file_get_contents("content.txt");
                $dom_obj = new DOMDocument();
                @$dom_obj->loadHTML($file_content);
                $xpath = new DOMXPath($dom_obj);
                $links_href = $xpath->evaluate("/html/body//a");
                for ($i = 0; $i<$links_href->length; $i++) {
                    $href = $links_href->item($i);
                    $url = $href->getAttribute("href");
                    // Skip in-page anchors and javascript: pseudo-links.
                    if (strpos($url, "#") !== false || stripos($url, "javascript:") === 0) {
                        continue;
                    }
                    echo "<div>" . htmlspecialchars($url) . "</div>";
                }
            }
            // links() prints its own output and returns nothing, so call it without echo.
            links();
        ?>
    </body>
</html>

You can use a regex like this:

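// Load the page that the cURL script saved earlier.
$file_data = file_get_contents("content.txt");
// Capture everything between <body> and </body>; skip the search if no body is found.
if (preg_match("/<body[^>]*>(.*?)<\/body>/is", $file_data, $body_content)) {
    // Match absolute http(s)/ftp URLs and bare www. hosts inside the body.
    preg_match_all("/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i", $body_content[1], $matches);
    foreach ($matches[0] as $d) {
        echo $d . "<br>";
    }
}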
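
As an alternative to the regex, the saved HTML can be parsed with DOMDocument, which is what the question's code already attempts. The following is only a minimal sketch, assuming the page was saved to content.txt as in the question; `extractHttpLinks` is a hypothetical helper name, not something from the original post. Unlike the regex, it only looks at `href` attributes of `<a>` elements inside `<body>`, so URLs that appear in plain text or in relative form are not returned.

```php
<?php
// Hypothetical helper (not from the original post): returns the absolute
// http/https links found in <a href="..."> elements inside <body>.
function extractHttpLinks(string $html): array
{
    $dom = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $links = [];
    foreach ($xpath->query('//body//a[@href]') as $anchor) {
        $href = trim($anchor->getAttribute('href'));
        // Keep only absolute http:// or https:// URLs.
        if (preg_match('#^https?://#i', $href)) {
            $links[] = $href;
        }
    }
    // Drop duplicates and reindex the result.
    return array_values(array_unique($links));
}

// Usage, assuming content.txt was written by the cURL script in the question:
// foreach (extractHttpLinks(file_get_contents('content.txt')) as $link) {
//     echo htmlspecialchars($link), "<br>";
// }
```

Whether to prefer this over the regex depends on what counts as a "link": the DOM approach is stricter about anchors, while the regex also picks up URLs in text and script content.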
