[英]what should i do for get all http links in cURL
I created a program in php using CURL, in which i can take data of any site and can display it in the browser. 我使用CURL在php中创建了一个程序,在其中可以获取任何站点的数据并可以在浏览器中显示它。 Another part of the program is that the data can be saved in the file using file handling and after saving this data, I can find all the http links within the body tag of the saved file.
该程序的另一部分是可以使用文件处理将数据保存在文件中,并且在保存该数据之后,我可以在已保存文件的body标签内找到所有http链接。 My code is showing all the sites in the browser which I took, but I can not find all http links
我的代码显示了我使用的浏览器中的所有站点, 但是找不到所有http链接
Kindly help me out this problem. 请帮我解决这个问题。
PHP Code: PHP代码:
<!DOCTYPE html>
<html>
<head>
<title>Display links using Curl</title>
</head>
<body>
<?php
$GetData = curl_init();
$url = "http://www.ucertify.com/";
curl_setopt($GetData, CURLOPT_URL, $url);
curl_setopt($GetData, CURLOPT_RETURNTRANSFER, 1);
$data = curl_exec($GetData);
curl_close($GetData);
$file=fopen("content.txt","w");
fputs($file,$data);
fclose($file);
echo $data;
function links() {
$file_content = file_get_contents("http://www.ucertify.com/");
$dom_obj = new DOMDocument();
@$dom_obj->loadHTML($file_content);
$xpath = new DOMXPath($dom_obj);
$links_href = $xpath->evaluate("/html/body//a");
for ($i = 0; $i<$links_href->length; $i++) {
$href = $links_href->item($i);
$url = $href->getAttribute("href");
if(strstr($url,"#")||strstr($url,"javascript:void(0)")||$url=="javascript:;"||$url=="javascript:"){}
else {
echo "<div>".$url."<div/>";
}
}
}
echo links();
?>
</body>
</html>
You can use regex like this 您可以像这样使用正则表达式
preg_match("/<body[^>]*>(.*?)<\/body>/is", $file_data, $body_content);
preg_match_all("/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i",$body_content[1],$matches);
foreach($matches[0] as $d) {
echo $d."<br>";
}
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.