
How can I use curl to fetch this URL?

I am trying to get an a tag from a website using curl, but it doesn't seem to work. It works fine with other websites, but not with this one:

sbplay1.com

How can I make it work?

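<?php
//$url = "https://google.com";
$url = "https://sbplay1.com";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_COOKIE, 'viewport=1040; _flashVersion=1');
// A GET request has no body, so no Content-Type header is needed; "Accept: */*" is the valid wildcard.
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Accept: */*'));
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36');
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // skips certificate checks; fine for a quick test only
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$html = curl_exec($ch);
if ($html === false || $html === '') {
    // Report transport errors instead of handing an empty string to DOMDocument.
    die('curl error: ' . curl_error($ch));
}
libxml_use_internal_errors(true); // keep malformed HTML from flooding the output with warnings
$dom = new DOMDocument;
$dom->loadHTML($html);
// item(3) is the fourth <a>; it is null when the fetched HTML has fewer links.
$node = $dom->getElementsByTagName('a')->item(3);
if ($node === null) {
    die('Fewer than four <a> tags in the fetched HTML');
}
$ids = $node->getAttribute("href");
echo $ids;
?>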

This is most likely because the URL you're trying to reach serves a single-page application (SPA). These applications run JavaScript in the browser to render the content you're looking for. curl doesn't see that content because it is not a browser and can't execute JavaScript; it only receives the initial HTML. You can use something like Selenium to load the page and read it after the JavaScript has rendered.
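One way to check whether this is what's happening is to count the tags in the raw HTML that curl returns. Here is a minimal sketch of that check. The count_tags helper and the $spaShell sample string are made up for illustration; they are not taken from the site in the question.

```php
<?php
// Count how many elements with a given tag name appear in an HTML string.
function count_tags(string $html, string $tag): int
{
    if (trim($html) === '') {
        return 0; // loadHTML() rejects an empty string
    }
    $previous = libxml_use_internal_errors(true); // silence warnings from sloppy HTML
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $dom->getElementsByTagName($tag)->length;
}

// Hypothetical SPA shell: the kind of response curl sees before any JavaScript runs.
$spaShell = '<!DOCTYPE html><html><head><title>App</title></head>'
    . '<body><div id="app"></div><script src="/app.js"></script></body></html>';

echo 'a tags: ', count_tags($spaShell, 'a'), PHP_EOL;
echo 'script tags: ', count_tags($spaShell, 'script'), PHP_EOL;
```

If the string from curl_exec() shows no a tags but does contain script tags, the links are probably being created by JavaScript after the page loads.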

A popular crawler that I've used in the past to read SPA pages in PHP is Spatie's crawler:

https://github.com/spatie/crawler

You can tell it to crawl all pages and render them as if in a browser.
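As a rough sketch of the "render first, then parse" idea, the snippet below uses spatie/browsershot rather than the crawler itself. As far as I know, Browsershot is the package the crawler relies on for JavaScript rendering, and its bodyHtml() method returns the DOM after scripts have run. This assumes Browsershot is installed through Composer along with Node and Puppeteer. The extract_hrefs helper and the fallback sample markup are hypothetical and only there so the script runs without those dependencies.

```php
<?php
// Collect the href of every <a> element in an HTML string, in document order.
function extract_hrefs(string $html): array
{
    if (trim($html) === '') {
        return [];
    }
    $previous = libxml_use_internal_errors(true); // silence warnings from sloppy HTML
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ($dom->getElementsByTagName('a') as $a) {
        if ($a->hasAttribute('href')) {
            $hrefs[] = $a->getAttribute('href');
        }
    }
    return $hrefs;
}

// Load Composer's autoloader if this script sits next to a vendor/ directory.
$autoload = __DIR__ . '/vendor/autoload.php';
if (is_file($autoload)) {
    require $autoload;
}

// Assumption: spatie/browsershot (plus Node and Puppeteer) is installed.
// Otherwise fall back to hypothetical rendered markup so the sketch still runs.
if (class_exists(\Spatie\Browsershot\Browsershot::class)) {
    $rendered = \Spatie\Browsershot\Browsershot::url('https://sbplay1.com')->bodyHtml();
} else {
    $rendered = '<html><body><a href="/one">1</a><a href="/two">2</a>'
        . '<a href="/three">3</a><a href="/four">4</a></body></html>';
}

$hrefs = extract_hrefs($rendered);
// Index 3 is the fourth link, matching item(3) in the question; null-safe if there are fewer.
echo $hrefs[3] ?? 'fewer than four links found', PHP_EOL;
```

Collecting all hrefs into an array first makes it easy to see how many links the rendered page really has before picking one by index.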


 