cURL in PHP not retrieving a response

I'm trying to load a page from another website in PHP so that I can scrape its content. This works with pretty much any other web page, but for some reason it doesn't work with this one:

http://www.bkstr.com/webapp/wcs/stores/servlet/CourseMaterialsResultsView?catalogId=10001&categoryId=9604&storeId=10161&langId=-1&programId=562&termId=100022286&divisionDisplayName=Stanford&departmentDisplayName=CS&courseDisplayName=103&sectionDisplayName=01

Does anybody know why? Is it a firewall or something? Or does anyone know of another way to go about this, even in another language?

Here's the cURL code I'm using:

$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $theurl);
$response = curl_exec($ch);
curl_close($ch);

I've tried these cURL options:

curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); 
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
curl_setopt($ch, CURLOPT_PORT, $port); // tried ports 22 and 433

Know of any other ports to try? Or a way to figure out which port the host is using? I'm trying to loop through possible ports right now.
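On the port question: unless a URL names a port explicitly, the port follows from its scheme (80 for http, 443 for https), so it can be read off the URL rather than found by looping. A minimal sketch, assuming only PHP's built-in parse_url(); the helper name effective_port is hypothetical and not part of the original code:

```php
<?php
// Hypothetical helper: work out which port a URL implies.
function effective_port(string $url): ?int
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'])) {
        return null; // malformed or scheme-less URL
    }
    if (isset($parts['port'])) {
        return $parts['port']; // an explicit port in the URL wins
    }
    // Well-known defaults for the two schemes relevant here.
    $defaults = ['http' => 80, 'https' => 443];
    return $defaults[strtolower($parts['scheme'])] ?? null;
}

var_dump(effective_port('http://www.bkstr.com/webapp/wcs/stores/servlet/CourseMaterialsResultsView'));
// int(80)
```

cURL applies the same defaults when CURLOPT_PORT is left unset, so for a plain http:// URL the option can usually be dropped.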

I've tried getting the transfer info, and here's what I got:

$info = curl_getinfo($ch);
print_r($info);

which returns:

Array
(
    [url] => http://www.bkstr.com/webapp/wcs/stores/servlet/CourseMaterialsResultsView?catalogId=10001&categoryId=9604&storeId=10161&langId=-1&programId=562&termId=100022286&divisionDisplayName=Stanford&departmentDisplayName=CS&courseDisplayName=103&sectionDisplayName=01
    [content_type] =>
    [http_code] => 0
    [header_size] => 0
    [request_size] => 289
    [filetime] => -1
    [ssl_verify_result] => 0
    [redirect_count] => 0
    [total_time] => 0.602861
    [namelookup_time] => 0.226121
    [connect_time] => 0.285047
    [pretransfer_time] => 0.285149
    [size_upload] => 0
    [size_download] => 0
    [speed_download] => 0
    [speed_upload] => 0
    [download_content_length] => 0
    [upload_content_length] => 0
    [starttransfer_time] => 0.602824
    [redirect_time] => 0
)
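One way to read this output: http_code is 0 while connect_time is non-zero, which would suggest the connection itself succeeded and the port is not the problem. The sketch below turns that reading into a rough heuristic. The function name diagnose and its messages are hypothetical, and the timing fields are only an approximate signal, not a definitive diagnosis:

```php
<?php
// Hypothetical helper: turn curl_getinfo() output into a rough diagnosis.
function diagnose(array $info, int $errno = 0, string $error = ''): string
{
    if ($info['http_code'] !== 0) {
        return "HTTP {$info['http_code']} received";
    }
    if ($info['connect_time'] > 0) {
        $stage = 'connected, but no HTTP response came back';
    } elseif ($info['namelookup_time'] > 0) {
        $stage = 'DNS resolved, but the TCP connection failed';
    } else {
        $stage = 'failed before or during DNS lookup';
    }
    // Append cURL's own error details when the caller supplies them.
    return $errno !== 0 ? "$stage (cURL error $errno: $error)" : $stage;
}

// Values copied from the print_r() output above.
echo diagnose(['http_code' => 0, 'namelookup_time' => 0.226121, 'connect_time' => 0.285047]), "\n";
// prints: connected, but no HTTP response came back
```

In a live script, curl_errno($ch) and curl_error($ch), called right after curl_exec($ch) and before curl_close($ch), would supply the last two arguments and usually say more than the timings alone.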

Thanks a bunch!

I realize now that the web admins must not have enabled CORS. To scrape the page I wrote a Java bot that loaded the page in my browser and saved it to a file. Messy, but it ultimately worked...
