Curl with PHP to scrape a website not working

I am trying to call Citi Bank's open API ( https://developer.citi.com/ ), and this requires me to scrape the login screen so that the user can log in with their username and password.

This works if I simply paste the following URL, with its parameters, into the browser:

https://sandbox.apihub.citi.com/gcb/api/authCode/oauth2/authorize?response_type=code&client_id=<my_client_id>&scope=pay_with_points&countryCode=SG&businessCode=GCB&locale=en_SG&state=12093&redirect_uri=<my_callback>

However, when I make the same call from my PHP code with curl, it returns a 503 status code.

<?php

// Request headers copied from what the browser sends.
$header = array();
$header[] = 'Upgrade-Insecure-Requests: 1';
$header[] = 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8';
$header[] = 'Accept-Encoding: gzip, deflate, br';
$header[] = 'Accept-Language: en-US,en;q=0.8,ja;q=0.6,zh-CN;q=0.4,zh;q=0.2,zh-TW;q=0.2,th;q=0.2';

$ch = curl_init();
// Same authorize URL as the one entered in the browser.
curl_setopt($ch, CURLOPT_URL, 'https://sandbox.apihub.citi.com/gcb/api/authCode/oauth2/authorize?response_type=code&client_id=<my_client_id>&scope=pay_with_points&countryCode=SG&businessCode=GCB&locale=en_SG&state=12093&redirect_uri=<my_callback_url>');
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36');
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
curl_setopt($ch, CURLOPT_ENCODING, '');         // accept every encoding cURL supports
curl_setopt($ch, CURLOPT_TIMEOUT, 20);
$result = curl_exec($ch);
curl_close($ch);
echo $result;
?>

I have tried setting the request headers to match what my browser sends when I enter the URL directly.

I must be missing something in my curl configuration.

Does anyone have an idea? Thank you!

The issue might be due to HTTPS. There are a few options available.

1. Download the CA bundle from https://curl.haxx.se/ca/cacert.pem, save it locally, and then point cURL at it with this option:

curl_setopt($ch, CURLOPT_CAINFO, "/path/to/cacert.pem");

2. Export the site's certificate from your browser and use it the same way as above. This can break if they change their certificate, so confirm with them first.

3. Disable certificate verification. This is not recommended, but it can be used temporarily while debugging to find out whether verification is the actual problem. It leaves the connection open to man-in-the-middle (MITM) attacks.

//Only use for debugging purposes.
curl_setopt ($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt ($ch, CURLOPT_SSL_VERIFYPEER, 0); 
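
Before turning verification off, it may help to check what cURL itself reports. If the certificate were being rejected, curl_exec() would normally return false with a cURL error (for example error 60) rather than an HTTP status code, so looking at both values is a quick way to test the HTTPS hypothesis. The following is only a sketch: describe_curl_outcome() and fetch_with_diagnostics() are hypothetical helper names, not part of the original code.

```php
<?php
// Hypothetical helper: summarize a finished cURL transfer so that transport
// failures (DNS, TLS, timeout) can be told apart from HTTP errors such as 503.
function describe_curl_outcome(int $errno, string $error, int $httpCode): string
{
    if ($errno !== 0) {
        // cURL reported a transport-level failure, e.g. 60 for a certificate problem.
        return "transport error $errno: $error";
    }
    if ($httpCode >= 400) {
        // The connection worked; the server itself answered with an error status.
        return "server answered with HTTP $httpCode";
    }
    return "ok (HTTP $httpCode)";
}

// Hypothetical wrapper: fetch a URL with verification left on and return
// the body together with a diagnostic summary.
function fetch_with_diagnostics(string $url, ?string $caBundle = null): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 20);
    if ($caBundle !== null) {
        // Optional CA bundle, e.g. the cacert.pem file from option 1.
        curl_setopt($ch, CURLOPT_CAINFO, $caBundle);
    }

    $body     = curl_exec($ch);
    $errno    = curl_errno($ch);
    $error    = curl_error($ch);
    $httpCode = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    return [
        'body'    => $body === false ? null : $body,
        'summary' => describe_curl_outcome($errno, $error, $httpCode),
    ];
}

// Usage (performs a real request, so it is left commented out):
// $r = fetch_with_diagnostics('https://sandbox.apihub.citi.com/...', '/path/to/cacert.pem');
// echo $r['summary'];
```

A summary of "server answered with HTTP 503" would suggest that the TLS handshake succeeded and the problem lies elsewhere in the request.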

You can use header('location: '.$url);
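
As a rough sketch of that suggestion, the script can send the user's own browser to the authorize URL instead of fetching the page with cURL. The query parameters are the ones from the question. build_authorize_url() and redirect_to() are hypothetical helper names, and the client ID and callback in the usage comment are placeholders.

```php
<?php
// Hypothetical helper: assemble the authorize URL from the question's parameters.
function build_authorize_url(string $clientId, string $redirectUri, string $state): string
{
    $query = http_build_query([
        'response_type' => 'code',
        'client_id'     => $clientId,
        'scope'         => 'pay_with_points',
        'countryCode'   => 'SG',
        'businessCode'  => 'GCB',
        'locale'        => 'en_SG',
        'state'         => $state,
        'redirect_uri'  => $redirectUri,
    ], '', '&'); // values are URL-encoded; '&' is set explicitly as the separator

    return 'https://sandbox.apihub.citi.com/gcb/api/authCode/oauth2/authorize?' . $query;
}

// Hypothetical helper: redirect the browser. header() must be called before
// any output has been sent, and the script should stop afterwards.
function redirect_to(string $url): void
{
    header('Location: ' . $url);
    exit;
}

// Usage (placeholder values, not real credentials):
// redirect_to(build_authorize_url('my_client_id', 'https://example.com/callback', '12093'));
```

With this approach the login page is rendered in the user's browser, and the authorization code would be expected to arrive at the callback URL after the user logs in.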
