
How to properly retrieve POST data through cURL and PHP?

I am trying to extract the data from this web page:

https://portal.icuregswe.org/siri/report/corona.vtfstart

I have identified that this data can be retrieved from https://portal.icuregswe.org/siri/api/reports/GenerateHighChart (POST). My attempted solution in PHP using cURL is as follows:

$url = 'https://portal.icuregswe.org/siri/api/reports/GenerateHighChart';
 
$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, $url); 
curl_setopt($ch, CURLOPT_POST, 1);

curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    'Host: portal.icuregswe.org',
    'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0',
    'Accept: application/json, text/plain, */*',
    'Accept-Language: sv-SE,sv;q=0.8,en-US;q=0.5,en;q=0.3',
    'Accept-Encoding: gzip, deflate, br',
    'Content-Type: application/x-www-form-urlencoded',
    'Content-Length: 179',
    'Origin: https://portal.icuregswe.org',
    'Connection: keep-alive',
    'Referer: https://portal.icuregswe.org/siri/report/corona.vtfstart',
    'Cookie: _ga=GA1.2.1861236498.1591648329; _gid=GA1.2.1182050970.1608386303',
    'Pragma: no-cache',
    'Cache-Control: no-cache'));

$output = curl_exec ($ch);

print $output;

curl_close ($ch);

I expect to retrieve the raw data, but instead I get HTTP Error 400. The headers were copied from the request Firefox makes when loading the page above. Any help in solving this problem would be appreciated.

There's no need to fake the User-Agent; the site doesn't run a user-agent blacklist. What it does require is a set of POST parameters, which your request never sends. In fact, you don't need any of those headers. The one exception is Content-Type, but curl adds it automatically when CURLOPT_POSTFIELDS is set to a string, so leave it out rather than risk a typo in a hand-written header. This:

<?php

declare(strict_types=1);
$ch = curl_init();
curl_setopt_array($ch, array(
    CURLOPT_URL => 'https://portal.icuregswe.org/siri/api/reports/GenerateHighChart',
    CURLOPT_POST => 1,
    CURLOPT_POSTFIELDS => http_build_query(array(
        'highChartUrl' => '/api/reports/GenerateHighChart',
        'tableUrl' => '/api/reports/GenerateExcel',
        'chartWidth' => '900',
        'reportName' => 'corona.vtfstart',
        'startdat' => '2020-01-01',
        'stopdat' => '2020-12-19',
        'sasong' =>
        array(
            0 => '2020',
        ),
    ))
));
curl_exec($ch);

outputs:

{"ReportName":"Antal nyinskrivna vårdtillfällen med Coronavirus \\n Period ###","ChartTitle":"Antal nyinskrivna vårdtillfällen med Coronavirus ","ChartSubTitle":" Period 2020-01-01 - 2020-12-19","ChartColors":["#53a8c3","#ee5773","#f0d86e","#f6854f","#15ceac","#5773ee","#f597a9","#df7f23","#10a085","#986ef0"],"YaxisTitle":"Antal vtf","YaxisTitleColor":"","HasY2Axis":false,"Y2axisTitle":null,"Y2axisTitleColor":null,"XaxisTitle":"Datum","XaxisColors":["","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","",""],"Credits":"Detta är en originalrapport från Svenska Intensivvårdsregistret","SqlString":"<b>-- SQL-sats 1</b><br />SELECT<br />&nbsp; &nbsp; &nbsp; &nbsp;DATEADD(DAY, 0, DATEDIFF(DAY, 0, Xa.VtfStart)) AS vtfstartdag@0,<br />&nbsp; &nbsp; &nbsp; &nbsp;SUM(CASE WHEN Xa.InfEnkatSiriLabfyndId=3 THEN 1 ELSE 0 END) AS antalcorona@1,<br />&nbsp; &nbsp; &nbsp; &nbsp;COUNT(DISTINCT Xa.PersonId) AS antalpersoner@2<br />FROM<br />&nbsp; &nbsp; &nbsp; &nbsp;vueSiri AS Xa<br />WHERE<br />&nbsp; &nbsp; &nbsp; &nbsp;(Xa.InfEnkatSiriLabfyndId=3) AND ((Xa.VtfStart>='2020-01-01 00:00:00') AND (Xa.VtfStart<='2020-12-19 23:59:00'))<br />GROUP BY<br />&nbsp; &nbsp; &nbsp; &nbsp;DATEADD(DAY, 0, DATEDIFF(DAY, 0, Xa.VtfStart))<br />ORDER 
BY<br />&nbsp; &nbsp; &nbsp; &nbsp;vtfstartdag@0<br /><br />","ChartSeries":[{"Name":"Antal vtf","Stack":null,"Type":"column","Color":"","Data":[{"Value":1.0,"Name":"2020-03-06 00:00:00","Color":null},{"Value":1.0,"Name":"2020-03-07 00:00:00","Color":null},{"Value":1.0,"Name":"2020-03-08 00:00:00","Color":null},{"Value":1.0,"Name":"2020-03-09 00:00:00","Color":null},{"Value":3.0,"Name":"2020-03-10 00:00:00","Color":null},{"Value":1.0,"Name":"2020-03-11 00:00:00","Color":null},{"Value":3.0,"Name":"2020-03-13 00:00:00","Color":null},{"Value":7.0,"Name":"2020-03-14 00:00:00","Color":null},{"Value":6.0,"Name":"2020-03-15 00:00:00","Color":null},{"Value":6.0,"Name":"2020-03-16 00:00:00","Color":null},{"Value":3.0,"Name":"2020-03-17 00:00:00","Color":null},{"Value":15.0,"Name":"2020-03-18 00:00:00","Color":null},{"Value":13.0,"Name":"2020-03-19 00:00:00","Color":null},{"Value":20.0,"Name":"2020-03-20 00:00:00","Color":null},{"Value":19.0,"Name":"2020-03-21 00:00:00","Color":null},{"Value":33.0,"Name":"2020-03-22 00:00:00","Color":null},{"Value":45.0,"Name":"2020-03-23 00:00:00","Color":null},{"Value":44.0,"Name":"2020-03-24 00:00:00","Color":null},{"Value":43.0,"Name":"2020-03-25 00:00:00","Color":null},{"Value":45.0,"Name":"2020-03-26 00:00:00","Color":null},{"Value":40.0,"Name":"2020-03-27 00:00:00","Color":null},{"Value":34.0,"Name":"2020-03-28 00:00:00","Color":null},{"Value":51.0,"Name":"2020-03-29 00:00:00","Color":null},{"Value":40.0,"Name":"2020-03-30 00:00:00","Color":null},{"Value":44.0,"Name":"2020-03-31 00:00:00","Color":null},{"Value":60.0,"Name":"2020-04-01 00:00:00","Color":null},{"Value":55.0,"Name":"2020-04-02 00:00:00","Color":null},{"Value":54.0,"Name":"2020-04-03 00:00:00","Color":null},{"Value":49.0,"Name":"2020-04-04 00:00:00","Color":null},{"Value":56.0,"Name":"2020-04-05 00:00:00","Color":null},{"Value":59.0,"Name":"2020-04-06 00:00:00","Color":null},{"Value":58.0,"Name":"2020-04-07 00:00:00","Color":null},{"Value":62.0,"Name":"2020-04-08 
00:00:00","Color":null},{"Value":52.0,"Name":"2020-04-09 00:00:00","Color":null},{"Value":45.0,"Name":"2020-04-10 00:00:00","Color":null},{"Value":61.0,"Name":"2020-04-11 00:00:00","Color":null},{"Value":45.0,"Name":"2020-04-12 00:00:00","Color":null},{"Value":59.0,"Name":"2020-04-13 00:00:00","Color":null},{"Value":56.0,"Name":"2020-04-14 00:00:00","Color":null},{"Value":45.0,"Name":"2020-04-15 00:00:00","Color":null},{"Value":51.0,"Name":"2020-04-16 00:00:00","Color":null},{"Value":54.0,"Name":"2020-04-17 00:00:00","Color":null},{"Value":43.0,"Name":"2020-04-18 00:00:00","Color":null},{"Value":42.0,"Name":"2020-04-19 00:00:00","Color":null},{"Value":37.0,"Name":"2020-04-20 00:00:00","Color":null},{"Value":52.0,"Name":"2020-04-21 00:00:00","Color":null},{"Value":64.0,"Name":"2020-04-22 00:00:00","Color":null},{"Value":44.0,"Name":"2020-04-23 00:00:00","Color":null},{"Value":66.0,"Name":"2020-04-24 00:00:00","Color":null},{"Value":37.0,"Name":"2020-04-25 00:00:00","Color":null},{"Value":33.0,"Name":"2020-04-26 00:00:00","Color":null},{"Value":38.0,"Name":"2020-04-27 00:00:00","Color":null},{"Value":40.0,"Name":"2020-04-28 00:00:00","Color":null},{"Value":35.0,"Name":"2020-04-29 00:00:00","Color":null},{"Value":53.0,"Name":"2020-04-30 00:00:00","Color":null},{"Value":30.0,"Name":"2020-05-01 00:00:00","Color":null},{"Value":43.0,"Name":"2020-05-02 00:00:00","Color":null},{"Value":36.0,"Name":"2020-05-03 00:00:00","Color":null},{"Value":33.0,"Name":"2020-05-04 00:00:00","Color":null},}

(...capped, but you get the idea)
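To work with the numbers rather than just print them, the response can be captured (set CURLOPT_RETURNTRANSFER to true so curl_exec() returns the body as a string) and decoded with json_decode(). The following is a minimal sketch, not part of the original answer: extractSeries() is a hypothetical helper name, and the only things taken from the output above are the field names ChartSeries, Data, Name and Value. It assumes the full, untruncated response is valid JSON.

```php
<?php
declare(strict_types=1);

/**
 * Turn a GenerateHighChart JSON response into a [date => value] map.
 * Hypothetical helper; only the field names come from the output shown above.
 */
function extractSeries(string $json, int $seriesIndex = 0): array
{
    // JSON_THROW_ON_ERROR (PHP 7.3+) raises JsonException on malformed input.
    $decoded = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

    // Fall back to an empty list if the series is missing.
    $points = $decoded['ChartSeries'][$seriesIndex]['Data'] ?? [];

    $result = [];
    foreach ($points as $point) {
        // "Name" looks like "2020-03-06 00:00:00"; keep only the date part.
        $date = substr((string) $point['Name'], 0, 10);
        $result[$date] = (float) $point['Value'];
    }
    return $result;
}

// Tiny hand-made sample with the same shape as the response above.
$sample = '{"ChartSeries":[{"Name":"Antal vtf","Data":['
    . '{"Value":1.0,"Name":"2020-03-06 00:00:00","Color":null},'
    . '{"Value":3.0,"Name":"2020-03-10 00:00:00","Color":null}]}]}';

$series = extractSeries($sample);
print_r($series);
```

With a live request, you would pass the string returned by curl_exec() instead of $sample.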

You can copy and paste the code below to see the result. It also saves cookies in case they are needed later.

$end_date = date('Y-m-d'); // Today; you can also write the date directly in the array below.
$start_date = date('Y-m-d', strtotime('-91 day'));
$data = array(
    'highChartUrl' => '/api/reports/GenerateHighChart',
    'tableUrl' => '/api/reports/GenerateExcel',
    'chartWidth' => '900',
    'reportName' => 'corona.vtfstart',
    'startdat' => $start_date,
    'stopdat' => $end_date,
    'sasong[0]' => '2020'
);
$url = 'https://portal.icuregswe.org/siri/api/reports/GenerateHighChart';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, true); // POST request
curl_setopt($ch, CURLOPT_POSTFIELDS, $data); // An array is sent as multipart/form-data
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // Return the response instead of printing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // Allow redirects
curl_setopt($ch, CURLOPT_COOKIEJAR, __DIR__ . '/cookie.txt'); // Cookies are written here on close
curl_setopt($ch, CURLOPT_COOKIEFILE, __DIR__ . '/cookie.txt'); // Cookies are read from here
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 90);
curl_setopt($ch, CURLOPT_TIMEOUT, 90);
$result = curl_exec($ch);
if ($result === false) {
    echo 'cURL error: ' . curl_error($ch); // Report transport-level failures
}
curl_close($ch);
echo $result;
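The two answers hand the fields to CURLOPT_POSTFIELDS in different ways: a string from http_build_query() is sent as application/x-www-form-urlencoded, while a plain array is sent as multipart/form-data. The sketch below shows how the string form encodes the nested sasong array, so the flat 'sasong[0]' key and the nested array describe the same field. buildReportBody() is a hypothetical helper name introduced for illustration; the field names and values are copied from the answers above.

```php
<?php
declare(strict_types=1);

/**
 * Build the URL-encoded form body for the report request.
 * Hypothetical helper; field names are copied from the answers above.
 */
function buildReportBody(string $startDate, string $stopDate, array $seasons): string
{
    return http_build_query([
        'highChartUrl' => '/api/reports/GenerateHighChart',
        'tableUrl'     => '/api/reports/GenerateExcel',
        'chartWidth'   => '900',
        'reportName'   => 'corona.vtfstart',
        'startdat'     => $startDate,
        'stopdat'      => $stopDate,
        // Nested arrays become sasong[0]=..., sasong[1]=... (brackets percent-encoded).
        'sasong'       => array_values($seasons),
    ], '', '&');
}

$body = buildReportBody('2020-09-19', '2020-12-19', ['2020']);
echo $body, PHP_EOL;

// Decoding the body again shows that the nested key survives the round trip.
parse_str($body, $decoded);
var_dump($decoded['sasong']);
```

The resulting string can be passed directly as the CURLOPT_POSTFIELDS value, in which case curl sets the Content-Type header itself.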

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, contact: yoyou2525@163.com.

 