
foreach loop running through curl function

I've been scratching my head for days over this stupid one.

I have an array of URLs called $url_array pulled from the database, where each row looks like so -

Array (
    [id] => 2
    [url] => http://example.com
)

I have a foreach loop which runs over $url_array and scrapes each URL for data like so -

foreach ($url_array as $row) {
    $data = $this->scrapePage($row["url"]);
    print_r($data);
    return false;
}

Currently $data is outputting nothing. But if I replace $row["url"] with http://example.com, the scrape happens correctly.

This is also the first time I've hosted this script on DigitalOcean, so I'm not sure if there are any server technicalities that could stop a foreach loop from working.

edit: Here is the scrapePage function -

private function scrapePage($url) {
    $ch = curl_init($url);

    curl_setopt($ch, CURLOPT_COOKIESESSION, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array('Accept-Charset: utf-8'));
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_VERBOSE, true);

    $content = curl_exec($ch);
    $header = curl_getinfo($ch);
    curl_close($ch);

    return array("header" => $header, "content" => $content);
}

Like I said, if I manually enter a url in there, it works fine, just not when in a loop.

As for the $url_array, this is the output when I print it out -

Array
(
    [0] => Array
        (
            [id] => 41
            [url] => http://www.example1.com
        )

    [1] => Array
        (
            [id] => 85
            [url] => http://test-url-2.com
        )
)

I've also tried a for loop over the data. If I modify the scrapePage function to return the $url, it returns the $url correctly.

After much headache, I've found the issue. The database of urls I had looked like this -

http://www.example1.com\r
http://www.example2.com\r
http://www.example3.com\r
http://www.example4.com\r

Note the "\r" at the end of each URL; that was what was messing up cURL. I had assumed the database I was given was clean. Apparently not! I removed all the trailing \r's and the code now works as expected.
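Rather than cleaning the database by hand, the URLs can also be sanitized in code before they reach cURL. The following is only a minimal sketch, assuming the stray characters are leading or trailing whitespace; the helper name cleanUrl is hypothetical and not part of the original script, and the sample rows just mimic the data shown above.

```php
<?php
// Hypothetical helper: strip leading/trailing whitespace.
// trim() removes spaces, tabs, "\n", "\r", "\0" and vertical tabs by default.
function cleanUrl(string $url): string
{
    return trim($url);
}

// Sample rows imitating the dirty database values (note the trailing "\r").
$rows = [
    ['id' => 41, 'url' => "http://www.example1.com\r"],
    ['id' => 85, 'url' => "http://test-url-2.com\r"],
];

foreach ($rows as $row) {
    // var_dump() prints the string length, which exposes invisible characters
    // that print_r() would hide.
    var_dump($row['url']);

    $url = cleanUrl($row['url']);
    var_dump($url);

    // $data = $this->scrapePage($url);  // the scrape would go here
}
```

When a scrape silently returns nothing, it can also help to check whether curl_exec() returned false and, if so, print curl_error($ch) before calling curl_close($ch).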

Your $url_array is nested, so you should try the following to get the URLs and pass them to your scrapePage function:

foreach ($url_array as $row) {
    // Each $row is itself an array with the keys 'id' and 'url'.
    foreach ($row as $key => $value) {
        if ($key === 'url') {
            //$urls[] = $value;
            $data = $this->scrapePage($value);
            print_r($data);
        }
    }
}
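If only the URLs are needed, the inner loop can probably be avoided altogether. A short sketch, assuming $url_array has the shape printed in the question (the sample data below is copied from that output):

```php
<?php
// Sample data in the same shape as the question's print_r() output.
$url_array = [
    ['id' => 41, 'url' => 'http://www.example1.com'],
    ['id' => 85, 'url' => 'http://test-url-2.com'],
];

// array_column() (PHP 5.5+) collects the 'url' value from every row.
$urls = array_column($url_array, 'url');

foreach ($urls as $url) {
    echo $url, PHP_EOL;
    // $data = $this->scrapePage($url);  // the scrape would go here
}
```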


 