
PHP: How to scrape the content of a website that relies on JavaScript

I'm trying to get the content of this website using the PHP simplehtmldom library:

http://www.immigration.govt.nz/migrant/stream/work/workingholiday/czechwhs.htm

It is not working, so I tried using cURL:

// Fetch a URL with cURL and return the response body, or FALSE on failure.
function curl_get_file_contents($URL)
{
    $c = curl_init();
    curl_setopt($c, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($c, CURLOPT_URL, $URL);
    $contents = curl_exec($c);
    curl_close($c);

    // curl_exec() returns FALSE on error; an empty body is also treated as failure here.
    if ($contents) return $contents;
    else return FALSE;
}

But I always get a response containing only some JS code and this content:

<noscript>Please enable JavaScript to view the page content.</noscript>

Is there any way to solve this using PHP? I must use PHP in this case, so I need to simulate a JS-capable browser.

Many thanks for any advice.

> I must use PHP in this case, so I need to simulate a JS-capable browser.

I'd recommend two approaches:

  1. Use the v8js PHP extension to execute the site's JS while scraping. See a usage example here.
  2. Simulate a JS-capable browser using Selenium, iMacros, or the webRobots.io Chrome extension. In that case, however, you leave PHP scripting behind.
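Approach 1 can be sketched roughly as follows. This is a minimal, hypothetical example (not the linked one): it assumes the PECL v8js extension is installed, and `run_page_js` is a name invented here, not part of any library.

```php
<?php
// Hedged sketch: requires the PECL "v8js" extension, which embeds
// Google's V8 engine in PHP. Returns NULL when v8js is unavailable
// or the script throws, so the caller can fall back gracefully.
function run_page_js(string $js): ?string
{
    if (!class_exists('V8Js')) {
        return null; // v8js extension not installed
    }
    $v8 = new V8Js();
    try {
        // executeString() returns the value of the last JS expression.
        return (string) $v8->executeString($js);
    } catch (V8JsException $e) {
        return null;
    }
}

// E.g. evaluate the kind of string-building JS such pages use:
$result = run_page_js("var page = 'czechwhs'; page + '.htm';");
var_dump($result);
```

Note that v8js only runs JavaScript; it provides no DOM, so scripts that touch `document` or `window` would need those objects stubbed out. For pages whose content is built by full DOM manipulation, the browser-driven route in approach 2 is usually the practical one.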
