
PHP: Scrape data generated with JavaScript (ES6)

I am trying to scrape data from a URL with PhantomJS and php-phantomjs, but the target page generates some of its data with ES6, which PhantomJS does not support yet. I get errors like this in the console log:

ReferenceError: Can't find variable: Set

My code is:

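use JonnyW\PhantomJs\Client;

$client = Client::getInstance();

// Point the client at the local PhantomJS binary.
$client->getEngine()->setPath('C:\\Users\\XXX\\Desktop\\bin\\phantomjs.exe');

// Build a GET request for the target page and an empty response to fill.
$request = $client->getMessageFactory()->createRequest('example.com', 'GET');
$response = $client->getMessageFactory()->createResponse();

$client->send($request, $response);

// Dump the page's console output; this is where the ReferenceError shows up.
var_dump($response->getConsole());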

I searched a lot and found that PhantomJS is expected to support ES6 in a new version (v2.5), for which a beta has been released, but it doesn't work for me.

What can I do now? Is there any way to scrape this page?

While the future of PhantomJS is not yet certain, may I suggest another headless browser solution: Puppeteer. It is a Node.js library that drives headless Google Chrome, so ES6 features such as Set are available to the page, and it is backed by a dedicated team of Google engineers.

There are already projects for controlling it from PHP; the most notable at the moment is PuPHPeteer.*

__
* Notable in that it can not only take screenshots and generate PDFs, but also evaluate JavaScript in the page.
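
To make the suggestion concrete, here is a minimal sketch of what the scraping code might look like with PuPHPeteer. It assumes the package is installed as its README describes (composer require nesk/puphpeteer, plus its companion Node package), that Node.js and a Chrome/Chromium build are available, and that the Composer autoloader sits in ./vendor. The URL and the evaluated snippet are placeholders, not taken from the question; the snippet deliberately uses Set to show that ES6 runs in the page.

```php
<?php
// Sketch only: assumes nesk/puphpeteer and its Node companion are installed.
require __DIR__ . '/vendor/autoload.php';

use Nesk\Puphpeteer\Puppeteer;
use Nesk\Rialto\Data\JsFunction;

$puppeteer = new Puppeteer;
$browser = $puppeteer->launch();

try {
    $page = $browser->newPage();

    // Placeholder URL. 'networkidle0' is a Puppeteer option that waits until
    // the page has no open network connections, giving scripts time to run.
    $page->goto('https://example.com', ['waitUntil' => 'networkidle0']);

    // Run ES6 inside the page: collect the distinct tag names with a Set.
    $tagNames = $page->evaluate(JsFunction::createWithBody("
        return [...new Set(Array.from(document.querySelectorAll('*'), el => el.tagName))];
    "));
    var_dump($tagNames);

    // The fully rendered HTML, after the page's own JavaScript has executed.
    $html = $page->content();
    echo strlen($html), " bytes of rendered HTML\n";
} finally {
    // Always shut the headless browser down, even if a step above throws.
    $browser->close();
}
```

From here the rendered HTML in $html can be parsed with whatever PHP DOM tooling you already use, or the extraction can be done entirely inside the evaluate() call.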
