
PHP - Scraping JavaScript websites

I am trying to check the content of websites for blacklisted keywords using PHP cURL. However, cURL does not return content generated by JavaScript. I have to scan thousands of websites, so efficiency is the main concern, and I need the JS-generated content. So far I have come across Phantomjs-php. Are there any other JS scripts that use fewer resources and work with PHP? I just need the HTML content. Any insight on this is much appreciated, as I am new to retrieving JS-generated content.

Thank you, Lynn

I'm pretty sure that Codeception would do the trick for you.

You can configure it to work with a headless browser, just like PhantomJS and Puppeteer, and see your JS-generated content. A sample acceptance test, which is what you want to do, would look like this:

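// Open the login page in the browser.
$I->amOnPage('/login');
// Fill in the form fields and submit.
$I->fillField('username', 'davert');
$I->fillField('password', 'qwerty');
$I->click('LOGIN');
// Assert that the expected text is visible on the resulting page.
$I->see('Welcome, Davert!');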

taken from: https://codeception.com/docs/03-AcceptanceTests
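
For the keyword check described in the question, here is a minimal sketch of the scanning step in plain PHP. It assumes the rendered HTML has already been fetched by some headless browser. The function name findBlacklistedKeywords and the sample keywords are hypothetical and do not come from Codeception or any other library.

```php
<?php
declare(strict_types=1);

/**
 * Return the blacklisted keywords that occur in the text of an HTML document.
 * Hypothetical helper for illustration; not part of any library.
 */
function findBlacklistedKeywords(string $html, array $blacklist): array
{
    // Drop <script> and <style> bodies so that only page text is scanned.
    $text = preg_replace('~<(script|style)\b[^>]*>.*?</\1>~is', ' ', $html) ?? $html;
    // Replace the remaining tags with spaces so adjacent words do not merge.
    $text = preg_replace('~<[^>]+>~', ' ', $text) ?? $text;
    // Decode entities such as &amp; back into plain characters.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    $found = [];
    foreach ($blacklist as $keyword) {
        // stripos() is a case-insensitive substring match (ASCII case folding only).
        if ($keyword !== '' && stripos($text, $keyword) !== false) {
            $found[] = $keyword;
        }
    }
    return $found;
}

// Example: 'poker' appears only inside a <script> body, so it is not reported.
$html = '<p>Play <b>Casino</b> games</p><script>var x = "poker";</script>';
print_r(findBlacklistedKeywords($html, ['casino', 'poker', 'loan']));
```

In a Codeception acceptance test that uses the WebDriver module, the rendered HTML could probably be obtained with $I->grabPageSource() after $I->amOnPage(...) and passed to a function like this one. Whether that is cheap enough for thousands of sites would need to be measured.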

