
Heisenbug with own headless browser

Original 2012-11-29 10:05:34 7 1 javascript/ c++/ webkit/ qt4/ headless-browser

I'm working on a headless browser based on WebKit (using C++/Qt4) with JavaScript support. Its main purpose is to generate an HTML snapshot of websites that rely heavily on JavaScript (see Backbone.js or any other JavaScript MVC framework).

I'm aware that there is no way to know when a page has completely loaded (please see this question). Because of that, after I get the loadFinished signal (docs here), I create a timer and start polling the DOM (checking its content every X ms) to see whether anything has changed. If nothing has, I assume the page has loaded and print the result. Please keep in mind that I already know this is far from a perfect solution, but it's the only one I could think of. If you have a better idea, please answer this question.
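The polling idea described above can be sketched independently of Qt. This is only an illustration, not the code from the linked source: the class name `DomStabilityDetector` and the `requiredStablePolls` parameter are hypothetical, and requiring more than one unchanged poll is an assumption added here to make the check a little less eager than a single comparison.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not from the original source): reports when
// successive DOM snapshots have stopped changing.
class DomStabilityDetector {
public:
    // requiredStablePolls: number of consecutive unchanged polls that
    // count as "the page has loaded". Assumed parameter.
    explicit DomStabilityDetector(int requiredStablePolls)
        : required_(requiredStablePolls), stable_(0), hasPrevious_(false) {
        assert(requiredStablePolls > 0);
    }

    // Call once per timer tick with the current serialized DOM.
    // Returns true once the snapshot has been identical to the previous
    // one for the required number of consecutive polls.
    bool poll(const std::string& snapshot) {
        if (hasPrevious_ && snapshot == previous_) {
            ++stable_;
        } else {
            // First poll, or the DOM changed: restart the count.
            stable_ = 0;
            previous_ = snapshot;
            hasPrevious_ = true;
        }
        return stable_ >= required_;
    }

private:
    int required_;
    int stable_;
    bool hasPrevious_;
    std::string previous_;
};
```

In a Qt4 build, `poll()` would presumably be called from the timer's slot with something like the main frame's `toHtml()` output converted to `std::string`; that wiring is left out so the sketch compiles without Qt.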

NOTE: The timer is non-blocking, meaning that everything running inside WebKit shouldn't be affected/blocked/paused in any way.

After testing the headless browser with some pages, everything seems to work fine (or at least as expected). But here is where the heisenbug appears. The headless browser should be called from a PHP script, which should wait (blocking call) for some output and then print it.

On my test machine (Apache 2.3.14, PHP 5.4.6), running the PHP script outputs the desired result: the headless browser fetches the website, runs the JavaScript, and prints what a user would see. Running the same script on the production server, however, fetches the website, runs only some of the JavaScript code, and prints the result.

The source code of the headless browser and the PHP script I'm using can be found here.

NOTE: The timer (as you can see in the source code of the headless browser) is set to 1s, but setting a longer interval doesn't fix the problem.

NOTE 2: Catching all JavaScript errors doesn't show anything, so it's not because of a missing function, wrong args, or any other type of incorrect code.

I'm testing the headless browser with 2 websites. This one works on both my test machine and the production server, while this one works only on my test machine.

I'm more inclined to think that this is some weird bug in the JavaScript code of the second website rather than in the code of the headless browser, as it generates a perfect HTML snapshot of the first website. But then again, this is a heisenbug, so I'm not really sure what is causing all this.

Any ideas/comments will be appreciated. Thank you

1 answer

Rather than polling for DOM changes, why not watch network requests? This seems like a safer heuristic. If there has been no network activity for X ms (and there are no pending requests), assume the page is fully "loaded".
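A minimal sketch of that network-idle heuristic, kept free of Qt so it stands alone. The class `NetworkIdleTracker`, its method names, and the millisecond timestamps passed in by the caller are all assumptions made for illustration; the answer itself only describes the idea.

```cpp
#include <cassert>

// Hypothetical helper (not part of any library): tracks in-flight requests
// and the time of the last network event. Timestamps are plain
// milliseconds supplied by the caller.
class NetworkIdleTracker {
public:
    // quietPeriodMs plays the role of the "X ms" in the answer.
    explicit NetworkIdleTracker(long long quietPeriodMs)
        : quietMs_(quietPeriodMs), pending_(0), lastActivityMs_(0) {
        assert(quietPeriodMs >= 0);
    }

    // Call when a request is issued.
    void requestStarted(long long nowMs) {
        ++pending_;
        lastActivityMs_ = nowMs;
    }

    // Call when a request completes, successfully or not.
    void requestFinished(long long nowMs) {
        assert(pending_ > 0);
        --pending_;
        lastActivityMs_ = nowMs;
    }

    // True when nothing is pending and the network has been quiet
    // for at least the configured period.
    bool isIdle(long long nowMs) const {
        return pending_ == 0 && (nowMs - lastActivityMs_) >= quietMs_;
    }

private:
    long long quietMs_;
    int pending_;
    long long lastActivityMs_;
};
```

With Qt4, one plausible wiring is to call `requestStarted()` from an overridden `QNetworkAccessManager::createRequest()` and `requestFinished()` from a slot connected to the manager's `finished(QNetworkReply*)` signal, then check `isIdle()` from the existing timer. Note that a page using long polling or a persistent connection may never report idle, so a hard upper timeout is probably still needed.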



solution1 0 2013-02-14 06:43:57
