
How to get html source code after javascript transformation?

For a project at school I am trying to make a website that shows your grades in a prettier way than the current one does. I have been able to log in to the site using cURL, and now I want to get the grades into a string so I can edit it with PHP. The only problem is that cURL gets the HTML source as it is before the JavaScript that fetches the grades has modified it.

So basically I want the code that you see when you open Firebug or the inspector, in a string, so I can edit it with PHP.

Does anyone have an idea how to do this? I have seen several posts saying that you have to wait until the page has loaded, but I have no clue how to make my site wait for a third-party site to load.

The code that I am waiting on, and whose result I want, is this:

<script type="text/javascript">
    var widgetWrapper = $("#objectWrapper325");
    if (widgetWrapper[0].timer !== undefined) {
        clearTimeout( jQuery('#objectWrapper325')[0].timer );
    }
    widgetWrapper[0].timer = setTimeout( function() {
        if (widgetWrapper[0].xhr !== undefined) {
            widgetWrapper[0].xhr.abort();
        }
        widgetWrapper[0].xhr = jQuery.ajax({
            type: 'GET',
            url: "",
            data: {
                "wis_ajax": 1,
                "ajax_object": 325,
                'llnr': '105629'
            },
            success: function(d) {
                var goodWidth = widgetWrapper.width();
                widgetWrapper.html(d);
                /* update width, needed for bug with standard template */
                $("#objectWrapper325 .result__overview").css('width', goodWidth - $("#objectWrapper325 .result__subjectlabels").width());
            }
        });
    }, 500+(Math.random()*1000));
</script>

First you have to understand a subtle but very important difference between using cURL to get a web page and visiting that same page with your browser.

1. Loading a page with a browser

When you enter an address in the location bar, the browser resolves the host name in the URL to an IP address. It then contacts the web server at that address and asks for a web page. From this point on the browser only speaks HTTP with the web server. HTTP is a protocol made for carrying documents over a network. The browser is really asking for an HTML document (a bunch of text), and the web server answers by sending that document back. If the page is static, the web server just picks an HTML file and sends it over the network. If the page is dynamic, the web server uses some higher-level code (like PHP) to generate the page and then sends it.

Once the page has been downloaded, the browser parses it and interprets the HTML inside, which produces the actual page you see. During parsing, when the browser finds script tags it interprets their content as JavaScript, a language that runs in the browser and can change the page's content and appearance.

Remember, the web server only sent a page containing HTML. It does not execute the JavaScript in that page; to the server the script is just more text.

So when you load a web page in a browser, the JavaScript is interpreted ONLY by the browser, after the page has been downloaded.

2. What is cURL

If you take a look at the curl man page, you'll learn that curl is a tool to transfer data from or to servers using one of its supported protocols, and HTTP is one of them. When you download a page with curl, it requests the page the same way your browser does, but it does not parse or interpret anything. cURL does not understand JavaScript or HTML; all it knows is how to talk to web servers.
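To make the difference concrete, here is a small toy model in Node.js. It is neither a real browser nor real cURL: the names `serverResponse`, `fetchLikeCurl` and `renderLikeBrowser` are made up for illustration, and the fake `document` object only supports `getElementById`. It just shows that the text the server sends still has an empty wrapper, and that the wrapper only gets filled once something actually executes the script.

```javascript
const vm = require('vm');

// What the server sends: plain text. The wrapper is empty and the
// script is nothing more than characters inside the document.
const serverResponse = [
  '<div id="objectWrapper325"></div>',
  '<script type="text/javascript">',
  '  document.getElementById("objectWrapper325").innerHTML = "<p>grades go here</p>";',
  '</script>',
].join('\n');

// An HTTP client such as cURL stops here: it hands back the text unchanged.
function fetchLikeCurl(response) {
  return response;
}

// A browser goes one step further: it finds the script elements and
// executes them against the DOM. This stand-in "DOM" only tracks
// elements by id, which is enough for the demonstration.
function renderLikeBrowser(response) {
  const elements = {};
  const fakeDocument = {
    getElementById(id) {
      if (!elements[id]) elements[id] = { innerHTML: '' };
      return elements[id];
    },
  };
  const scriptPattern = /<script[^>]*>([\s\S]*?)<\/script>/g;
  let match;
  while ((match = scriptPattern.exec(response)) !== null) {
    // Run the script body in a sandbox where `document` is our stand-in.
    vm.runInNewContext(match[1], { document: fakeDocument });
  }
  return fakeDocument.getElementById('objectWrapper325').innerHTML;
}

const rawText = fetchLikeCurl(serverResponse);
const renderedWrapper = renderLikeBrowser(serverResponse);

console.log(rawText.includes('<div id="objectWrapper325"></div>')); // true
console.log(renderedWrapper); // <p>grades go here</p>
```

The first line printed corresponds to what you get from cURL, the second to what you see in the inspector.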

3. Solution

So what you need in your case is to download the page the way cURL does and also have the JavaScript interpreted as if it were running inside a browser.

If you have followed me up to here, then you're ready to take a look at CasperJS, a scripting tool that drives a headless browser (PhantomJS by default).
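A rough sketch of what such a CasperJS script could look like is below. Treat it as a starting point under assumptions, not a finished solution: the URL is a placeholder, the helper `collectRenderedHtml` is a name I made up, the selector is guessed from the widget script in the question (which touches `.result__overview` after the AJAX call succeeds), and logging in is not covered at all. It is written in ES5 because the script runs inside PhantomJS.

```javascript
// Loads a page, waits for the script-injected markup, then hands the
// rendered HTML to the onHtml callback.
function collectRenderedHtml(casper, pageUrl, selector, onHtml) {
    casper.start(pageUrl);
    // Wait until the AJAX success handler has inserted the result markup.
    casper.waitForSelector(selector, function () {
        // getHTML() without arguments returns the current page source,
        // i.e. the DOM after the scripts have modified it.
        onHtml(this.getHTML());
    }, function () {
        this.die('Timed out waiting for ' + selector, 1);
    }, 10000); // the widget delays its request by 0.5 to 1.5 s, so 10 s is generous
    casper.run();
}

// Only executes inside CasperJS/PhantomJS, where the global `phantom` exists.
if (typeof phantom !== 'undefined') {
    var casper = require('casper').create();
    collectRenderedHtml(
        casper,
        'https://example.com/grades',            // hypothetical URL
        '#objectWrapper325 .result__overview',   // assumed to appear once grades load
        function (html) {
            casper.echo(html);
        }
    );
}
```

Run with `casperjs script.js`, this should print the rendered HTML to standard output, which a PHP script could then capture and process as a string.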
