
How can I GET the content of an HTTPS webpage?

I want to get the content of a webpage by running JavaScript code on Node.js. I want the content to be exactly the same as what I see in the browser.

This is the URL: https://www.realtor.ca/Residential/Single-Family/17219235/2103-1185-THE-HIGH-STREET-Coquitlam-British-Columbia-V3B0A9

I use the following code, but I get a 405 in response.

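var fs = require('fs');
var request = require('request');
var link = 'https://www.realtor.ca/Residential/Single-Family/17219235/2103-1185-THE-HIGH-STREET-Coquitlam-British-Columbia-V3B0A9';

request(link, function (error, response, body) {
    // Stop early if the request itself failed (DNS, TLS, connection error, ...).
    if (error) {
        return console.log('request failed:', error);
    }
    // Log the status code; this is where the 405 shows up.
    console.log('status code:', response.statusCode);
    fs.writeFile("realestatedata.html", body, function (err) {
        if (err) {
            console.log('error in saving the file');
            return console.log(err);
        }
        console.log("The file was saved!");
    });
});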

The saved file does not match what I see in the browser.

I think a real answer will be easier to understand since my comment was truncated.

It seems the server does not support the method of the request you sent. Status 405 (Method Not Allowed) means: "The method specified in the Request-Line is not allowed for the resource identified by the Request-URI. The response MUST include an Allow header containing a list of valid methods for the requested resource." Do you have more information about the HTTP response? Have you tried the following code instead of yours?

var fs = require('fs'), request = require('request');
request('https://www.realtor.ca/Residential/Single-Family/17219235/2103-1185-THE-HIGH-STREET-Coquitlam-British-Columbia-V3B0A9').pipe(fs.createWriteStream('realestatedata.html'));
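To gather more information about the HTTP response, a minimal sketch like the following could print the status code and, for a 405, the Allow header. It uses Node's built-in https module instead of request. The names describeResponse and inspect are hypothetical helpers introduced here for illustration. Whether this particular server actually sends an Allow header is an assumption you would need to check.

```javascript
const https = require('https');

// Summarize the parts of a response that matter when diagnosing a 405.
// Hypothetical helper, not part of any library.
function describeResponse(statusCode, headers) {
  const allow = headers['allow']; // Node lowercases incoming header names
  let summary = 'status=' + statusCode;
  if (statusCode === 405) {
    summary += allow ? ' allowed methods: ' + allow : ' (no Allow header sent)';
  }
  return summary;
}

// Send a plain GET request and print the summary. Hypothetical helper.
function inspect(url) {
  https.get(url, function (res) {
    console.log(describeResponse(res.statusCode, res.headers));
    res.resume(); // discard the body so the socket is released
  }).on('error', function (err) {
    console.error('request failed:', err.message);
  });
}

// Only touch the network when a URL is given: node inspect.js <url>
if (process.argv[2]) {
  inspect(process.argv[2]);
}
```

If the summary shows a 405 without an Allow header, the status code alone will not tell you which methods the server accepts.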

You could also have a look at this related question: In Node.js / Express, how do I "download" a page and gets its HTML?

Note that the page will not render the same way anyway when you open only the saved HTML, since it also requires many other resources (110 requests are made when the page is displayed). I think the following answer can help you download the whole page: https://stackoverflow.com/a/34935427/1630604
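To get a feel for how many extra resources the saved HTML refers to, a rough sketch such as the one below could list them. It is a simple regex scan, not a real HTML parser, so it will miss resources loaded by scripts at runtime and may pick up some false positives. The function name listResources is hypothetical.

```javascript
const fs = require('fs');

// Rough regex scan (not a real HTML parser): collects the src/href values
// of <script>, <img> and <link> tags found in an HTML string.
function listResources(html) {
  const pattern = /<(?:script|img|link)\b[^>]*?\b(?:src|href)\s*=\s*["']([^"']+)["']/gi;
  const found = [];
  let match;
  while ((match = pattern.exec(html)) !== null) {
    found.push(match[1]);
  }
  return found;
}

// Only read a file when a path is given: node resources.js realestatedata.html
if (process.argv[2]) {
  const html = fs.readFileSync(process.argv[2], 'utf8');
  const resources = listResources(html);
  console.log(resources.length + ' referenced resources');
  resources.forEach(function (r) { console.log(r); });
}
```

Each entry in the list is a file the browser would fetch in addition to the HTML itself, which is why the saved page alone looks different.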

