Skip waiting time using node-fetch

Question

I use node-fetch and cheerio to crawl data from a comic website. I use the simple code below to print the HTML body:

var fetch = require('node-fetch');
var cheerio = require('cheerio');

var url = 'http://readcomiconline.to';

function getComic() {
    fetch(url)
        .then(res => res.text())
        .then(body => console.log(body));
}

getComic();

The problem is that this page uses JavaScript that makes the client wait 5 seconds before it redirects to the main page, so I cannot crawl anything before the main page has loaded.

How can I skip this wait and start crawling data from the page?

Thank you.

Answer 1

It looks like you're going to need more than those two modules.

The website you're trying to scrape uses JavaScript to send a verification request to /cdn-cgi/l/chk_jschl and get cookies. You can either use Selenium or reverse-engineer the JavaScript.

More info here: Python web scraping : 503 Response with specific site (how come?)

Answer 2

You don't need to wait 5 seconds, because that delay only runs in the browser.

The page has a form, #challenge-form. Use cheerio to get the form's URL, method, and data (the values of its inputs), then send that request yourself and save the cookies.

You can use your browser's devtools (Chrome's, or something similar) to check what the form request looks like in the browser.

This is a project where I try to log in to Facebook: index.js. It may help you.
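The steps above could be sketched roughly as follows. This is only an illustration under several assumptions: extractForm, buildFormRequest, toCookieHeader and submitForm are hypothetical helper names, node-fetch is assumed to be version 2, and the form is assumed to be a plain HTML form. If the page's script computes any input value at run time, that value will be missing from the static HTML and this approach alone will not be enough.

```javascript
// Sketch only: all four helpers are hypothetical names, not part of any library.

// Parse the form out of the page HTML with cheerio.
function extractForm(html, selector = '#challenge-form') {
  const cheerio = require('cheerio'); // loaded lazily; only this helper needs it
  const $ = cheerio.load(html);
  const form = $(selector);
  const fields = {};
  form.find('input[name]').each((i, el) => {
    // Inputs without a value attribute are sent as empty strings.
    fields[$(el).attr('name')] = $(el).attr('value') || '';
  });
  return {
    action: form.attr('action') || '',
    method: (form.attr('method') || 'GET').toUpperCase(),
    fields,
  };
}

// Turn the extracted form into a URL plus fetch options.
function buildFormRequest(pageUrl, form, cookie) {
  const target = new URL(form.action, pageUrl); // resolves a relative action
  const params = new URLSearchParams(form.fields);
  const options = { method: form.method, headers: {} };
  if (cookie) options.headers.cookie = cookie;
  if (form.method === 'GET') {
    // As in a browser, the form data replaces the query string.
    target.search = params.toString();
  } else {
    options.headers['content-type'] = 'application/x-www-form-urlencoded';
    options.body = params.toString();
  }
  return { url: target.toString(), options };
}

// Reduce Set-Cookie header values to a single Cookie request header.
function toCookieHeader(setCookieValues) {
  return setCookieValues.map(v => v.split(';')[0].trim()).join('; ');
}

// Fetch the page, then submit its form with the cookies it set.
async function submitForm(pageUrl) {
  const fetch = require('node-fetch'); // assumes node-fetch v2
  const res = await fetch(pageUrl);
  const cookie = toCookieHeader(res.headers.raw()['set-cookie'] || []);
  const form = extractForm(await res.text());
  const { url, options } = buildFormRequest(pageUrl, form, cookie);
  return fetch(url, options);
}
```

Calling submitForm('http://readcomiconline.to').then(res => res.text()) would then give the body of whatever the form submission returns. Comparing the request built here with the one shown in devtools is a quick way to see which fields or headers are still missing.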
