Scraping website with node.js request and getting weird characters

Question

I am using NW.js (version 0.18.8) and making requests to mangafox.me to build a manga reader.

It works with http://mangafox.me/directory/

When I try to make a request on a manga image like this one http://mangafox.me/manga/onepunch_man/vTBD/c066/1.html I get these weird symbols:

{s F [ w#Y \\ AI (tY dϯ M%9 @ Cw ~ I(v ں ʑ y t k2z o y .^~wɌ e Ҳ ]?c Kf =v 0 3? y`Y _̘gY|fY \\ Q2 M nV iz g b$W _a c C5

How can I fix this?

Answer 1

Never mind, it turned out the response was gzip-compressed. If you run into the same problem, add gzip: true to the options object you pass to request. For example:

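var request = require('request');

// gzip: true asks for a compressed response and decompresses it before the callback runs
request({url: '*****', gzip: true}, function (error, response, html) {
   if (!error && response.statusCode == 200) {
      // Do something with html, which is now readable text
   }
});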
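To see why the body looked like random symbols, here is a minimal sketch using only Node's built-in zlib module. It does not contact any server: the `page` string is a made-up stand-in for the HTML, and it is compressed locally to imitate what a gzip-encoding server would send.

```javascript
const zlib = require('zlib');

// Hypothetical stand-in for the page body (an assumption, not real site content).
const page = '<html><body>chapter 66, page 1</body></html>';

// What a gzip-encoding server would put on the wire.
const compressed = zlib.gzipSync(Buffer.from(page, 'utf8'));

// Reading the compressed bytes as text produces the "weird symbols".
const garbled = compressed.toString('utf8');

// Every gzip stream starts with the magic bytes 0x1f 0x8b.
const looksGzipped = compressed[0] === 0x1f && compressed[1] === 0x8b;

// Decompressing first gives back the original HTML.
const restored = zlib.gunzipSync(compressed).toString('utf8');

console.log(looksGzipped, restored === page);
```

Passing gzip: true to request performs the decompression step for you, so the callback receives the restored text rather than the raw bytes.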

Answer 2

You don't need node.js for something this simple. The easiest way to scrape a site is to load it into a hidden iframe and then loop through the document's collections of the elements you need.

The loaded document exposes everything through properties like these:

 Frame.contentWindow.document.forms

 Frame.contentWindow.document.scripts

 Frame.contentWindow.document.styleSheets

 Frame.contentWindow.document.embeds

 Frame.contentWindow.document.cookie

 Frame.contentWindow.document.images

 Frame.contentWindow.document.links

And so forth...
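As a rough sketch of the looping step, the helper below reads the href of every entry in a document's links collection. The name `collectLinks` and the iframe id `scrapeFrame` are hypothetical. The browser usage is left as a comment because it needs a real page, and it assumes the iframe content is same-origin, since browsers block access to contentWindow.document for cross-origin frames.

```javascript
// Return the href of every link in a document-like object.
function collectLinks(doc) {
  return Array.from(doc.links, function (link) {
    return link.href;
  });
}

// Browser usage (assumes <iframe id="scrapeFrame"> exists and is same-origin):
// var Frame = document.getElementById('scrapeFrame');
// Frame.addEventListener('load', function () {
//   console.log(collectLinks(Frame.contentWindow.document));
// });
```

The same pattern applies to the other collections, for example document.images with the src property.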

2 answers

solution1
1 2016-11-29 18:06:33

solution2
0 2016-12-09 08:28:01