
How to scrape images from a site using JavaScript?

I am attempting to write a script in JavaScript to scrape images from a site and save them to my computer.

I have managed to make the script isolate the image tag that contains the image I want using jQuery. So I have a jQuery selection:

<img src="sourceofimage.com/path/img">

My question is how can I now save this image to my computer?

I tried searching, but all the results I got were about things like making a download button or other user-facing tasks. To be clear, I will be the only one running this script, and it will be run by pasting it into the console.

I only want a way to programmatically download the image and set its filename once jQuery has isolated it. Is this possible?

Edit: Can somebody kindly explain why this is receiving so many downvotes?

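Try the fs module (this runs in Node.js, not in the browser console):

var fs = require('fs');
// imagedata is assumed to already hold the raw image bytes
fs.writeFile('logo.png', imagedata, 'binary', function (err) {
  if (err) throw err;
  console.log('File saved.');
});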

You could use the following. Note that it applies to every image on the page that is a direct child of an anchor, but it does the job:

$('a > img').each(function () {
  var $this = $(this);
  // Point the parent anchor at the image's own source
  $this.parent('a').attr('href', $this.attr('src'));
});

The only drawback is that users with JavaScript disabled will see an anchor with an empty href. The following achieves the same end result, with the added benefit of simpler code (cleaner HTML) and graceful degradation:

<img src="folio/1.jpg" class="downloadable" />

$('img.downloadable').each(function () {
  var $this = $(this);
  // Wrap the image in a link to its own source, marked for download
  $this.wrap('<a href="' + $this.attr('src') + '" download />');
});
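
Since the question asks for something that can be pasted into the console, here is a rough sketch of the same anchor-with-download idea done programmatically, without adding visible links to the page. The helper names filenameFromUrl and downloadImage are made up for illustration. It assumes a browser that supports the download attribute and an image served from the same origin as the page; for cross-origin images browsers generally ignore the suggested filename and may just navigate to the image.

```javascript
// Derive a filename from an image URL, ignoring any query string or fragment.
// (filenameFromUrl and downloadImage are hypothetical helper names.)
function filenameFromUrl(src, fallback) {
  var path = String(src).split('#')[0].split('?')[0];
  var name = path.substring(path.lastIndexOf('/') + 1);
  return name || fallback || 'image';
}

// Trigger a download of one image by clicking a temporary anchor.
// Assumes a browser environment and a same-origin image URL.
function downloadImage(src, filename) {
  var a = document.createElement('a');
  a.href = src;
  a.download = filename || filenameFromUrl(src);
  document.body.appendChild(a);
  a.click();
  document.body.removeChild(a);
}

// Usage in the console, with a jQuery selection such as $('img.downloadable'):
// downloadImage($('img.downloadable').attr('src'), 'my-image.jpg');
```

The browser still decides where the file lands (usually the default downloads folder), so this sets the filename but not the directory.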

Within a web browser it basically can't be done: you can't write directly to the file system. (It may be possible with browser extensions, but I haven't looked at this in a while.)

Using Node.js, there's nothing stopping you from doing something like:

  • Use http to retrieve the HTML
  • Use jQuery to parse the HTML - something like $(html).find('img');
  • Generate an HTTP request to each image to download it
  • Save it to disk using fs
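
The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: it assumes Node 18 or later (where fetch is available globally) instead of the http module, and it swaps the jQuery parsing step for a simple regular expression so that it needs no extra packages. The names extractImageUrls and scrapeImages, the example URL, and the output directory are all made up.

```javascript
var fs = require('fs');
var path = require('path');

// Collect absolute image URLs from an HTML string.
// A regex stands in for the jQuery step here; it is a simplification and
// will miss unusual markup (unquoted attributes, srcset, and so on).
function extractImageUrls(html, baseUrl) {
  var urls = [];
  var re = /<img\b[^>]*?\ssrc\s*=\s*["']([^"']+)["']/gi;
  var match;
  while ((match = re.exec(html)) !== null) {
    // Resolve relative paths against the page URL
    urls.push(new URL(match[1], baseUrl).href);
  }
  return urls;
}

// Download every image found on a page into outDir.
// Assumes Node 18+, where fetch is a global. Resolves to the number of URLs found.
async function scrapeImages(pageUrl, outDir) {
  var html = await (await fetch(pageUrl)).text();
  var urls = extractImageUrls(html, pageUrl);
  fs.mkdirSync(outDir, { recursive: true });
  for (var i = 0; i < urls.length; i++) {
    var response = await fetch(urls[i]);
    if (!response.ok) continue; // skip images that fail to load
    var data = Buffer.from(await response.arrayBuffer());
    var name = path.basename(new URL(urls[i]).pathname) || 'image-' + i;
    fs.writeFileSync(path.join(outDir, name), data);
  }
  return urls.length;
}

// Usage (the URL and directory are placeholders):
// scrapeImages('https://example.com/gallery', './images').then(console.log);
```

Images that share a filename will overwrite each other in this sketch; prefixing the name with the loop index would avoid that.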


 