
Node.js: Downloading multiple files asynchronously

In trying to get the hang of the Node.js asynchronous coding style, I decided to write a program that reads a text file containing a list of URLs and downloads each file. I started by writing a function to download just one file (which works fine), but I am having trouble extending the logic to download multiple files.

Here's the code:

var http     = require("http"),
    fs       = require("fs"),
    input    = process.argv[2],
    folder   = "C:/Users/Wiz/Downloads/",
    regex    = /(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?/,
    urls     = null,
    url      = "",
    filename = "";

fs.readFile(input, "utf8", function(e, data) {
    console.log("Reading file: " + input);
    if (e) return console.log("Got error: " + e.message); // stop here: data is undefined on error
    urls = data.split("\n");
    for (var i = urls.length; i--;) {
        url = urls[i];
        if (!url.match(regex)) continue;
        filename = folder + url.substring(url.lastIndexOf('/') + 1);
        downloadQueue.addItem(url, filename);
    }
});

var downloadQueue = {
    queue: [],
    addItem: function(p_sSrc, p_sDest) {
        this.queue.push({
            src: p_sSrc,
            dest: p_sDest
        });
        if (this.queue.length === 1) {
            this.getNext();
        }
    },
    getNext: function() {
        var l_oItem = this.queue[0];
        http.get(l_oItem.src, function(response) {
            console.log("Downloading: " + l_oItem.dest);
            var file = fs.createWriteStream(l_oItem.dest);
            response.on("end", function() {
                file.end();
                console.log("Download complete.");
                downloadQueue.removeItem();
            }).on("error", function(error) {
                console.log("Error: " + error.message);
                fs.unlink(l_oItem.dest, function() {}); // fs.unlink requires a callback
            });
            response.pipe(file);
        });
    },
    removeItem: function() {
        this.queue.splice(0, 1);
        if (this.queue.length != 0) {
            this.getNext();
        } else {
            console.log("All items downloaded");
        }
    }
};

How do I structure the code so that the completion of one download signals the start of the next? Please note that this exercise is just for learning purposes, to understand how asynchronous coding works. In practice, I'm sure there are much better tools out there for downloading multiple files.

Try something simple first. It looks like you copied and pasted code without fully understanding what it does.

Write a simple loop that fetches the URL and prints something.

var http = require('http');

var URL = require('url').parse('http://www.timeapi.org/utc/now?format=%25F%20%25T%20-%20%25N');
URL['headers'] = {'User-Agent': 'Hello World'};


// launch 20 queries asynchronously
for(var i = 0; i < 20; i++) {
  (function(i) {
    console.log('Query ' + i + ' started');
    var req = http.request(URL, function(res) {
      console.log('Query ' + i + ' status: ' + res.statusCode + ' - ' + res.statusMessage);
      res.on('data', function(content){
        console.log('Query ' + i + ' ended - ' + content);
      });
    });

    req.on('error', function(err) {
      console.log('Query ' + i + ' return error: ' + err.message);
    });

    req.end();
  })(i);
}

All the URLs will be fetched asynchronously. You can observe that the responses do not arrive in order, but they are still processed correctly.

The difficulty with async is not doing things in parallel, because you just write the code as you would for a single task and execute it multiple times. It becomes complicated when you need, for instance, to wait for all tasks to finish before continuing. For that, have a look at promises.
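
To make the "wait for all tasks to finish" case concrete, here is a minimal sketch using `Promise.all`. The names `fakeDownload` and `downloadAll` are hypothetical, and the timer only stands in for a real download, so treat this as an illustration of the pattern rather than a finished downloader.

```javascript
// Hypothetical stand-in for a download: resolves after `ms` milliseconds.
function fakeDownload(name, ms) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve(name + " done");
    }, ms);
  });
}

// Start every task at once. Promise.all resolves once all of them have
// finished, with results in the order the tasks were given, not the order
// in which they completed.
function downloadAll(tasks) {
  return Promise.all(tasks.map(function (task) {
    return fakeDownload(task.name, task.ms);
  }));
}

downloadAll([
  { name: "a", ms: 30 },
  { name: "b", ms: 10 },
  { name: "c", ms: 20 }
]).then(function (results) {
  console.log("All finished:", results);
});
```

With real downloads, `fakeDownload` would presumably wrap `http.get` and resolve when the write stream finishes. If any promise rejects, `Promise.all` rejects as a whole.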

Here is what I started out with. I figured that, since each download was invoked asynchronously, they would all be independent of each other.

var http     = require("http"),
    fs       = require("fs"),
    input    = process.argv[2],
    folder   = "C:/Users/Wiz/Downloads/",
    regex    = /(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?/,
    urls     = null,
    url      = "",
    filename = "";

fs.readFile(input, "utf8",
  function(e, data) {
    console.log("Reading file: " + input);
    if (e) return console.log("Got error: " + e.message); // stop here: data is undefined on error
    urls = data.split("\n");
    for (var i = urls.length; i--;) {
      url = urls[i];
      if (!url.match(regex)) continue;
      filename = folder + url.substring(url.lastIndexOf('/') + 1);
      http.get(url, function(response) {
                      var file =  fs.createWriteStream(filename);
                      response.on("end", function() {
                        file.end();
                      });
                      response.pipe(file);
                    })
    }
  });
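
One thing to watch in a loop like the one above: `filename` is a single shared variable, and the `http.get` callbacks only run after the loop has finished, so they all see its last value. The sketch below reproduces that effect with timers instead of network requests. The function names `sharedVariable` and `perCallVariable` are hypothetical and exist only for this illustration.

```javascript
// Mirrors the loop above: `filename` is one shared variable, so every
// callback reads whatever value the loop assigned last.
function sharedVariable(names, done) {
  var filename = "", seen = [];
  for (var i = 0; i < names.length; i++) {
    filename = names[i];
    setTimeout(function () {
      seen.push(filename);
      if (seen.length === names.length) done(seen);
    }, 0);
  }
}

// Passing the value as a function argument gives each callback its own copy.
function perCallVariable(names, done) {
  var seen = [];
  names.forEach(function (filename) {
    setTimeout(function () {
      seen.push(filename);
      if (seen.length === names.length) done(seen);
    }, 0);
  });
}

sharedVariable(["a.txt", "b.txt", "c.txt"], function (seen) {
  console.log("shared:  ", seen); // every entry is "c.txt"
});
perCallVariable(["a.txt", "b.txt", "c.txt"], function (seen) {
  console.log("per call:", seen); // "a.txt", "b.txt", "c.txt"
});
```

The same idea is why the answer's example wraps each iteration in an immediately invoked function taking `i` as a parameter.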
