How do I aggregate promises generated from async functions within a Node.js stream callback?
I have a Node.js TypeScript program in which I'm trying to parse large CSV files line by line and do something with those lines asynchronously. More specifically, I need a function that will:
Some requirements and considerations:
Here's some test code I've gotten working. ObjectStream is a custom Node.js Transform that converts CSV lines to objects.
function parseFileAsync(filePath: string): Promise<any> {
    var doParseFileAsync = (filePath: string) => {
        var streamDeferred = q.defer<Promise<any>[]>();
        var promises: Promise<any>[] = [];
        var propertyNames: string[] = [];
        var stream = fs.createReadStream(filePath, { encoding: "utf8" })
            .pipe(new LineStream({ objectMode: true }))
            .pipe(new ObjectStream({ objectMode: true }));
        stream.on("readable", () => {
            var obj: Object;
            while ((obj = stream.read()) !== null) {
                console.log(`\nRead an object...`);
                var operationDeferred = q.defer<any>();
                operationDeferred.resolve(doSomethingAsync(obj));
                promises.push(operationDeferred.promise);
            }
        });
        stream.on("end", () => {
            streamDeferred.resolve(promises);
        });
        return streamDeferred.promise;
    }
    return doParseFileAsync(filePath)
        .then((result: Promise<any>[]) => {
            return q.all(result);
        });
}

parseFileAsync(filePath)
    .done((result: any[]) => {
        console.log(`\nFinished reading and processing the file:\n\t${result.toString()}`);
    });
The final done call is executed before the stream even starts running, because parseFileAsync immediately fulfills with an empty array; the stream hasn't had a chance to push any promises yet.
After days of searching, I'm still not sure what the correct way to do this is. Node/JavaScript experts: help?
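For comparison, here is a minimal sketch of the same aggregation using native promises instead of q. It is an illustration under assumptions, not the code from the question: processStream is a hypothetical name, doSomethingAsync is a stand-in for the real operation, and any object-mode Readable can take the place of the LineStream/ObjectStream pipeline. The outer promise settles only after the stream has ended, so the array of pending promises is complete by the time Promise.all sees it.

```typescript
import { Readable } from "stream";

// Hypothetical stand-in for the question's doSomethingAsync.
function doSomethingAsync(obj: unknown): Promise<string> {
    return new Promise((resolve) => {
        setTimeout(() => resolve(`processed ${String(obj)}`), 5);
    });
}

// Collects one promise per object read from the stream and fulfills
// with all results once the stream has ended.
function processStream(stream: Readable): Promise<string[]> {
    return new Promise<string[]>((resolve, reject) => {
        const pending: Promise<string>[] = [];

        stream.on("data", (obj: unknown) => {
            const operation = doSomethingAsync(obj);
            // Mark the promise as handled so a failure that happens before
            // "end" is not reported as an unhandled rejection; Promise.all
            // below still observes the rejection.
            operation.catch(() => undefined);
            pending.push(operation);
        });

        stream.on("error", reject);

        // Only now is the list of promises complete.
        stream.on("end", () => {
            Promise.all(pending).then(resolve, reject);
        });
    });
}
```

Calling processStream(Readable.from(["a", "b", "c"])) fulfills with one result per item, in the order the items were read. Note that this starts every operation as soon as its object is read, so it applies no backpressure.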
The code has been updated, and my promises are now playing nicely. However, I need a way to hook into the stream and cancel the process if desired. I also need a way to retry any operations that failed.
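A rough sketch of what retry and cancellation could look like for a single operation. Everything here is an assumption made for illustration: the retry helper, the CancelToken shape, and the maxAttempts parameter do not come from the original program.

```typescript
// Hypothetical cancellation flag shared between the caller and the helper.
interface CancelToken {
    cancelled: boolean;
}

// Runs `operation` until it succeeds, up to `maxAttempts` times.
// Stops before starting a new attempt if the token has been cancelled.
async function retry<T>(
    operation: () => Promise<T>,
    maxAttempts: number,
    token: CancelToken = { cancelled: false }
): Promise<T> {
    let lastError: unknown = new Error("maxAttempts must be at least 1");
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        if (token.cancelled) {
            throw new Error("cancelled");
        }
        try {
            return await operation();
        } catch (err) {
            // Remember the failure and try again on the next iteration.
            lastError = err;
        }
    }
    throw lastError;
}
```

Setting token.cancelled = true from elsewhere prevents further attempts; it does not interrupt an operation that is already in flight.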
I was running into some limitations in the program's architecture that wouldn't allow me to pass promises around as freely as I wanted. So instead of kicking off a bunch of promises at once, I decided to wait until the previous batch finished before starting a new one. Here's the approach I took:
Separate the stream handling into its own function that accepts a continuation token. The return value contains the data read and, if there is more data to read, a continuation token:
function readFile(filepath: string, lines: number, start: any): Promise<any> { ... }
Define a function that will run the retry-able operation. Within the body of this function, retrieve and process a chunk of data from the file. If the result has a continuation token, "recursively" call the operation function again:
function processFile(filepath: string, next: any): Promise<any> {
    var chunkSize = 1;
    return readFile(filepath, chunkSize, next)
        .then((result) => {
            // Do something with `result.lines` ...
            if (result.next) {
                // More data remains: process the next chunk.
                return processFile(filepath, result.next);
            }
        });
}
And voila! A long-running operation that operates on chunks and is easy to report progress on.
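To make the continuation-token pattern concrete, here is a small self-contained sketch. It is not the original implementation: readChunk is a hypothetical stand-in that serves lines from an in-memory array rather than a file, the token is assumed to be the index of the next unread line, and processAll and onChunk are illustrative names.

```typescript
// Hypothetical result shape: the lines read, plus a token if more remain.
interface Chunk {
    lines: string[];
    next?: number; // index of the next unread line; absent at the end
}

// Stand-in for the file reader: returns up to `count` lines from `start`.
function readChunk(source: string[], count: number, start: number = 0): Promise<Chunk> {
    if (count < 1) {
        return Promise.reject(new RangeError("count must be at least 1"));
    }
    const lines = source.slice(start, start + count);
    const end = start + lines.length;
    return Promise.resolve(end < source.length ? { lines, next: end } : { lines });
}

// Processes one chunk at a time; the next chunk is not read until the
// current one has been fully handled.
async function processAll(
    source: string[],
    chunkSize: number,
    onChunk: (lines: string[]) => Promise<void>,
    next?: number
): Promise<void> {
    const chunk = await readChunk(source, chunkSize, next);
    await onChunk(chunk.lines);
    if (chunk.next !== undefined) {
        // A continuation token means there is more data: recurse.
        return processAll(source, chunkSize, onChunk, chunk.next);
    }
}
```

Because each call handles exactly one chunk, onChunk is a natural place to report progress, and a failed chunk can be retried by calling processAll again with the same token.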