NodeJS stream returns incomplete response

I am using lento (a streaming Node.js client for Presto, the "Distributed SQL Query Engine for Big Data") to query a database. Lento's createRowStream takes a SQL query as a string or Buffer and returns a readable stream that yields rows.

Before returning the resulting row stream, I need to do some pre-processing on the result (stream.pipe() does that for me) and convert it to CSV format (csvStringify does that for me).

Once the stream ends, I resolve the promise with the resolve() callback and log the number of rows streamed. However, fewer rows are actually returned than the number streamed. For example, if the log says the count is 10000 (variable rowsCnt), the number of rows returned is close to 6000.

What could cause this inconsistency in the number of rows returned?

Please see the imports and code snippet below:

import csvStringify from 'csv-stringify';
import {Request, Response} from 'express';
import lento from 'lento';
import streamTransform from 'stream-transform';
async getCSVRows(res: Response, sqlQueries: sqlDto[]): Promise<void> {
    const result = [];
    const lentoClient = ... code to create instance of lento client
    for (let index = 0; index < sqlQueries.length; index++) {
        const csvStream = csvStringify({header: index === 0});
        const queryResult = this.executeQuery(res, sqlQueries[index], lentoClient, csvStream);
        result.push(queryResult)
    }
    await Promise.all(result);
    res.end()
}

async executeQuery(
    res: Response, 
    sqlQuery: sqlDto, 
    lentoClient: any, 
    csvStream: csvStringify.Stringifier
): Promise<Response> {
    return new Promise(async (resolve) => {
        const rowsStream = lentoClient.createRowStream(sqlQuery.query); 
        let rowsCnt = 0;
        rowsStream.on('data', function() {})
        .pipe(
            streamTransform((row: any) => {
                 // process row
                 rowsCnt++;
                 return row;
            }),
        )
        .pipe(csvStream)
        .pipe(res, {end: false});
        rowsStream.on('error', (err: Error) => {
            // log error
            throw err;
        });
        rowsStream.on('end', () => {
             resolve(res);
             console.log('Rows count: ' + rowsCnt);
        });
    });
}

Note: The framework used is NestJS.

I think you are ending the response too early, while processing is still going on in the pipeline. In your code, the fact that rowsStream emits the 'end' event doesn't mean processing is finished. It means the last chunk of query results has entered the pipeline but is still being processed. You need to end the response only after all of the processing is done and the pipeline is empty. It's tricky, because you are piping multiple queries into the same response, so you use the {end: false} option. Because of this, "res" is never ended by the pipe and won't emit its 'finish' event, so you have to detect the end of processing some other way.
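To see why the source's 'end' event is too early a signal, here is a minimal sketch that uses only Node's built-in stream module. The names makeRows, toLine and measure are hypothetical stand-ins (this is not lento or csv-stringify), and the transform is deliberately asynchronous so that rows are still in flight when the source ends.

```typescript
import { Readable, Transform, Writable } from 'stream';

// Hypothetical stand-in for a row stream: n plain row objects.
function makeRows(n: number): Readable {
  return Readable.from(Array.from({ length: n }, (_, i) => ({ id: i })));
}

// Resolves with the number of rows that had reached the sink
// when the source emitted 'end', and when the sink emitted 'finish'.
function measure(n: number): Promise<{ atSourceEnd: number; atFinish: number }> {
  return new Promise((resolve, reject) => {
    let written = 0;
    let atSourceEnd = -1;
    const source = makeRows(n);

    // Asynchronous transform: each row is passed on in a later event-loop turn.
    const toLine = new Transform({
      writableObjectMode: true,
      transform(row, _encoding, callback) {
        setImmediate(() => callback(null, JSON.stringify(row) + '\n'));
      },
    });

    // Sink that only counts the chunks it receives.
    const sink = new Writable({
      write(_chunk, _encoding, callback) {
        written++;
        callback();
      },
    });

    source.on('end', () => {
      atSourceEnd = written; // rows are still buffered inside toLine here
    });
    sink.on('finish', () => resolve({ atSourceEnd, atFinish: written }));
    source.on('error', reject);

    source.pipe(toLine).pipe(sink);
  });
}
```

Under these assumptions, calling measure(100) should report a lower count at the source's 'end' than at the sink's 'finish', which is the same kind of gap as the missing rows in the question.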

Here is my take:

executeQuery(
    res: Response, 
    sqlQuery: sqlDto, 
    lentoClient: any, 
    csvStream: csvStringify.Stringifier
): Promise<Response> {
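    return new Promise((resolve, reject) => {
        const rowsStream = lentoClient.createRowStream(sqlQuery.query); 
        let rowsCnt = 0;
        rowsStream
        .pipe(
            streamTransform((row: any) => {
                 // process row
                 rowsCnt++;
                 return row;
            }),
        )
        .pipe(csvStream)
        // wait for 'end' on the last transform, not on the source
        .on('end', () => {
             // give the final chunk a tick to reach res before resolving
             process.nextTick(() => {
                 console.log('Rows count: ' + rowsCnt);
                 resolve(res);
             });
         })
        .pipe(res, {end: false});
        rowsStream.on('error', (err: Error) => {
            // log error, then reject; throwing inside an event handler
            // would not reject the promise
            reject(err);
        });
    });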
}

You can also try the promisified pipeline version:

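    import { pipeline, Writable } from 'stream';
    import { promisify } from 'util';
    
    const pipe = promisify(pipeline);
    
    async executeQuery(
      res: Response,
      sqlQuery: sqlDto,
      lentoClient: any,
      csvStream: csvStringify.Stringifier,
    ) {
      let rowsCnt = 0;
      // pipeline() ends its last stream, so give it a sink that forwards
      // to res instead of res itself; res stays open for the next query.
      const toResponse = new Writable({
        write(chunk, _encoding, callback) {
          // wait for 'drain' when res signals backpressure
          if (res.write(chunk)) callback();
          else res.once('drain', () => callback());
        },
      });
      try {
        await pipe(
          lentoClient.createRowStream(sqlQuery.query),
          streamTransform((row: any) => {
            // process row
            rowsCnt++;
            return row;
          }),
          csvStream,
          toResponse,
        );
        // reached only after every CSV chunk has been passed to res.write()
        console.log('Rows count: ' + rowsCnt);
      } catch (error) {
        // log error
      }
    }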
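
If you want to try the idea without Presto or Express, below is a rough self-contained sketch. Everything in it is an assumption made for illustration: forwardTo, toCsvLine and writeAll are hypothetical helpers, toCsvLine is only a crude stand-in for csv-stringify, and any Writable (for example a PassThrough) can play the role of the response.

```typescript
import { Readable, Transform, Writable, pipeline } from 'stream';
import { promisify } from 'util';

const pipe = promisify(pipeline);

// Forwards chunks to dest without ever ending it, respecting backpressure.
function forwardTo(dest: Writable): Writable {
  return new Writable({
    write(chunk, _encoding, callback) {
      if (dest.write(chunk)) callback();
      else dest.once('drain', () => callback());
    },
  });
}

// Hypothetical stand-in for csv-stringify: one comma-joined line per row.
function toCsvLine(): Transform {
  return new Transform({
    writableObjectMode: true,
    transform(row: unknown[], _encoding, callback) {
      callback(null, row.join(',') + '\n');
    },
  });
}

// Writes every batch of rows to dest, one batch after another,
// and ends dest only once the last pipeline has completed.
async function writeAll(dest: Writable, batches: unknown[][][]): Promise<number> {
  let rows = 0;
  for (const batch of batches) {
    const counter = new Transform({
      objectMode: true,
      transform(row, _encoding, callback) {
        rows++;
        callback(null, row);
      },
    });
    // Resolves after the forwarding sink has finished, i.e. after
    // every chunk of this batch has been written to dest.
    await pipe(Readable.from(batch), counter, toCsvLine(), forwardTo(dest));
  }
  dest.end();
  return rows;
}
```

Running the batches one after another is a design choice of this sketch, not something the original code does: it keeps each query's rows contiguous in the output, whereas starting all queries at once lets their chunks interleave in the shared response.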
