
Node.js: How to stream data using http.createServer

I'm using a simple Node.js server to send a large JSON file to the client, where each line is a self-contained JSON object. I'd like to send this file to the client one line at a time. My problem is that the server waits until response.end() has been called and then sends the whole thing at once.

The code for my server looks like this:

http.createServer(async function (request, response) {
   response.writeHead(200, {"Content-Type": "application/json; charset=UTF-8", "Transfer-Encoding": "chunked", "Cache-Control": "no-cache, no-store, must-revalidate", "Pragma": "no-cache", "Expires": 0});
   response.write(JSON.stringify(["The first bit of JSON content"]) + "\n");
   response.write(await thisFunctionTakesForever());
   response.end();
});

I really don't want to make the user wait until the entire JSON file has been loaded before my script can start parsing the results. How can I make my server send the data in chunks?


Additional info: How do I know my Node.js server isn't sending any part of the file until after response.end has been called?

I'm using XMLHttpRequest to handle the chunks as they arrive. I understand that http.responseText grows with each chunk, so I filter through it to find the new lines that arrive each time:

let http = new XMLHttpRequest();
http.open('GET', url, true);
http.setRequestHeader('Content-type', 'application/x-www-form-urlencoded');
http.onreadystatechange = function() {
    if(http.readyState >= 3 && http.status == 200) {
        // Parse the data as it arrives, throwing out the ones we've already received
        // Only returning the new ones
        let json = http.responseText.trim().split(/[\n\r]+/g)
        let dataChunks = json.map(e => JSON.parse(e));

        let newResults = [];
        for(let i=0; i<dataChunks.length; i++)
        {
            if(!previousResults.map(e => e[0]).includes(dataChunks[i][0]))
            {
                newResults.push(dataChunks[i]);
            }
        }
        previousResults = previousResults.concat(newResults);
    }
}
http.send();

The array previousResults should grow slowly over time. Instead, there's a huge delay, and then everything appears at once.

The following thread is related, but unfortunately none of the proposed solutions solved my problem: Node.js: chunked transfer encoding

I see you are using chunked encoding: "Transfer-Encoding": "chunked". With this encoding, each chunk is transferred individually, so it really is possible to write each chunk immediately without waiting for the others.

The http library wraps each chunk in the format defined in RFC 2616. In general, each chunk starts with a line giving the chunk size in hexadecimal, terminated by <CR><LF>, followed by the chunk content and another <CR><LF>. The last chunk is a special case: a zero-length chunk that signals all the chunks have been sent.

Here is an example:
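To make the framing concrete, here is a minimal sketch of what ends up on the wire. The helper names encodeChunk and encodeChunkedBody are hypothetical and only for illustration; Node's http module does this framing for you on every response.write() call, so you never need to write it yourself.

```javascript
// Hypothetical helpers that imitate chunked transfer-encoding framing.
// Node's http module performs this framing internally.

// One chunk: size in hexadecimal, CRLF, the data, CRLF.
function encodeChunk(data) {
  const size = Buffer.byteLength(data, "utf8").toString(16);
  return `${size}\r\n${data}\r\n`;
}

// A whole body: every non-empty chunk, then the zero-length terminating chunk.
function encodeChunkedBody(parts) {
  const chunks = parts.filter((part) => part.length > 0).map(encodeChunk);
  return chunks.join("") + "0\r\n\r\n";
}

// JSON.stringify makes the CR and LF characters visible when printed.
console.log(JSON.stringify(encodeChunkedBody(["hello", "world!"])));
```

The size counts bytes rather than characters, which is why the sketch uses Buffer.byteLength instead of the string length.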

const http = require("http")

function generateChunk(index, res, total) {
  setTimeout(() => {
    res.write(`<p> chunk ${index}</p>`)
    if (index === total) {
      res.end()
    }
  }, index * 1000)
}

function handlerRequest(req, res) {
  res.setHeader("Content-Type", "text/html; charset=UTF-8")
  res.setHeader("Transfer-Encoding", "chunked")
  let index = 0
  const total = 5
  while (index <= total) {
    generateChunk(index, res, total)
    index++
  }
}

const server = http.createServer(handlerRequest)
server.listen(3000)
console.log("server started at http://localhost:3000")

If you capture the TCP packets, you will see the chunks arrive in different TCP packets. The chunks do not depend on one another.

[Image: packet capture of the chunked response]

See the image:

  1. Each PSH packet carries one chunk.
  2. There is a delay between each chunk transmission.
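The same delay can be observed without a packet capture by using a Node.js client instead of a browser. The following is only a sketch: observeChunks is a hypothetical helper that starts a throwaway server on a random local port, writes a few chunks with a pause between them, and records when each piece reaches the client's "data" event.

```javascript
const http = require("http");

// Hypothetical demo helper: serve `count` chunks `delayMs` apart and
// resolve with the text and arrival time of each piece seen by the client.
function observeChunks(count = 3, delayMs = 100) {
  return new Promise((resolve, reject) => {
    const server = http.createServer((req, res) => {
      res.setHeader("Content-Type", "text/plain; charset=UTF-8");
      let sent = 0;
      const timer = setInterval(() => {
        res.write(`chunk ${sent}\n`);
        sent++;
        if (sent === count) {
          clearInterval(timer);
          res.end();
        }
      }, delayMs);
    });

    // Port 0 asks the operating system for any free port.
    server.listen(0, "127.0.0.1", () => {
      const { port } = server.address();
      const started = Date.now();
      const arrivals = [];
      // agent: false uses a one-off connection that closes after the response.
      http.get({ host: "127.0.0.1", port, path: "/", agent: false }, (res) => {
        res.setEncoding("utf8");
        res.on("data", (text) => {
          arrivals.push({ text, ms: Date.now() - started });
        });
        res.on("end", () => {
          server.close();
          resolve(arrivals);
        });
      }).on("error", reject);
    });
  });
}

observeChunks().then((arrivals) => console.log(arrivals));
```

If the arrival times are spread out rather than bunched at the end, the server is sending each chunk as it is written, and any remaining delay is introduced on the client side.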

However, the HTTP client (such as a browser) must accept all the chunks before handing them over to the application, for the following reason: once all the chunks have been received, the server can also send some additional headers, known as trailer headers. These include Content-MD5, Content-Length, and so on. The client must verify a header such as Content-MD5 after all the chunks have been received and before handing them over to the application. I think that's why you can't receive the chunks one by one on the browser side.
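As a rough illustration of trailer headers, the sketch below uses Node's response.addTrailers() on the server and reads res.trailers on the client once the "end" event fires. The helper name trailerDemo and the sample payload are assumptions made for this example. It only shows that a trailer becomes available after the last chunk; it does not show how any particular browser buffers the body.

```javascript
const http = require("http");
const crypto = require("crypto");

// Hypothetical demo helper: send `parts` as chunks, then a Content-MD5 trailer.
function trailerDemo(parts = ["first line\n", "second line\n"]) {
  return new Promise((resolve, reject) => {
    const server = http.createServer((req, res) => {
      // The Trailer header announces which trailer fields will follow the body.
      res.writeHead(200, {
        "Content-Type": "text/plain; charset=UTF-8",
        "Trailer": "Content-MD5",
      });
      const hash = crypto.createHash("md5");
      for (const part of parts) {
        hash.update(part);
        res.write(part);
      }
      // Trailers are only sent when the response uses chunked encoding.
      res.addTrailers({ "Content-MD5": hash.digest("base64") });
      res.end();
    });

    server.listen(0, "127.0.0.1", () => {
      const { port } = server.address();
      http.get({ host: "127.0.0.1", port, path: "/", agent: false }, (res) => {
        let body = "";
        res.setEncoding("utf8");
        res.on("data", (text) => { body += text; });
        // res.trailers is populated only once the whole body has arrived.
        res.on("end", () => {
          server.close();
          resolve({ body, trailers: res.trailers });
        });
      }).on("error", reject);
    });
  });
}

trailerDemo().then((result) => console.log(result));
```

A client that wants to check the digest has to hash the body as it arrives and compare the result with the trailer at the end, since the trailer is the last thing the server sends.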
