
Line-oriented streams in Node.js

I'm developing a multi-process application using Node.js. In this application, a parent process will spawn a child process and communicate with it using a JSON-based messaging protocol over a pipe. I've found that large JSON messages may get "cut off", such that a single "chunk" emitted to the data listener on the pipe does not contain the full JSON message. Furthermore, small JSON messages may be grouped in the same chunk. Each JSON message will be delimited by a newline character, so I'm wondering if there is already a utility that will buffer the pipe read stream such that it emits one line at a time (and hence, for my application, one JSON document at a time). This seems like a pretty common use case, so I'm wondering if it has already been done.

I'd appreciate any guidance anyone can offer. Thanks.

Maybe Pedro's carrier can help you?

Carrier helps you implement new-line terminated protocols over node.js.

The client can send you chunks of lines and carrier will only notify you on each completed line.
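The behaviour described above can be sketched without the library. This is only an illustration of the idea, not carrier's actual implementation or API: carryLines is a hypothetical name, and the stream is assumed to deliver string or Buffer chunks through 'data' events. A plain EventEmitter stands in for the pipe so the example runs on its own.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper (not part of carrier): calls onLine once per completed line.
function carryLines(stream, onLine) {
  let buffered = '';
  stream.on('data', (chunk) => {
    buffered += chunk.toString();
    const lines = buffered.split('\n');
    buffered = lines.pop(); // trailing partial line waits for more data
    for (const line of lines) {
      onLine(line);
    }
  });
}

// Stand-in for a pipe: anything that emits 'data' events works here.
const source = new EventEmitter();
const received = [];
carryLines(source, (line) => received.push(line));

source.emit('data', '{"a":');           // first half of a message, no line yet
source.emit('data', '1}\n{"b":2}\n');   // completes it and adds a second one
console.log(received);
```

The first chunk produces no callback because no newline has arrived yet; the second chunk completes the first line and delivers a second one.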

My solution to this problem is to send JSON messages, each terminated with a special Unicode character: one that you would never normally get in the JSON string. Call it TERM.

So the sender just does "JSON.stringify(message) + TERM;" and writes it. The receiver then splits incoming data on TERM and parses the parts with JSON.parse(), which is pretty quick. The trick is that the last message in a chunk may be incomplete and fail to parse, so we simply save that fragment and prepend it to the next chunk when it arrives. Receiving code goes like this:

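    s.on("data", function (data) {
        // Prepend the saved fragment, then split on the terminator.
        var info = (fragment + data.toString()).split(TERM);
        // The last piece has not seen its TERM yet (it is '' when the chunk
        // ended exactly on TERM), so hold it back for the next chunk.
        fragment = info.pop();

        for (var index = 0; index < info.length; index++) {
            if (info[index]) {
                var message;
                try {
                    message = JSON.parse(info[index]);
                } catch (error) {
                    // A complete but malformed message: report it and move on.
                    self.emit('error', error);
                    continue;
                }
                self.emit('message', message);
            }
        }
    });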

Where "fragment" is defined somewhere that persists between data chunks.

But what is TERM? I have used the Unicode replacement character '\uFFFD'. One could also use the technique used by Twitter, where messages are separated by '\r\n' and tweets use '\n' for new lines and never contain '\r\n'.

I find this to be a lot simpler than messing with length prefixes and suchlike.
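For completeness, here is a rough round trip covering both the sending and the receiving side of this scheme. The helper names frame and createDeframer are made up for illustration, and the chunks are assumed to be already-decoded strings. Note that JSON.stringify does not escape U+FFFD, so the sketch refuses to send a message whose payload happens to contain the terminator.

```javascript
// TERM follows the choice described above; the helper names are hypothetical.
const TERM = '\uFFFD';

// Sender side: serialize one message and append the terminator.
function frame(message) {
  const json = JSON.stringify(message);
  if (json.includes(TERM)) {
    throw new Error('message contains the terminator character');
  }
  return json + TERM;
}

// Receiver side: feed string chunks in, get fully parsed messages out.
function createDeframer() {
  let fragment = '';
  return function push(chunk) {
    const parts = (fragment + chunk).split(TERM);
    fragment = parts.pop(); // incomplete tail, kept for the next chunk
    return parts.filter(Boolean).map((part) => JSON.parse(part));
  };
}

// Two messages on the wire, delivered in two arbitrary chunks.
const wire = frame({ id: 1 }) + frame({ id: 2, text: 'hi' });
const push = createDeframer();
const first = push(wire.slice(0, 12)); // one full message plus part of the next
const rest = push(wire.slice(12));     // remainder of the second message
console.log(first, rest);
```

If newline is used as the terminator instead, the check in frame becomes unnecessary, because JSON.stringify (called without an indent argument) always escapes newlines inside strings.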

The simplest solution is to send the length of the JSON data before each message as a fixed-length prefix (4 bytes?) and have a simple un-framing parser which buffers small chunks or splits bigger ones.
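A minimal sketch of such a length-prefixed protocol, assuming a 4-byte big-endian length followed by the UTF-8 encoded JSON. The byte order and the names encodeFrame and createFrameDecoder are assumptions made for this example, not something the answer specifies.

```javascript
// Encode: 4-byte big-endian length, then the UTF-8 bytes of the JSON text.
function encodeFrame(message) {
  const body = Buffer.from(JSON.stringify(message), 'utf8');
  const header = Buffer.alloc(4);
  header.writeUInt32BE(body.length, 0);
  return Buffer.concat([header, body]);
}

// Decode: accumulate chunks and peel off every frame that is fully present.
function createFrameDecoder() {
  let pending = Buffer.alloc(0);
  return function push(chunk) {
    pending = Buffer.concat([pending, chunk]);
    const messages = [];
    while (pending.length >= 4) {
      const length = pending.readUInt32BE(0);
      if (pending.length < 4 + length) break; // body not complete yet
      messages.push(JSON.parse(pending.toString('utf8', 4, 4 + length)));
      pending = pending.subarray(4 + length);
    }
    return messages;
  };
}

// Two frames delivered as one short chunk and one long chunk.
const all = Buffer.concat([encodeFrame({ n: 1 }), encodeFrame({ s: 'é' })]);
const decode = createFrameDecoder();
const partial = decode(all.subarray(0, 6)); // header plus 2 body bytes: nothing yet
const complete = decode(all.subarray(6));   // rest of frame 1 and all of frame 2
console.log(partial, complete);
```

Because the length counts bytes rather than characters, this framing is unaffected by delimiters or multi-byte characters inside the payload.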

You can try node-binary to avoid writing the parser manually. Look at the scan(key, buffer) documentation example: it does exactly line-by-line reading.

As long as newlines (or whatever delimiter you use) only delimit the JSON messages and are not embedded in them, you can use the following pattern:

let buf = ''
s.on('data', data => {
  buf += data.toString()
  const idx = buf.indexOf('\n')
  if (idx < 0) { return } // No '\n', no full message
  let lines = buf.split('\n')
  buf = lines.pop() // if ends in '\n' then buf will be empty
  for (let line of lines) {
    // Handle the line
  }
})
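One caveat with calling data.toString() on every chunk: a multi-byte UTF-8 character can be split across two chunks and then decodes incorrectly. A possible variant of the pattern above, using Node's built-in string_decoder module to hold back incomplete characters, might look like this (createJsonLineParser is a hypothetical name, and chunks are assumed to arrive as Buffers):

```javascript
const { StringDecoder } = require('string_decoder');

// Hypothetical helper: feed it Buffer chunks, get back parsed JSON documents.
function createJsonLineParser() {
  const decoder = new StringDecoder('utf8'); // keeps incomplete characters back
  let buf = '';
  return function push(chunk) {
    buf += decoder.write(chunk);
    const lines = buf.split('\n');
    buf = lines.pop(); // partial line, or '' if the chunk ended in '\n'
    return lines
      .filter((line) => line.trim() !== '') // ignore blank lines
      .map((line) => JSON.parse(line));
  };
}

// Demo: cut the wire bytes in the middle of the two-byte character 'é'.
const wire = Buffer.from('{"name":"é"}\n{"n":2}\n', 'utf8');
const cut = wire.indexOf(0xc3) + 1; // 0xC3 is the first byte of 'é' in UTF-8
const parse = createJsonLineParser();
const early = parse(wire.subarray(0, cut));
const late = parse(wire.subarray(cut));
console.log(early, late);
```

Calling s.setEncoding('utf8') on the stream before attaching the 'data' listener should have the same effect, since the stream then decodes across chunk boundaries itself and hands strings to the handler.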
