
How can I convert a readable stream to valid JSON using Node.js?

I'm trying to consume an Atom feed of concert data and output it as JSON so that it is easier to work with.

So far I've been using request to fetch the data and feedparser to parse it, and that part seems to be working as I'd like.

// data
var fs = require('fs');
var request = require('request');
var FeedParser = require('feedparser');

var feed = 'http://mix.chimpfeedr.com/630a0-dcshows';
var wstream = fs.createWriteStream('data.json');

var req = request(feed);
var feedparser = new FeedParser({
    addmeta: false
});

req.on('response', function(res) {
    var stream = this;
    if (res.statusCode !== 200) return this.emit('error', new Error('Bad status code'));
    stream.pipe(feedparser);
});

feedparser.on('readable', function() {
    var stream = this;
    var item;

    // ... do some business work to get a `data` object

    wstream.write( JSON.stringify(data) + ',' );
});

This writes a file that's literally a concatenated list of these data objects:

{
    object1
}, {
    object2
}, {
    etc
},

This is cool, but I'd like the output to be wrapped in an array, and I'd like the last item not to have a comma after it. I'm sure there are ways I could hack around this, but I think I'm missing a core concept of the stream approach and what's actually happening.

So my question is: how do I manipulate a readable stream (XML) and output a valid JSON array?

The problem with your approach is that you add the comma after every JSON element you write to the stream. This fails because, at the moment you write an element, you cannot know whether more data will come out of the read stream, so the last element always ends up followed by a comma.
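To see why the trailing comma matters, here is a small sketch that builds the same kind of output in memory and hands it to JSON.parse. The sample `items` array is made up for illustration, and the surrounding brackets are added only so that the comma is the sole remaining problem.

```javascript
// Hypothetical sample data standing in for the parsed feed items.
const items = [{ id: 1 }, { id: 2 }];

// Mimic writing `JSON.stringify(data) + ','` for every item.
const withTrailingComma =
  '[' + items.map((item) => JSON.stringify(item) + ',').join('') + ']';

let parseError = null;
try {
  JSON.parse(withTrailingComma);
} catch (err) {
  // JSON does not allow a comma after the last array element.
  parseError = err;
}

// Putting the comma between elements instead yields parseable JSON.
const valid = '[' + items.map((item) => JSON.stringify(item)).join(',') + ']';
const parsed = JSON.parse(valid);
```

With an in-memory array, `join(',')` solves this directly. A stream never gives you the full list at once, which is why the separator has to be decided element by element.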

A better approach is to add the comma before each JSON element, but only if you have already processed at least one element. To do this, keep a flag, or a counter of the elements processed so far, and use it to decide whether you are processing the first element.

If you are at the first element, write "[" to the stream to open the array, and then write the element. Otherwise you are on the second, third or nth element, in which case you write a comma first and then the element.

Finally, add a listener for the 'end' event on your read stream. That way you are notified when the data has run out, and you can write the closing bracket "]" to the write stream to complete a valid JSON array.

I have created a simplified version of this example, using some local data on my hard disk. I am pretty sure you can adapt it to your case.

var FeedParser = require('feedparser'),
    fs = require('fs'),
    feed = __dirname + '/rss2sample.xml';

var ws = fs.createWriteStream('data.json');
var first = true; // true until the first item has been written
fs.createReadStream(feed)
  .on('error', function (error) {
    console.error(error);
  })
  .pipe(new FeedParser())
  .on('error', function (error) {
    console.error(error);
  })
  .on('readable', function() {
    var stream = this, item;
    while ((item = stream.read())) {
      // Open the array before the first item; separate later items with a comma.
      if (first) {
        ws.write('[');
        first = false;
      } else {
        ws.write(',');
      }
      ws.write(JSON.stringify(item));
    }
  })
  .on('end', function() {
    // Close the array ('[]' if the feed had no items) and finish the file.
    ws.end(first ? '[]' : ']');
  });

This produces a valid JSON file.
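If you would rather keep everything in a pipe chain, the same first-element logic can be packaged as a Transform stream. This is only a sketch: the class name `JsonArrayStream` is made up for illustration, and it assumes the upstream stream emits plain objects (as feedparser does) that JSON.stringify can serialize.

```javascript
const { Transform } = require('stream');

// Hypothetical helper: takes objects in, emits the text of one JSON array.
class JsonArrayStream extends Transform {
  constructor() {
    // Objects on the writable side, ordinary string/buffer chunks on the readable side.
    super({ writableObjectMode: true });
    this.first = true; // true until the first item has been seen
  }

  _transform(item, _encoding, callback) {
    // '[' before the first item, ',' before every later one.
    const prefix = this.first ? '[' : ',';
    this.first = false;
    callback(null, prefix + JSON.stringify(item));
  }

  _flush(callback) {
    // Runs once the upstream has ended: close the array ('[]' if it was empty).
    this.push(this.first ? '[]' : ']');
    callback();
  }
}

// Possible usage, assuming `feedparser` and `fs` from the examples above:
//   feedparser.pipe(new JsonArrayStream()).pipe(fs.createWriteStream('data.json'));
```

The `_flush` hook plays the role of the 'end' listener: it is the one place where the stream knows no more items are coming, so that is where the closing bracket belongs.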
