
How to write a JSON array to a file with Node.js writeStream?

I wrote a small Node.js script that scrapes a website, iterating through its pages to extract structured data.

The data I extract from each page is in the form of an array of objects.

I thought I could use the fs.createWriteStream() method to create a writable stream to which I could write the data incrementally after each page extraction.

Apparently, you can only write a String or a Buffer to the stream, so I'm doing something like this:

output.write(JSON.stringify(operations, null, 2));

But in the end, once I close the stream, the JSON is malformed because, obviously, I just appended each page's array one after the other, resulting in something like this:

[
    { ... },  /* data for page 1 */
    { ... }
][ /* => here is the problem */
    { ... },  /* data for page 2 */
    { ... }
]

How can I actually append the arrays to the output instead of chaining them? Is it even doable?

Your options would be...

  1. Keep the full array in memory and only write to the JSON file at the end, after processing all pages.
  2. Write each object individually, and handle the square brackets and commas manually.

For option 2, something like this...

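// start processing: open the JSON array
output.write('[');
// tracks whether anything has been written yet, so commas only go between objects
var first = true;
// loop through your pages, however you're doing that
// (more_data_to_read and get_operation_object are placeholders for your own code)
while (more_data_to_read()) {
    // create "operation" object
    var operation = get_operation_object();
    if (!first) {
        // write out comma to separate operation objects within array
        output.write(',');
    }
    output.write(JSON.stringify(operation, null, 2));
    first = false;
}
// all done: close the JSON array and end the stream
output.end(']');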

This will produce well-formed JSON.
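Since the question has one array of objects per page rather than one object, here is a rough sketch of how the same idea could be wired up to fs.createWriteStream(). The names createArrayAppender, pages and outputPath are made up for illustration, and the sample data is a stand-in for whatever your scraper returns. The comma is written before every object except the first, so nothing needs to know which page is the last one.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Returns a function that appends one page (an array of objects) to a JSON
// array that has already been opened with '[' on `output`.
// The comma goes before every object except the very first one.
// Note: the return value of write() (backpressure) is ignored for brevity.
function createArrayAppender(output) {
  let first = true;
  return function appendPage(operations) {
    for (const operation of operations) {
      output.write((first ? '' : ',') + '\n' + JSON.stringify(operation, null, 2));
      first = false;
    }
  };
}

// Hypothetical stand-in for the data extracted from each page.
const pages = [
  [{ id: 1 }, { id: 2 }],
  [{ id: 3 }],
];

// Assumed output location, only for this example.
const outputPath = path.join(os.tmpdir(), 'operations.json');
const output = fs.createWriteStream(outputPath);

output.write('[');
const appendPage = createArrayAppender(output);
for (const operations of pages) {
  appendPage(operations);
}
// Close the JSON array and end the stream in one call.
output.end('\n]');
```

In a real scraper you would call appendPage(operations) right after each page is extracted, and call output.end('\n]') once there are no more pages.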

Personally, I would opt for #1 though, as it seems the more 'correct' way to do it. If you're concerned about the array using too much memory, then JSON may not be the best choice for the data file. It's not particularly well suited to extremely large data sets.
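For what it's worth, option #1 could look roughly like the sketch below, assuming all the pages fit comfortably in memory. collectOperations, pages and outputPath are illustrative names, not anything from your script.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Flattens the per-page arrays into one array of operation objects.
function collectOperations(pages) {
  const all = [];
  for (const operations of pages) {
    for (const operation of operations) {
      all.push(operation);
    }
  }
  return all;
}

// Hypothetical stand-in for the data extracted from each page.
const pages = [
  [{ id: 1 }, { id: 2 }],
  [{ id: 3 }],
];

// Assumed output location, only for this example.
const outputPath = path.join(os.tmpdir(), 'operations-all.json');

// Serialize once, at the end, so the file is a single JSON array.
fs.writeFileSync(outputPath, JSON.stringify(collectOperations(pages), null, 2));
```

Because JSON.stringify() runs once over the whole array, there are no brackets or commas to manage by hand.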

Note that with the code sample above, if the process is interrupted partway through, you'll be left with an invalid JSON file, so writing progressively won't actually make the application more fault-tolerant.
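To illustrate the point about interruption, here is a small check you can run yourself. The strings are invented examples of what the file might contain, and isValidJson is just a helper written for this demonstration.

```javascript
// Returns true if `text` parses as JSON, false otherwise.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (err) {
    return false;
  }
}

// What the file holds after a complete run.
const complete = '[{"id":1},{"id":2}]';
// What the file might hold if the process dies after the first object:
// the closing bracket was never written.
const interrupted = '[{"id":1},';

console.log(isValidJson(complete));    // true
console.log(isValidJson(interrupted)); // false
```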
