Un-TAR and un-GZip file stored as JavaScript Buffer

Question

I am developing a server script on Node.js/Express.js that receives uploaded .tar.gz archives containing multiple files. The script has to untar and ungzip the CSV files in each archive, parse them, and store some of the data in a database. There is no need to store the files on the server, just process them. To upload files I am using Multer without specifying where to store them, so each upload is only available in req.files as a Buffer.

My question is: how can I untar and ungzip a Buffer to get the contents of the files? If I do something like:

const { unzipSync } = require('zlib');

const zipped = req.files[0];
// unzipSync is synchronous and returns a Buffer, so no await is needed
const result = unzipSync(zipped.buffer);
const str = result.toString('utf-8');

I do not get the content of the file. Instead I get everything as one string, including the file name and some metadata, which is tricky to parse. Is there a better way?

Answer 1

I managed to untar and unzip the Buffer using the tar-stream and streamifier libraries.

const tar = require('tar-stream');
const streamifier = require('streamifier');
const { unzipSync } = require('zlib');

const untar = ({ buffer }) => new Promise((resolve, reject) => {
  // buffer is the .tar.gz file uploaded to the Express.js server
  // via Multer middleware with MemoryStorage
  const textData = [];
  const extract = tar.extract();
  // tar-stream emits one 'entry' event per archived file, with its header and a stream of its contents:
  extract.on('entry', (header, stream, next) => {
    const chunks = [];
    stream.on('data', (chunk) => {
      chunks.push(chunk);
    });
    stream.on('error', (err) => {
      reject(err);
    });
    stream.on('end', () => {
      // Concatenate the chunks into a string and push it to the array holding the contents of each file in the .tar.gz:
      const text = Buffer.concat(chunks).toString('utf8');
      textData.push(text);
      next();
    });
    stream.resume();
  });
  extract.on('finish', () => {
    // Resolve with the array of file contents:
    resolve(textData);
  });
  // Reject if the tar archive itself cannot be parsed:
  extract.on('error', reject);
  // Gunzip the buffer, wrap it in a Readable stream, and pipe it into tar-stream's extract stream:
  streamifier.createReadStream(unzipSync(buffer)).pipe(extract);
});
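
One caveat for a server: unzipSync decompresses on the main thread, so a large upload blocks the event loop while it runs. A minimal sketch of a non-blocking variant, using only Node's built-in zlib and util modules; gunzipAsync and gunzipUpload are hypothetical names, not part of the code above:

```javascript
const zlib = require('zlib');
const { promisify } = require('util');

// Promise-based wrapper around the callback-style zlib.gunzip
const gunzipAsync = promisify(zlib.gunzip);

// Hypothetical helper: takes a Multer-style file object ({ buffer })
// and resolves with the decompressed tar archive as a Buffer
async function gunzipUpload({ buffer }) {
  return gunzipAsync(buffer);
}
```

The resolved Buffer could then be handed to streamifier.createReadStream in place of unzipSync(buffer).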

Using this approach, I avoided storing any temporary files on the filesystem and processed the contents of all files exclusively in memory.
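
For context on why the plain unzipSync output in the question looked like file names and metadata mixed with data: gunzipping a .tar.gz yields a tar archive, in which every file is preceded by a 512-byte header holding its name, size, and other fields, and the file data is padded to a multiple of 512 bytes. The sketch below walks that layout with Node built-ins only. It is an illustration rather than a replacement for tar-stream: listTarGzEntries and readField are hypothetical names, and it assumes plain ustar headers with short file names (no PAX or GNU long-name extensions).

```javascript
const { gunzipSync } = require('zlib');

// Read a NUL-terminated text field from a 512-byte tar header block
function readField(block, start, length) {
  const raw = block.subarray(start, start + length);
  const end = raw.indexOf(0);
  return raw.toString('utf8', 0, end === -1 ? raw.length : end);
}

// Hypothetical helper: returns [{ name, data }] for the regular files
// found in a gzipped tar archive held in a Buffer
function listTarGzEntries(tarGzBuffer) {
  const tar = gunzipSync(tarGzBuffer);
  const entries = [];
  let offset = 0;
  while (offset + 512 <= tar.length) {
    const header = tar.subarray(offset, offset + 512);
    // An all-zero block marks the end of the archive
    if (header.every((byte) => byte === 0)) break;
    const name = readField(header, 0, 100);
    // The size field is an octal number stored as text
    const size = parseInt(readField(header, 124, 12).trim(), 8) || 0;
    const type = String.fromCharCode(header[156]);
    const dataStart = offset + 512;
    // Type '0' (or NUL in older archives) means a regular file
    if (type === '0' || type === '\0') {
      entries.push({ name, data: tar.subarray(dataStart, dataStart + size) });
    }
    // File data is padded up to the next 512-byte boundary
    offset = dataStart + Math.ceil(size / 512) * 512;
  }
  return entries;
}
```

Each entry's data is a Buffer, so entry.data.toString('utf8') gives the CSV text for that file alone.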
