
How to use a Node.js transform stream as a read stream?

I am attempting to download a file from Amazon S3, modify it, and re-upload it using the AWS SDK in Node.js. I am new to Node, and after some googling I opted to implement this logic using streams. I created a custom transform stream by subclassing stream.Transform and supplying a transform function. My current implementation is:

// Download and modify file.
var outputStream = s3.getObject(getParams)
    .createReadStream()
    .pipe(transformStream);

// Upload modified file by passing outputStream as body to s3.putObject.
// s3.putObjectWrapper is a promise wrapper for the api function putObject.
s3.putObjectWrapper({body: outputStream, ...})
    .then((data) => {
        logger.debug("Put Success: ", {data: data});
    })
    .catch((err) => {
        logger.error("Put Error: ", {error: err});
    });

Which yields the following error output:

error: Put Error: message=Cannot determine length of [object Object], objectMode=false, highWaterMark=16384, head=null, tail=null, length=0, length=0, pipes=null, pipesCount=0, flowing=null, ended=false, endEmitted=false, reading=false, sync=false, needReadable=true, emittedReadable=false, readableListening=false, resumeScheduled=false, defaultEncoding=utf8, ranOut=false, awaitDrain=0, readingMore=false, decoder=null, encoding=null, readable=true, domain=null, end=function

I have read the Node documentation on streams (linked below) but did not find it helpful. I am unsure whether I also have to implement stream.Readable methods in my custom transform stream class, of which transformStream is an instance, to make the stream readable. Note also that s3.putObject accepts a buffer, stream, or string as its body, so if I could implement the same functionality by passing a buffer instead of a stream to putObject, that would also work.

Node.js streams: https://nodejs.org/dist/latest-v10.x/docs/api/stream.html
AWS SDK S3 API: https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#putObject-property

In sum, I am unsure what is wrong with my implementation and whether streams are a viable way to accomplish this task at all.

There is an issue with s3.putObject: it can only determine the length of streams created with fs.createReadStream. You can work around this by setting the length of the stream yourself, but that means you need to know the length beforehand. If you don't know it, which is likely since you're transforming the stream, you will need to pipe it to a file first and then pass a readable stream created with fs.createReadStream. Or, better yet, use s3.upload instead, which accepts any readable stream.

Using s3.upload:

const params = { Bucket: 'bucket', Key: 'Filename', Body: stream };
s3.upload(params, (err, data) => {
  console.log(err, data);
});

Using s3.putObject:

// This will work if you know the length beforehand
// (getStreamLength stands in for however you obtain it)
outputStream.length = getStreamLength();

s3.putObjectWrapper({ body: outputStream })

The following will work, even though it may not be what anyone would expect when working with streams.

const writeStream = fs.createWriteStream('/tmp/testing');

s3.getObject(getParams)
    .createReadStream()
    .pipe(transformStream)
    .pipe(writeStream);

writeStream.on('close', () => {
    const readStream = fs.createReadStream('/tmp/testing');

    s3.putObjectWrapper({
        body: readStream
    })
    .then(data => {
        logger.debug("Put Success: ", { data: data });
    })
    .catch(err => {
        logger.error("Put Error: ", { error: err });
    });
});

Actually, you can use a Transform stream if you specify the ContentLength property on the PutObjectCommand (AWS SDK v3):

import { createReadStream } from 'node:fs'
import { stat } from 'node:fs/promises'
import { Transform } from 'node:stream'
import { S3, PutObjectCommand } from '@aws-sdk/client-s3'

const read = createReadStream(src, {})
const { size } = await stat(src)

// create your transform stream
const transform = new Transform({
  transform(chunk, encoding, callback) {
    try {
      // read / modify the chunk as you need
      this.push(chunk)
      callback()
    } catch (error) {
      callback(error)
    }
  }
})

// get the new stream
const transformed = read.pipe(transform)

// upload transformed stream to s3
const s3 = new S3({ /* ... */ })
s3.send(
  new PutObjectCommand({
    Bucket: config.digitalOcean.releasesBucket,
    Key: key,
    Body: transformed,

    // ContentLength is required when Body is a transform stream;
    // it must match the byte length of the transformed output,
    // so this only works if the transform preserves the size
    // (or you can compute the final size up front)
    ContentLength: size,
  })
)
