
Not getting results when attempting to read the content of an S3 bucket from an AWS Lambda function in Node.js

After a cross-account migration of DynamoDB tables from a different AWS account to our own, I have a requirement to use a Node.js Lambda to read and process text files that contain JSON. The source AWS Data Pipeline, which ran the import job by creating an EMR cluster, dropped 5 MB files into an S3 bucket in the source account (not our account) with object keys in the format dynamodbtablename/manifest and dynamodbtablename/2c561e6c-62ba-4eab-bf21-7f685c7c3129. The manifest file contains the following sample data:

{"name":"DynamoDB-export","version":3,
"entries": [
{"url":"s3://bucket/dynamodbtablename/2c561e6c-62ba-4eab-bf21-7f685c7c3129","mandatory":true}
]}
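
For reference, given a manifest of this shape, the data-file locations can be pulled out of its entries like so (a minimal sketch; manifestBody here is a hypothetical string holding the file contents, not something my Lambda produces yet):

// Minimal sketch: extract bucket/key pairs from the manifest entries.
// manifestBody is a hypothetical string containing the manifest JSON above.
const manifest = JSON.parse(manifestBody);
const dataFiles = manifest.entries.map(({ url }) => {
  // url looks like 's3://bucket/dynamodbtablename/2c561e6c-...'
  const [, , bucket, ...keyParts] = url.split('/');
  return { bucket, key: keyParts.join('/') };
});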

I have been battling with reading the manifest file for most of today. I initially had to deal with setting up the cross-account policies and permissions on the resources in Terraform, but I am no longer getting access issues in the Lambda. My problem now is that the code that calls s3.getObject doesn't seem to get hit.

/* eslint-disable no-console, no-param-reassign */

const AWS = require('aws-sdk');

const massiveTables = [
  'dynamodbtablename'
];

function getS3Objects(params) {
  const s3 = new AWS.S3({
    apiVersion: '2006-03-01' // S3's API version; '2012-10-29' is AWS Data Pipeline's
  });
  return new Promise((resolve, reject) => {
    s3.getObject(params, (err, data) => {
      if (err) {
        reject(err);
      } else {
        resolve(data);
      }
    });
  });
}

const handler = async ({ Records }) => {
  const completelyProcessedSNSPromises = Records.map(async ({ Sns: { Message: tableName } }) => {
    console.log(`tableName: ${tableName}`);
    let massiveTableItem = tableName.trim();
    console.log(`massiveTableItem: ${massiveTableItem}`);
    //#1: Validate that the right table names are coming through
    if (massiveTables.includes(massiveTableItem)) {
      //#2: Use the table name to fetch the right keys from the S3 bucket

      let params = {
        Bucket: process.env.DATA_BUCKET,
        Key: `${massiveTableItem}/manifest`,
        ResponseContentType: 'application/json'
      };

      getS3Objects(params)
        .then(result => {
          console.log(`result: ${result}`);
        })
        .catch(error => {
          console.log(`error: ${error}`);
        });
    }
  });

  await Promise.all(completelyProcessedSNSPromises)
    .then(console.log)
    .catch(console.error);
};

module.exports.handler = handler;

This is what I am getting in the CloudWatch logs:


2020-03-11T16:13:25.271Z    8bd74c44-c9b1-4cd9-a360-251ad4253eae    INFO    tableName: dynamodbtablename
2020-03-11T16:13:25.271Z    8bd74c44-c9b1-4cd9-a360-251ad4253eae    INFO    massiveTableItem: dynamodbtablename
2020-03-11T16:13:25.338Z    8bd74c44-c9b1-4cd9-a360-251ad4253eae    INFO    [ undefined ]

Please help me understand what I am doing wrong.

Thank you very much in advance. PS: I'm new to Node.js/JavaScript.

You need to return the getS3Objects call in your handler function. Because the async map callback doesn't return it, each promise the map produces resolves to undefined right away, without waiting for the S3 request; Promise.all then resolves, the handler returns, and Lambda freezes the execution environment before the getObject callback can run. That is also why your log shows [ undefined ].

Also, the aws-sdk has built-in support for promises via .promise(), so you don't need to wrap the call in a new Promise yourself. Something like this:

/* eslint-disable no-console, no-param-reassign */

const AWS = require("aws-sdk");

const massiveTables = [
  "dynamodbtablename"
];

function getS3Objects(params) {
  const s3 = new AWS.S3({
    "apiVersion": "2006-03-01" // S3's API version
  });
  return s3.getObject(params).promise();
}

// eslint-disable-next-line func-style
const handler = async ({Records}) => {
  const completelyProcessedSNSPromises = Records.map(async ({"Sns": {"Message": tableName}}) => {
    console.log(`tableName: ${tableName}`);
    const massiveTableItem = tableName.trim();
    console.log(`massiveTableItem: ${massiveTableItem}`);
  // #1: Validate that the right table names are coming through
    if (massiveTables.includes(massiveTableItem)) {
      // #2: Use the table name to fetch the right keys from the S3 bucket

      const params = {
        "Bucket": process.env.DATA_BUCKET,
        "Key": `${massiveTableItem}/manifest`,
        "ResponseContentType": "application/json"
      };

      return getS3Objects(params);
    }
  });

  await Promise.all(completelyProcessedSNSPromises)
    .then(console.log)
    .catch(console.error);
};

module.exports.handler = handler;
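
As a side note, getObject resolves with a response object whose Body is a Buffer in Node.js, so once this works the manifest can be parsed from it. A minimal sketch (not part of the original answer):

// Sketch: read and parse the manifest once getS3Objects resolves
// (this would run inside the async handler).
const data = await getS3Objects(params);
const manifest = JSON.parse(data.Body.toString('utf-8'));
console.log(manifest.entries); // e.g. [{ url: 's3://bucket/...', mandatory: true }]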

Thank you all for your help.

I discovered the issue to be that the async callbacks I was invoking within the async Lambda handler were never running to completion when passed to Array's map and forEach functions, because nothing awaited the promises they produced.

I resorted to using a traditional for loop.

// Note: daznTables appears to be another lookup list, defined elsewhere,
// parallel to massiveTables.
for (let i = 0; i < Records.length; i++) {
    const tableName = Records[i].Sns.Message;
    console.log(`DAZN tableName: ${tableName}`);
    const tableIndex = daznTables.findIndex(t => tableName.includes(t));
    const massiveTableItem = massiveTables[tableIndex];
    console.log(`massiveTableItem: ${massiveTableItem}`);
    const dataBucket = process.env.DATA_BUCKET;
}
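
For completeness, a sequential version that actually performs the S3 read inside the loop might look like the following sketch (reusing getS3Objects and massiveTables from above; error handling elided):

// Sketch only: fetch each table's manifest one at a time.
const handler = async ({ Records }) => {
  for (let i = 0; i < Records.length; i += 1) {
    const tableName = Records[i].Sns.Message.trim();
    if (!massiveTables.includes(tableName)) continue;

    // await works here because the plain for loop runs directly in the
    // async handler's body, unlike the map/forEach callbacks.
    const data = await getS3Objects({
      Bucket: process.env.DATA_BUCKET,
      Key: `${tableName}/manifest`,
      ResponseContentType: 'application/json'
    });
    console.log(`manifest for ${tableName}: ${data.Body.toString()}`);
  }
};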

As I didn't really need to return anything from the .map function, I got rid of

  await Promise.all(completelyProcessedSNSPromises)
    .then(console.log)
    .catch(console.error);
