Not getting results when attempting to read the content of an S3 bucket from an AWS Lambda function in Node.js
After a cross-account migration of DynamoDB tables from a different account to our own AWS account, I need to use a Node.js Lambda to read and process text files that contain JSON. The source AWS Data Pipeline, which ran the import job by creating an EMR cluster, dropped 5 MB files in an S3 bucket in the source account (not our account) with object keys in the format dynamodbtablename/manifest and dynamodbtablename/2c561e6c-62ba-4eab-bf21-7f685c7c3129. The manifest file contains the following sample data:
{"name":"DynamoDB-export","version":3,
"entries": [
{"url":"s3://bucket/dynamodbtablename/2c561e6c-62ba-4eab-bf21-7f685c7c3129","mandatory":true}
]}
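As a rough sketch of how a manifest like the one above could be consumed once it has been fetched: in aws-sdk v2 under Node.js, getObject returns the object content as a Buffer in data.Body, so it needs to be decoded and parsed before the entries can be used. The helper name parseManifest and the shape of its return value are assumptions made for illustration; they are not part of the original code.

```javascript
// Hypothetical helper (not from the original post): converts a manifest body
// into a list of { bucket, key, mandatory } objects.
// `body` may be a Buffer (what aws-sdk v2 puts in data.Body) or a string.
function parseManifest(body) {
  const manifest = JSON.parse(body.toString('utf-8'));

  return manifest.entries.map(({ url, mandatory }) => {
    // Split "s3://bucket/some/key" into its bucket and key parts.
    const match = /^s3:\/\/([^/]+)\/(.+)$/.exec(url);
    if (!match) {
      throw new Error(`Unexpected manifest URL: ${url}`);
    }
    return { bucket: match[1], key: match[2], mandatory: Boolean(mandatory) };
  });
}
```

Each returned entry could then be used as the Bucket and Key of a follow-up getObject call for the data file itself.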
I have been battling with reading the manifest file for most of today. I am not getting access issues in the Lambda, although I initially had to deal with setting up the cross-account policies and permissions on the resources in Terraform. My problem now is that the code that calls s3.getObject doesn't seem to get hit.
/* eslint-disable no-console, no-param-reassign */
const AWS = require('aws-sdk');

const massiveTables = [
  'dynamodbtablename'
];

function getS3Objects(params) {
  let s3 = new AWS.S3({
    apiVersion: '2012-10-29'
  });
  return new Promise((resolve, reject) => {
    s3.getObject(params, (err, data) => {
      if (err) {
        reject(err);
      } else {
        resolve(data);
      }
    });
  });
}

const handler = async ({ Records }) => {
  const completelyProcessedSNSPromises = Records.map(async ({ Sns: { Message: tableName } }) => {
    console.log(`tableName: ${tableName}`);
    let massiveTableItem = tableName.trim();
    console.log(`massiveTableItem: ${massiveTableItem}`);
    // #1: Validate that the right table names are coming through
    if (massiveTables.includes(massiveTableItem)) {
      // #2: Use the table name to fetch the right keys from the S3 bucket
      let params = {
        Bucket: process.env.DATA_BUCKET,
        Key: `${massiveTableItem}/manifest`,
        ResponseContentType: 'application/json'
      };
      getS3Objects(params)
        .then(result => {
          console.log(`result: ${result}`);
        })
        .catch(error => {
          console.log(`error: ${error}`);
        });
    }
  });
  await Promise.all(completelyProcessedSNSPromises)
    .then(console.log)
    .catch(console.error);
};

module.exports.handler = handler;
This is what I am getting in the CloudWatch logs:
2020-03-11T16:13:25.271Z 8bd74c44-c9b1-4cd9-a360-251ad4253eae INFO tableName: dynamodbtablename
2020-03-11T16:13:25.271Z 8bd74c44-c9b1-4cd9-a360-251ad4253eae INFO massiveTableItem: dynamodbtablename
2020-03-11T16:13:25.338Z 8bd74c44-c9b1-4cd9-a360-251ad4253eae INFO [ undefined ]
Please help me understand what I am doing wrong.
Thank you very much in advance. PS: I'm new to Node.js/JavaScript.
You need to return the getS3Objects call in your handler function. Because the map callback never returns or awaits that promise, each promise that map produces resolves immediately with undefined, so Promise.all completes and the handler finishes before the S3 request has had a chance to call back.
Also, aws-sdk has built-in support for promises, so you don't need to wrap the call in a Promise yourself. Something like this:
/* eslint-disable no-console, no-param-reassign */
const AWS = require("aws-sdk");

const massiveTables = [
  "dynamodbtablename"
];

function getS3Objects(params) {
  const s3 = new AWS.S3({
    // S3's API version is 2006-03-01 (the question used 2012-10-29)
    "apiVersion": "2006-03-01"
  });
  // .promise() turns the request into a promise, so no manual wrapper is needed
  return s3.getObject(params).promise();
}

// eslint-disable-next-line func-style
const handler = async ({Records}) => {
  const completelyProcessedSNSPromises = Records.map(async ({"Sns": {"Message": tableName}}) => {
    console.log(`tableName: ${tableName}`);
    const massiveTableItem = tableName.trim();
    console.log(`massiveTableItem: ${massiveTableItem}`);
    // #1: Validate that the right table names are coming through
    if (massiveTables.includes(massiveTableItem)) {
      // #2: Use the table name to fetch the right keys from the S3 bucket
      const params = {
        "Bucket": process.env.DATA_BUCKET,
        "Key": `${massiveTableItem}/manifest`,
        "ResponseContentType": "application/json"
      };
      // Returning the promise makes Promise.all wait for the S3 response
      return getS3Objects(params);
    }
  });
  await Promise.all(completelyProcessedSNSPromises)
    .then(console.log)
    .catch(console.error);
};

module.exports.handler = handler;
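To make the effect of the missing return easier to see without touching AWS, here is a small self-contained sketch. fakeGetObject is a hypothetical stand-in for s3.getObject(params).promise(), and withoutReturn and withReturn are illustrative names that mirror the question's code and the corrected code respectively; none of them come from the original posts.

```javascript
// Hypothetical stand-in for s3.getObject(params).promise():
// resolves after a short delay, no AWS calls are made.
function fakeGetObject(key) {
  return new Promise(resolve => {
    setTimeout(() => resolve(`body of ${key}`), 10);
  });
}

// Mirrors the question: the promise is started but neither returned nor awaited,
// so each async callback resolves to undefined straight away.
function withoutReturn(keys) {
  return Promise.all(keys.map(async key => {
    fakeGetObject(key).then(() => {});
  }));
}

// Mirrors the fix: the promise is returned, so Promise.all waits for it.
function withReturn(keys) {
  return Promise.all(keys.map(async key => fakeGetObject(key)));
}

async function demo() {
  console.log(await withoutReturn(['dynamodbtablename/manifest'])); // [ undefined ]
  console.log(await withReturn(['dynamodbtablename/manifest']));    // [ 'body of dynamodbtablename/manifest' ]
}

demo();
```

The first log line matches the [ undefined ] seen in the CloudWatch output: Promise.all had nothing to wait for, so the handler returned before the S3 callback could run.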
Thank you all for your help.
I discovered the issue to be that the async functions I was calling within the async Lambda handler were not able to execute, due to scope issues regarding their inability to access the enclosing async Lambda handler's scope. This happened while using the Array map and forEach functions.
I resorted to using a traditional for loop:
for (let i = 0; i < Records.length; i++) {
  const tableName = Records[i].Sns.Message;
  console.log(`DAZN tableName: ${tableName}`);
  // daznTables is not defined in this snippet; presumably a list of table names
  // that lines up index-for-index with massiveTables
  const tableIndex = daznTables.findIndex(t => tableName.includes(t));
  const massiveTableItem = massiveTables[tableIndex];
  console.log(`massiveTableItem: ${massiveTableItem}`);
  const dataBucket = process.env.DATA_BUCKET;
}
As I didn't really need to return anything from the .map function, I got rid of:
await Promise.all(completelyProcessedSNSPromises)
.then(console.log)
.catch(console.error);
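For reference, a minimal sketch of what a loop-based handler body might look like when each read is awaited in turn. One likely reason a plain loop behaves differently from map or forEach is that an await inside the loop suspends the handler itself until the promise settles, whereas callbacks passed to map or forEach run as separate async functions that the handler does not wait for unless their promises are collected and awaited. The names fetchManifest and processRecords are hypothetical, and the fetch is a stand-in rather than a real S3 call.

```javascript
// Hypothetical stand-in for an S3 read; resolves with the key it would fetch.
function fetchManifest(tableName) {
  return new Promise(resolve => {
    setTimeout(() => resolve(`${tableName}/manifest`), 5);
  });
}

// Processes SNS records one at a time; the loop does not advance
// until the current fetch has settled.
async function processRecords(records, fetch = fetchManifest) {
  const results = [];
  for (const { Sns: { Message } } of records) {
    const tableName = Message.trim();
    results.push(await fetch(tableName));
  }
  return results;
}
```

This runs the reads sequentially; with many records, returning the promises from map and awaiting Promise.all would let them run concurrently instead.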