Read data from .xlsx file on S3 using Nodejs Lambda
I'm still new to NodeJs and AWS, so forgive me if this is a noob question.
I am trying to read the data from an Excel file (.xlsx). The Lambda function receives the extension of the file type.
Here is my code:
exports.handler = async (event, context, callback) => {
    console.log('Received event:', JSON.stringify(event, null, 2));
    if (event.fileExt === undefined) {
        callback("400 Invalid Input");
    }
    let returnData = "";
    const S3 = require('aws-sdk/clients/s3');
    const s3 = new S3();
    switch(event.fileExt)
    {
        case "plain":
        case "txt":
            // Extract text
            const params = {Bucket: 'filestation', Key: 'MyTXT.'+event.fileExt};
            try {
                await s3.getObject(params, function(err, data) {
                    if (err) console.log(err, err.stack); // an error occurred
                    else{ // successful response
                        returnData = data.Body.toString('utf-8');
                        context.done(null, returnData);
                    }
                }).promise();
            } catch (error) {
                console.log(error);
                return;
            }
            break;
        case "xls":
        case "xlsx":
            returnData = "Excel";
            // Extract text
            const params2 = {Bucket: 'filestation', Key: 'MyExcel.'+event.fileExt};
            const readXlsxFile = require("read-excel-file/node");
            try {
                const doc = await s3.getObject(params2);
                const parsedDoc = await readXlsxFile(doc);
                console.log(parsedDoc)
            } catch (err) {
                console.log(err);
                const message = `Error getting object.`;
                console.log(message);
                throw new Error(message);
            }
            break;
        case "docx":
            returnData = "Word doc";
            // Extract text
            break;
        default:
            callback("400 Invalid Operator");
            break;
    }
    callback(null, returnData);
};
The text file part works, but the xlsx part makes the function time out. I did install the read-excel-file dependency and uploaded the zip so that I have access to it. But the function times out with this message: "errorMessage": "2020-11-02T13:06:50.948Z 120bfb48-f29c-4e3f-9507-fc88125515fd Task timed out after 3.01 seconds"
Any help would be appreciated! Thanks for your time.
Here's how we did it using the xlsx npm library, assuming the file is under the root project path.
const xlsx = require('xlsx');
// read your excel file
let readFile = xlsx.readFile('file_example_XLSX_5000.xlsx')
// get first-sheet's name
let sheetName = readFile.SheetNames[0];
// convert the first sheet to JSON; works best if the sheet has a header row
console.log(xlsx.utils.sheet_to_json(readFile.Sheets[sheetName]));
You need to install the xlsx (SheetJS) library into the project: npm install xlsx
Then import the "read" function into the Lambda, get the S3 object's body, and pass it to xlsx like this:
// Import from the package installed above (npm install xlsx)
const { read } = require('xlsx');
const aws = require('aws-sdk');
const s3 = new aws.S3({ apiVersion: '2006-03-01' });
exports.handler = async (event) => {
    const bucketName = 'excel-files';
    const fileKey = 'Demo Data.xlsx';
    // Simple GetObject; .promise() resolves to the response object
    let file = await s3.getObject({Bucket: bucketName, Key: fileKey}).promise();
    // file.Body is a Buffer, so tell xlsx to parse it as one
    const wb = read(file.Body, { type: 'buffer' });
    const response = {
        statusCode: 200,
        body: JSON.stringify({
            read: wb.Sheets,
        }),
    };
    return response;
};
(Of course, you can receive the bucket and file key from parameters if you send them...)
Very important: use the read (not the readFile) function and pass the Body property (with a capital "B") as a parameter.
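As a sketch of the "receive the bucket and file key from parameters" idea, the handler could be built from injected dependencies and read its inputs from the event. The names `makeHandler`, `event.bucket`, and `event.key` are hypothetical and not part of the answer above; the S3 client is assumed to be an AWS SDK v2 client and `xlsx` the SheetJS module.

```javascript
// Sketch: build a Lambda handler from injected dependencies.
// `s3` is assumed to be an AWS SDK v2 S3 client; `xlsx` is the SheetJS module.
function makeHandler({ s3, xlsx }) {
  return async (event = {}) => {
    // Hypothetical event fields; fall back to the values used in the answer.
    const Bucket = event.bucket || 'excel-files';
    const Key = event.key || 'Demo Data.xlsx';

    // Fetch the object and parse its Body (a Buffer) as a workbook.
    const file = await s3.getObject({ Bucket, Key }).promise();
    const workbook = xlsx.read(file.Body, { type: 'buffer' });

    // Convert only the first sheet to an array of row objects.
    const firstSheet = workbook.Sheets[workbook.SheetNames[0]];
    const rows = xlsx.utils.sheet_to_json(firstSheet);

    return { statusCode: 200, body: JSON.stringify({ rows }) };
  };
}

// Wiring in a real Lambda (requires the aws-sdk and xlsx packages):
// const AWS = require('aws-sdk');
// exports.handler = makeHandler({ s3: new AWS.S3(), xlsx: require('xlsx') });
```

Passing the client and parser in as arguments keeps the handler easy to exercise locally with stubs, without touching S3.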
I changed the timeout to 20 seconds and it works. Only one issue remains: const parsedDoc = await readXlsxFile(doc); wants to receive a string (file path) and not a file.
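One likely contributor here: in the AWS SDK for JavaScript v2, `s3.getObject(params)` returns a request object rather than the file, so awaiting it directly does not yield the data. A minimal sketch of resolving it to the body follows; `fetchObjectBody` is a hypothetical helper name, and the S3 client is passed in so the function can be tried with a stub.

```javascript
// Hypothetical helper: resolve an S3 GetObject call (AWS SDK v2 style) to its body.
// `s3` is any object exposing getObject(params).promise(), e.g. new AWS.S3().
async function fetchObjectBody(s3, params) {
  // .promise() turns the pending request into a Promise of the response.
  const response = await s3.getObject(params).promise();
  // In Node.js the v2 SDK returns the object contents as a Buffer in `Body`.
  return response.Body;
}

// Usage inside a handler (bucket and key taken from the question):
// const body = await fetchObjectBody(s3, { Bucket: 'filestation', Key: 'MyExcel.xlsx' });
```

The resulting Buffer can then be handed to a parser that accepts in-memory data instead of a file path.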
Solved by using the xlsx NPM library: I read the object as a stream and passed the resulting buffers to the library.
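The poster's final code is not shown, so the following is only a guess at the approach: collect a readable stream into a single Buffer, which could then be passed to the xlsx `read` function with `{ type: 'buffer' }`. `streamToBuffer` is a hypothetical helper name, and a stand-in stream is used in place of S3.

```javascript
const { Readable } = require('stream');

// Collect every chunk of a readable stream into a single Buffer.
async function streamToBuffer(stream) {
  const chunks = [];
  for await (const chunk of stream) {
    // Normalize string chunks to Buffers before concatenating.
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}

// Stand-in stream for demonstration; with AWS SDK v2 the source would be
// s3.getObject(params).createReadStream().
const demo = Readable.from([Buffer.from('PK'), Buffer.from('data')]);
streamToBuffer(demo).then((buf) => console.log(buf.length)); // prints 6
```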