
Read data from .xlsx file on S3 using Nodejs Lambda

I'm still new to Node.js and AWS, so forgive me if this is a noob question.

I am trying to read the data from an Excel file (.xlsx). The Lambda function receives the file extension in the event (as fileExt).

Here is my code:

exports.handler = async (event, context, callback) => {
    console.log('Received event:', JSON.stringify(event, null, 2));
    if (event.fileExt === undefined) {
        callback("400 Invalid Input");
    }
    
    let returnData = "";
    const S3 = require('aws-sdk/clients/s3');
    const s3 = new S3();
    

    switch(event.fileExt)
    {
        case "plain":
        case "txt":
            // Extract text
            const params = {Bucket: 'filestation', Key: 'MyTXT.'+event.fileExt};
            
            try {
                await s3.getObject(params, function(err, data) {
                  if (err) console.log(err, err.stack); // an error occurred
                  else{           // successful response
                      returnData = data.Body.toString('utf-8');
                      context.done(null, returnData);
                  }
                }).promise();
        
            } catch (error) {
                console.log(error);
                return;
            }  
           
            break;
        case "xls":
        case "xlsx":
            returnData = "Excel";
            // Extract text
            
            const params2 = {Bucket: 'filestation', Key: 'MyExcel.'+event.fileExt};
            const readXlsxFile = require("read-excel-file/node"); 
            
            try {     
                const doc = await s3.getObject(params2);     
                const parsedDoc = await readXlsxFile(doc);     
                console.log(parsedDoc)   
            } catch (err) {     
                console.log(err);     
                const message = `Error getting object.`;     
                console.log(message);     
                throw new Error(message);   
            } 
            
            break;
        case "docx":
            returnData = "Word doc";
            // Extract text
            break;
        default:
            callback("400 Invalid Operator");
            break;
    }
    callback(null, returnData);
};

The text file part works, but the xlsx part makes the function time out. I installed the read-excel-file dependency and uploaded the zip so that the function has access to it. The function times out with this message: "errorMessage": "2020-11-02T13:06:50.948Z 120bfb48-f29c-4e3f-9507-fc88125515fd Task timed out after 3.01 seconds"

Any help would be appreciated! Thanks for your time.
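One detail in the Excel branch that may be worth checking: getObject is awaited without .promise(). As far as I know, in aws-sdk v2 getObject returns a request object when no callback is passed, and awaiting that object does not yield the response. The sketch below uses a stand-in client (fakeS3, a hypothetical stub, not the real SDK) so that it runs without AWS access; getObjectBody and demo are illustrative names.

```javascript
// Stand-in for the aws-sdk v2 client, so the sketch runs without AWS access.
// Like the real client, getObject() returns a request object, not the data.
const fakeS3 = {
  getObject(params) {
    return {
      promise: async () => ({ Body: Buffer.from(`contents of ${params.Key}`) }),
    };
  },
};

// Fetch an object and return its Body; .promise() is what yields the response.
async function getObjectBody(s3, params) {
  const response = await s3.getObject(params).promise();
  return response.Body;
}

// Awaiting the request object itself resolves to that same object,
// so there is no Body property on the result.
async function demo() {
  const params = { Bucket: 'filestation', Key: 'MyExcel.xlsx' };
  const withoutPromise = await fakeS3.getObject(params);
  const body = await getObjectBody(fakeS3, params);
  return {
    hasBodyWithoutPromise: 'Body' in withoutPromise,
    text: body.toString('utf-8'),
  };
}
```

With the real client, the same getObjectBody(s3, params2) call would hand back a Buffer that can be passed to a parser.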

We did this using the xlsx npm library. Here's how.

This assumes the file is under the project root path.

const xlsx = require('xlsx');

// read your excel file
let readFile = xlsx.readFile('file_example_XLSX_5000.xlsx')

// get first-sheet's name
let sheetName = readFile.SheetNames[0];

// convert the first sheet to JSON; works best if the sheet has a header row
console.log(xlsx.utils.sheet_to_json(readFile.Sheets[sheetName]));

You need to install the xlsx (SheetJS) library into the project: npm install xlsx

Then import the read function into the Lambda, get the S3 object's body, and pass it to xlsx like this:

// require the package installed above (npm install xlsx)
const { read } = require('xlsx');
const aws = require('aws-sdk');
const s3 = new aws.S3({ apiVersion: '2006-03-01' });

exports.handler = async (event) => {
    const bucketName = 'excel-files';
    const fileKey = 'Demo Data.xlsx';

    // Simple GetObject
    let file = await s3.getObject({Bucket: bucketName, Key: fileKey}).promise();
    const wb = read(file.Body);

    const response = {
        statusCode: 200,
        body: JSON.stringify({
            read: wb.Sheets,
        }),
    };
    return response;
};

(Of course, you can take the bucket and file key from the event parameters if you send them.)

Very important: use the read function (not readFile) and pass the Body property (with a capital "B") as the parameter.
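As a rough sketch of taking the bucket and key from the event: the helper below accepts either a direct invocation payload or an S3 trigger notification. The field names bucket and key for direct invocations, and the function name resolveS3Location, are assumptions made for illustration; the Records[0].s3 shape is the standard S3 notification layout.

```javascript
// Work out which object to read from the incoming event.
// `bucket` and `key` are hypothetical field names for direct invocations.
function resolveS3Location(event, defaults = {}) {
  const record = event && event.Records && event.Records[0];
  if (record && record.s3) {
    return {
      Bucket: record.s3.bucket.name,
      // Keys in S3 notifications are URL-encoded, with spaces sent as "+".
      Key: decodeURIComponent(record.s3.object.key.replace(/\+/g, ' ')),
    };
  }
  const Bucket = (event && event.bucket) || defaults.Bucket;
  const Key = (event && event.key) || defaults.Key;
  if (!Bucket || !Key) {
    throw new Error('400 Invalid Input: bucket and key are required');
  }
  return { Bucket, Key };
}
```

The returned object can be passed straight to s3.getObject(...), with the hard-coded names kept as the defaults argument if desired.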

I changed the timeout to 20 seconds and it works. Only one issue remains: const parsedDoc = await readXlsxFile(doc); wants to receive a string (a file path) and not a file.

Solved by using the xlsx npm library, reading the object as a stream and passing it the resulting buffers.
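If a parser really does accept only a file path, one possible workaround is to write the downloaded body to the function's temporary directory first and pass that path along. This is a minimal sketch, not something from the original post; saveBodyToTmp is a hypothetical helper name, and it assumes the body is already a Buffer.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write an in-memory S3 body to the temp directory and return the file path,
// for parsers that only accept a path. On Lambda, os.tmpdir() is typically /tmp.
async function saveBodyToTmp(body, fileName) {
  // path.basename drops any directory part of the S3 key.
  const filePath = path.join(os.tmpdir(), path.basename(fileName));
  await fs.promises.writeFile(filePath, body);
  return filePath;
}
```

Usage would look roughly like const filePath = await saveBodyToTmp(file.Body, 'MyExcel.xlsx'); followed by await readXlsxFile(filePath);.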
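The post does not show the stream-based code, so the following is only a guess at the general shape: collect the stream's chunks into one Buffer, then hand that Buffer to the parser. streamToBuffer is a hypothetical helper name.

```javascript
// Collect every chunk from a readable stream into a single Buffer.
async function streamToBuffer(stream) {
  const chunks = [];
  for await (const chunk of stream) {
    // Normalise string chunks so Buffer.concat always receives Buffers.
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}
```

With aws-sdk v2 this could be combined as const buffer = await streamToBuffer(s3.getObject(params).createReadStream()); and then const wb = xlsx.read(buffer, { type: 'buffer' });.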
