
Import pyspark in AWS Lambda function

I created an ETL job in AWS Glue that writes an ORC file with only one row (it indicates whether two other files have the same row count).

Now, in my pipeline, I created an AWS Lambda function that tries to read that ORC file and check whether the row counts of both tables are equal (the ORC file, stored in S3, has a value column that is 1 if the counts differ and 0 if they don't).

In my first attempt I tried to use pandas, but Lambda gave me the error:

Unable to import module 'lambda_function': No module named

Now I'm trying to import the pyspark context (a SparkSession) and use df = spark.read.orc(), but it gives me the same error:

Unable to import module 'lambda_function': No module named 'pyspark'

What do you think? How could I instantiate a SparkSession in my Lambda function, or read the ORC file in some other way?

Thank you very much!

Unable to import module 'lambda_function': No module named 'pyspark'

This means Lambda cannot find the dependency. Upload a dependency zip file as a Layer for your Lambda.

How to create a Layer?
https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html
Or you can use a Layer built by other contributors:
https://github.com/keithrozario/Klayers
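
Note that pyspark itself is usually far too large to fit in a Lambda layer, so for a one-row check like this a lighter reader is a better fit. Below is a minimal sketch, assuming pyarrow is made available through a layer (either one you build yourself or a prebuilt one, if available) and using hypothetical bucket and key names; boto3 ships with the Lambda Python runtime.

import io

import boto3
import pyarrow.orc as orc  # assumes pyarrow is provided by a Lambda layer

s3 = boto3.client("s3")  # boto3 is bundled with the Lambda Python runtime


def lambda_handler(event, context):
    # Hypothetical bucket and key; replace with the location your Glue job writes to
    obj = s3.get_object(Bucket="my-etl-bucket", Key="checks/row_count_check.orc")
    buffer = io.BytesIO(obj["Body"].read())

    # Read the single-row ORC file into an Arrow table
    table = orc.ORCFile(buffer).read()

    # Per the question, 'value' is 1 when the row counts differ and 0 when they match
    value = table.column("value")[0].as_py()
    return {"counts_match": value == 0}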
