
How to create any AWS Lambda Python Layer? (Usage example with XGBoost)

I am having trouble creating a Lambda layer for the xgboost library. I'm running:

I'm grabbing a zip of xgboost and its dependencies from here ( https://github.com/alexeybutyrev/aws_lambda_xgboost ) and loading it into a layer. When I try to test my Lambda, I get this error:

Unable to import module 'lambda_function': No module named 'xgboost.core'

It looks like __init__.py is trying to reference core.py via from .core import <stuff>

Has anyone encountered this error with AWS Lambda before?

EDIT: As @Marcin has remarked, the first answer provided works for packages under 262 MB.

A. Python packages within the Lambda Layer size limit

You can also do it with the AWS SAM CLI and Docker (see this link to install the SAM CLI), building the packages inside a container. Basically, you initialize a default template with Python as the runtime and then list the packages in the requirements.txt file. I found it easier than the article you mentioned. I'll leave the steps here in case you want to use them in the future.

1. Initialize a default SAM template

In whichever folder you want to keep the project, type

sam init

This prompts a series of questions. For a quick setup, choose the Quick Start Templates as follows

1 - AWS Quick Start Templates

2 - Python 3.8

Project name [sam-app]: your_project_name

1 - Hello World Example

Choosing the Hello World Example generates a default Lambda function with a requirements.txt file. Now we're going to edit that file to add the name of the package you want, in this case xgboost.

2. Specify packages to install

cd your_project_name
code hello_world/requirements.txt

Since I use Visual Studio Code as my editor, this opens the file in it. Now I can specify the xgboost package

your_python_package

This is why Docker needs to be installed. Some packages rely on compiled C++ code, so it is recommended to build inside a container (this is the case on Windows). Now move to the folder where the template.yaml file is located and type

sam build -u

3. Zip packages

Some files should not be included in your Lambda layer, because we only want to keep the Python libraries. You can remove the following files

rm .aws-sam/build/HelloWorldFunction/app.py
rm .aws-sam/build/HelloWorldFunction/__init__.py
rm .aws-sam/build/HelloWorldFunction/requirements.txt

and then zip the remaining contents of the folder.

cp -r .aws-sam/build/HelloWorldFunction/ python/
zip -r my_layer.zip python/

Here we place the layer contents in the python/ folder, as the docs require. On Windows, replace the zip command with Compress-Archive python/ my_layer.zip.
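If you would rather not depend on zip or Compress-Archive, the removal and zipping steps can be sketched with Python's standard zipfile module. This is only a sketch under my own assumptions: build_layer_zip and its parameters are names I made up, and the default exclusions simply mirror the three files removed above.

```python
import os
import zipfile


def build_layer_zip(build_dir, zip_path, prefix="python",
                    exclude=("app.py", "__init__.py", "requirements.txt")):
    """Zip the contents of build_dir under `prefix/`, skipping top-level function files.

    Returns the number of files written to the archive.
    """
    count = 0
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(build_dir):
            at_top_level = os.path.abspath(root) == os.path.abspath(build_dir)
            for name in files:
                # Only the function's own files at the top level are skipped;
                # a package's __init__.py deeper in the tree must be kept.
                if at_top_level and name in exclude:
                    continue
                full_path = os.path.join(root, name)
                rel_path = os.path.relpath(full_path, build_dir)
                # Zip entries use forward slashes, also on Windows.
                arcname = prefix + "/" + rel_path.replace(os.sep, "/")
                archive.write(full_path, arcname)
                count += 1
    return count
```

For the project above you would call something like build_layer_zip(".aws-sam/build/HelloWorldFunction", "my_layer.zip"). Keep the output zip outside the build folder so it does not end up inside itself.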

4. Upload your layer to AWS

In the AWS console, go to Lambda, then choose Layers and Create Layer. Now you can upload your .zip file.

Note that for zip files over 50 MB, you should upload the .zip file to an S3 bucket and provide the path, for example, https://s3.amazonaws.com/mybucket/my_layer.zip .
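If you script the upload instead of using the console, the same 50 MB rule decides what you pass as the layer content. A rough sketch: layer_content is a hypothetical helper, the 50 MB threshold is taken as 50 * 1024 * 1024 bytes (my assumption about how it is counted), and the bucket and key names are placeholders. The returned dict has the shape that boto3's publish_layer_version expects for its Content parameter.

```python
import os

# Assumption: the 50 MB direct-upload limit is counted in MiB.
DIRECT_UPLOAD_LIMIT = 50 * 1024 * 1024


def layer_content(zip_path, bucket=None, key=None, limit=DIRECT_UPLOAD_LIMIT):
    """Return the Content dict for publishing a layer.

    Small archives are sent inline; larger ones must already be in S3,
    and the caller has to say where (bucket and key).
    """
    size = os.path.getsize(zip_path)
    if size <= limit:
        with open(zip_path, "rb") as fh:
            return {"ZipFile": fh.read()}
    if not bucket or not key:
        raise ValueError(
            "%s is %d bytes, above the direct upload limit; "
            "upload it to S3 and pass bucket and key" % (zip_path, size)
        )
    return {"S3Bucket": bucket, "S3Key": key}
```

With boto3 installed and credentials configured, the result would go into something like boto3.client("lambda").publish_layer_version(LayerName="my_layer", Content=layer_content("my_layer.zip", "mybucket", "my_layer.zip"), CompatibleRuntimes=["python3.8"]). The helper does not upload to S3 for you.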

B. Python packages that exceed Lambda Layer limits

The xgboost package on its own is more than 300 MB, which exceeds the layer size limit, so creating the layer fails with an error.
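You can catch this before uploading by summing the uncompressed sizes inside the zip. A minimal sketch, assuming the limit is 262,144,000 bytes (250 MiB, the roughly 262 MB figure mentioned in the question's edit); unzipped_size and fits_in_layer are names I chose. As far as I know the limit applies to the function and all of its layers combined, so treat a pass here as necessary rather than sufficient.

```python
import zipfile

# Assumption: 250 MiB unzipped, i.e. the ~262 MB limit discussed above.
UNZIPPED_LIMIT_BYTES = 262_144_000


def unzipped_size(zip_path):
    """Total uncompressed size, in bytes, of all entries in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sum(info.file_size for info in archive.infolist())


def fits_in_layer(zip_path, limit=UNZIPPED_LIMIT_BYTES):
    """True if the archive's unzipped contents are within the limit."""
    return unzipped_size(zip_path) <= limit
```

Running fits_in_layer("my_layer.zip") on a plain xgboost build should return False, which is the situation this section deals with.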

As @Marcin has kindly pointed out, the prior approach with the SAM CLI does not directly work for Python layers that exceed the limit. There's an open issue on GitHub about specifying a custom Docker image when running sam build -u, and a possible workaround is to retag the default lambda/lambci image.

So, how can we get around this? There are already some useful resources that I will just point to.

  • First, the Medium article that @Alex accepted as a solution, which follows this repo code.
  • Second, alexeybutyrev's approach, which applies the strip command to reduce the size of the libraries. It is available in a GitHub repo, with instructions provided.
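To give an idea of what the strip approach involves, here is a rough sketch that finds the shared libraries in a build folder and runs strip on each. It is not the script from that repo: the function names are mine, and it assumes a strip binary is on the PATH, which in practice means running it on Linux (for example inside the build container).

```python
import os
import shutil
import subprocess


def find_shared_objects(root):
    """Return sorted paths of shared libraries (*.so or *.so.N) under root."""
    matches = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(".so") or ".so." in name:
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def strip_shared_objects(root, strip_cmd="strip"):
    """Run strip on every shared library under root; return the paths that succeeded."""
    if shutil.which(strip_cmd) is None:
        raise RuntimeError("%r not found on PATH" % strip_cmd)
    stripped = []
    for path in find_shared_objects(root):
        # strip rewrites the file in place, dropping symbol and debug information.
        result = subprocess.run([strip_cmd, path])
        if result.returncode == 0:
            stripped.append(path)
    return stripped
```

After stripping, zip the folder again and re-check the unzipped size. Whether that gets xgboost and its dependencies under the limit depends on the versions you build.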

Edit (December 2020)

This month AWS released container image support for AWS Lambda. With the following tree structure for your project

Project/
|-- app/
|   |-- app.py
|   |-- requirements.txt
|   |-- xgb_trained.bin
|-- Dockerfile
 

You can deploy an XGBoost model with the following Docker image. See this repo's instructions for a detailed explanation.

# Dockerfile based on https://docs.aws.amazon.com/lambda/latest/dg/images-create.html

# Define global args
ARG FUNCTION_DIR="/function"
ARG RUNTIME_VERSION="3.6"

# Choose buster image
FROM python:${RUNTIME_VERSION}-buster as base-image

# Install aws-lambda-cpp build dependencies
RUN apt-get update && \
  apt-get install -y \
  g++ \
  make \
  cmake \
  unzip \
  libcurl4-openssl-dev \
  git


# Include global args in this stage of the build
# (ARGs declared before FROM must be redeclared to be visible inside a stage)
ARG FUNCTION_DIR
ARG RUNTIME_VERSION
# Create function directory
RUN mkdir -p ${FUNCTION_DIR}

# Copy function code
COPY app/* ${FUNCTION_DIR}/

# Install python dependencies and runtime interface client
RUN python${RUNTIME_VERSION} -m pip install \
                   --target ${FUNCTION_DIR} \
                   --no-cache-dir \
                   awslambdaric \
                   -r ${FUNCTION_DIR}/requirements.txt

# Install xgboost from source
RUN git clone --recursive https://github.com/dmlc/xgboost
RUN cd xgboost && make -j4 && cd python-package && python${RUNTIME_VERSION} setup.py install

# Final stage: continue from the stage above so the xgboost installed there is kept
FROM base-image

# Include global arg in this stage of the build
ARG FUNCTION_DIR

# Set working directory to function root directory
WORKDIR ${FUNCTION_DIR}

# Copy in the build image dependencies
COPY --from=base-image ${FUNCTION_DIR} ${FUNCTION_DIR}

ENTRYPOINT [ "/usr/local/bin/python", "-m", "awslambdaric" ]

CMD [ "app.handler" ]
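The CMD above points at a handler function in app.py, which is not shown here. A minimal sketch of what it might look like, with several assumptions of mine: the event carries a "features" list, the model gives a single numeric output per row, and xgb_trained.bin sits in the working directory (the Dockerfile sets WORKDIR to the function directory). The scorer argument is only there so the handler can be tried without xgboost installed.

```python
import json
import os

# Assumption: the model file was copied next to app.py, as in the tree above.
MODEL_PATH = os.path.join(os.getcwd(), "xgb_trained.bin")
_booster = None  # cached so warm invocations skip reloading the model


def load_booster():
    """Load the trained model once; xgboost is imported lazily."""
    global _booster
    if _booster is None:
        import xgboost as xgb
        _booster = xgb.Booster()
        _booster.load_model(MODEL_PATH)
    return _booster


def score(features):
    """Predict for one row of features (assumes a single numeric output)."""
    import numpy as np
    import xgboost as xgb
    dmatrix = xgb.DMatrix(np.asarray([features], dtype=float))
    return float(load_booster().predict(dmatrix)[0])


def handler(event, context, scorer=score):
    """Lambda entry point; expects an event like {"features": [1.0, 2.0]}."""
    features = event.get("features")
    if not isinstance(features, list):
        return {"statusCode": 400,
                "body": json.dumps({"error": "expected a 'features' list"})}
    return {"statusCode": 200,
            "body": json.dumps({"prediction": scorer(features)})}
```

Lambda calls handler(event, context) with two arguments, so the default scorer is used in production. The feature order must match whatever the model was trained on.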

I was never able to figure out why it failed in this way. The solution that worked for me was to create an EC2 instance running Amazon Linux, install and zip the libraries there, then save the zip to S3. See here for detailed instructions:

https://medium.com/@lucashenriquessilva/how-to-create-a-aws-lambda-python-layer-db2830e08b12
