
conda environment to AWS Lambda

I would like to set up a Python function I've written on AWS Lambda, a function that depends on a bunch of Python libraries I have already collected in a conda environment.

To set this up on Lambda, I'm supposed to zip this environment up, but the Lambda docs only give instructions for how to do this using pip/VirtualEnv. Does anyone have experience with this?

You should use the serverless framework in combination with the serverless-python-requirements plugin. You just need a requirements.txt and the plugin automatically packages your code and the dependencies in a zip-file, uploads everything to s3 and deploys your function. Bonus: since it can do this dockerized, it is also able to help you with packages that need binary dependencies.

Have a look here (https://serverless.com/blog/serverless-python-packaging/) for a how-to.
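In case you just want the gist, here is a minimal sketch of what such a setup looks like (my own illustration, not taken from that post). It assumes a handler.py next to serverless.yml that defines handler(event, context), plus a requirements.txt with your dependencies; the service and function names are made up:

cat > serverless.yml <<'EOF'
service: my-conda-function          # hypothetical service name
provider:
  name: aws
  runtime: python3.6
functions:
  hello:
    handler: handler.handler        # assumes handler.py defines handler(event, context)
plugins:
  - serverless-python-requirements
custom:
  pythonRequirements:
    dockerizePip: true              # build binary dependencies inside a Lambda-like docker image
EOF

npm init -f && npm install --save serverless-python-requirements
serverless deploy                   # packages code + requirements.txt, uploads to s3, deploys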

From experience I strongly recommend you look into that. Every bit of manual labour for deployment and such is something that keeps you from developing your logic.

Edit 2017-12-17:

Your comment makes sense @eelco-hoogendoorn.

However, in my mind a conda environment is just an encapsulated place where a bunch of python packages live. So, if you would put all these dependencies (from your conda env) into a requirements.txt (and use serverless + plugin), that would solve your problem, no?
IMHO it would essentially be the same as zipping all the packages you installed in your env into your deployment package. That being said, here is a snippet which does essentially this:

conda env export --name Name_of_your_Conda_env | yq -r '.dependencies[] | .. | select(type == "string")' | sed -E "s/(^[^=]*)(=+)([0-9.]+)(=.*|$)/\1==\3/" > requirements.txt

Unfortunately conda env export only exports the environment in yaml format. The --json flag doesn't work right now, but is supposed to be fixed in the next release. That is why I had to use yq instead of jq. You can install yq using pip install yq; it is just a wrapper around jq to allow it to also work with yaml files.
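If you want to see what the sed expression actually does, you can feed it a single made-up line (the build string here is invented):

echo "numpy=1.15.4=py36hacc0572_0" | sed -E "s/(^[^=]*)(=+)([0-9.]+)(=.*|$)/\1==\3/"
# prints: numpy==1.15.4 - the conda build string is dropped and the pin becomes pip-style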

KEEP IN MIND

A Lambda deployment package can only be 50MB in size (zipped). Your environment shouldn't be too big.

I have not tried deploying a lambda with serverless + serverless-python-requirements and a requirements.txt created like that, so I don't know if it will work.

The main reason why I use conda is the option not to compile different binary packages myself (like numpy, matplotlib, pyqt, etc.), or to compile them less frequently. When you do need to compile something yourself for the specific version of python (like uwsgi), you should compile the binaries with the same gcc version that the python within your conda environment was compiled with - most probably it is not the same gcc that your OS is using, since conda now ships recent gcc versions that can be installed with conda install gxx_linux-64.
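A rough sketch of that, assuming a reasonably recent conda where the gxx_linux-64 package exports $CC on activation (if it doesn't on your setup, point the build at conda's gcc explicitly):

conda install gxx_linux-64                 # conda's own gcc/g++ toolchain
source activate Name_of_your_Conda_env     # re-activate so the compiler env vars get exported
echo "$CC"                                 # assumption: now points at conda's gcc, not /usr/bin/gcc
pip install uwsgi                          # uwsgi builds from source and picks up $CC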

This leads us to two situations:

1. All your dependencies are pure python and you can actually save a list of them using pip freeze and bundle them as it is stated for virtualenv.

2. You have some binary extensions. In that case, the binaries from your conda environment will not work with the python used by AWS Lambda. Unfortunately, you will need to visit the page describing the execution environment (AMI: amzn-ami-hvm-2017.03.1.20170812-x86_64-gp2), set up that environment, build the binaries for the specific version of the built-in python in a separate directory (along with the pure python packages), and then bundle them into a zip-archive.

This is a general answer to your question, but the main idea is that you can not reuse your binary packages, only a list of them; a rough sketch of the bundling step follows below.
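For the pure-python case (situation 1), the bundling step is nothing more than installing the frozen requirements into a directory and zipping it together with your handler. A rough sketch, with lambda_function.py and my-function as placeholder names:

pip freeze > requirements.txt
mkdir -p build
pip install -r requirements.txt -t build/
cp lambda_function.py build/
(cd build && zip -r ../deployment.zip .)
aws lambda update-function-code --function-name my-function --zip-file fileb://deployment.zip

For situation 2 the layout is the same, except that the pip install step has to run on the Amazon Linux AMI mentioned above so the compiled binaries match Lambda's runtime.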

I can't think of a good reason why zipping up your conda environment wouldn't work.

I think you can go into your anaconda2/envs/ or anaconda3/envs/ directory and copy/zip the env-directory you want to upload. Conda is just a souped-up version of a virtualenv, plus a different & somewhat optional package-manager. The big reason I think it's ok is that conda environments encapsulate all their dependencies within their particular .../anaconda[2|3]/envs/$VIRTUAL_ENV_DIR/ directories by default.
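A minimal sketch of that copy/zip step, assuming anaconda3 lives in your home directory and the environment is called myenv (a made-up name):

cd ~/anaconda3/envs
zip -r myenv.zip myenv/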

Using the normal virtualenv expression gives you a bit more freedom, in sort of the same way that cavemen had more freedom than modern people. Personally I prefer cars. With virtualenv you basically get a semi-empty $PYTHONPATH variable that you can fill with whatever you want, rather than the more robust, pre-populated env that Conda spits out. The following is a good table for reference: https://conda.io/docs/commands.html#conda-vs-pip-vs-virtualenv-commands

Conda turns the command ~$ /path/to/$VIRTUAL_ENV_ROOT_DIR/bin/activate into ~$ source activate $VIRTUAL_ENV_NAME.

Say you want to make a virtualenv the old fashioned way. You'd choose a directory (let's call it $VIRTUAL_ENV_ROOT_DIR) & a name (which we'll call $VIRTUAL_ENV_NAME). At this point you would type:

~$ cd $VIRTUAL_ENV_ROOT_DIR && virtualenv $VIRTUAL_ENV_NAME

python then creates a copy of its own interpreter library (plus pip and setuptools, I think) & places an executable called activate in this clone's bin/ directory. The $VIRTUAL_ENV_ROOT_DIR/bin/activate script works by changing your current $PATH environment variable, which determines which python interpreter gets called when you type ~$ python into the shell, and therefore also which directories of modules that interpreter will see when it is told to import something. This is the primary reason you'll see #!/usr/bin/env python in people's code instead of /usr/bin/python.
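A quick way to see what activate actually changes, reusing the names from the step above:

which python                                   # system interpreter
source $VIRTUAL_ENV_ROOT_DIR/$VIRTUAL_ENV_NAME/bin/activate
which python                                   # now resolves inside the virtualenv
echo $PATH                                     # the env's bin/ directory has been prepended
deactivate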

In https://github.com/dazza-codes/aws-lambda-layer-packing, the pip wheels seem to be working for many packages (pure-pip installs). It is difficult to bundle a lot of packages into a compact AWS Lambda layer, since pip wheels do not use shared libraries and tend to get bloated a bit, but they work. Based on some discussions on github, the conda vs. pip challenges are not trivial.
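For reference, a layer-style bundle of pip wheels (my own sketch, not the scripts from that repo) boils down to installing the requirements under a top-level python/ directory and publishing the zip as a layer; my-layer is a placeholder name and the runtime is just an example:

pip install -r requirements.txt -t python/
zip -r layer.zip python/
aws lambda publish-layer-version --layer-name my-layer \
    --zip-file fileb://layer.zip --compatible-runtimes python3.8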
