I'm trying to download NLTK data onto the file storage of a Lambda function like so:
import nltk

nltk.data.path.append("/tmp")
nltk.download("popular", download_dir="/tmp")
The Lambda function keeps timing out. When I check the CloudWatch logs, I see no entries for the individual corpus downloads (e.g. Downloading package cmudict to /tmp...); instead, the code seems to reach nltk.download() and then hang forever.
Has anyone seen this strange behavior?
Several limitations (or rather, design concepts) of Lambda collide with what you're trying to do here:
If you need data to be available to your Lambda function, the easiest approach is probably to store it in an S3 bucket and fetch it from there. You can find a detailed example of how to do that here (credits to Alexey Smirnov).
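As a rough illustration of the S3 approach (not the linked example itself), here is a minimal sketch. It assumes you have already zipped an NLTK data directory and uploaded it to a bucket; the bucket name my-nltk-bucket, the key nltk_data.zip, and the helper fetch_nltk_data are all hypothetical names chosen for this sketch.

```python
import os
import zipfile

# Assumption: /tmp is the writable scratch space of the Lambda function.
NLTK_DIR = "/tmp/nltk_data"


def fetch_nltk_data(s3_client, bucket, key, target_dir=NLTK_DIR):
    """Download a zip archive of NLTK data from S3 and unpack it into target_dir."""
    os.makedirs(target_dir, exist_ok=True)
    archive_path = os.path.join(target_dir, os.path.basename(key))
    # boto3 S3 clients expose download_file(Bucket, Key, Filename).
    s3_client.download_file(bucket, key, archive_path)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
    os.remove(archive_path)  # free space in /tmp once unpacked
    return target_dir


def handler(event, context):
    # Imported here so the helper above can be exercised without AWS access.
    import boto3
    import nltk

    # Hypothetical bucket and key; replace with your own.
    data_dir = fetch_nltk_data(boto3.client("s3"), "my-nltk-bucket", "nltk_data.zip")
    if data_dir not in nltk.data.path:
        nltk.data.path.append(data_dir)
    return {"nltk_data_dir": data_dir}
```

The archive is expected to contain the usual NLTK layout (for example corpora/cmudict/...) at its top level, so that NLTK can resolve resources relative to the directory added to nltk.data.path.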
Got it: My Lambda function was running in a VPC. I had to add an endpoint to enable the VPC to access S3.
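For anyone who wants to script that step, here is a minimal sketch using boto3. It assumes a gateway-type endpoint is what you need (the post above does not say which type was used); the helper name add_s3_gateway_endpoint and the example IDs are hypothetical.

```python
def add_s3_gateway_endpoint(ec2_client, vpc_id, route_table_ids, region):
    """Create a gateway VPC endpoint for S3 and return its endpoint ID."""
    response = ec2_client.create_vpc_endpoint(
        VpcEndpointType="Gateway",
        VpcId=vpc_id,
        ServiceName=f"com.amazonaws.{region}.s3",
        # Route tables of the subnets the Lambda function runs in.
        RouteTableIds=list(route_table_ids),
    )
    return response["VpcEndpoint"]["VpcEndpointId"]


if __name__ == "__main__":
    import boto3

    # Hypothetical IDs and region; replace with the values for your VPC.
    ec2 = boto3.client("ec2", region_name="us-east-1")
    print(add_s3_gateway_endpoint(ec2, "vpc-0123456789abcdef0",
                                  ["rtb-0123456789abcdef0"], "us-east-1"))
```

Note that an S3 endpoint only routes S3 traffic; it does not give the function general internet access.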