How to read S3 files in Lambda using Xarray?
I am trying to read netCDF files stored in my S3 bucket, and I am using Xarray to read them. The sample code below runs fine if I have the same file in a local folder, such as
~/downloads/60e0489fcab82c714f516064b4e6b7acf724b7b9.nc
but I am new to S3 and not sure what I am missing.
I am trying to read netCDF via Xarray and convert it to CSV. Boto3 doesn't work for reading netCDF4 and converting it to CSV.
Below is my Lambda function:
import xarray

def handler(event, context):
    filename = 's3://netcdf-files/60e0489fcab82c714f516064b4e6b7acf724b7b9.nc'
    ds = xarray.open_dataset(filename)
    for varname in ds:
        print(varname)
    tas0 = ds['wet_bulb_potential_temperature']
    tas0
    return {
        'statusCode': 200,
        'message': 'Hello from Python Lambda Function!'
    }
I am getting the error below. My S3 file path isn't detected; instead, Lambda is trying to find the file in a local path. Error message from the CloudWatch logs:
File "/opt/python/lib/python3.6/site-packages/xarray/backends/file_manager.py", line 204, in _acquire_with_cache_info
file = self._opener(*self._args, **kwargs)
File "netCDF4/_netCDF4.pyx", line 2321, in netCDF4._netCDF4.Dataset.__init__
File "netCDF4/_netCDF4.pyx", line 1885, in netCDF4._netCDF4._ensure_nc_success
FileNotFoundError: [Errno 2] No such file or directory: b'/var/task/s3:/netcdf-files/60e0489fcab82c714f516064b4e6b7acf724b7b9.nc'
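The path in the error message suggests that the S3 URL was treated as a relative local path: it was prefixed with the working directory and the double slash after `s3:` was collapsed. The sketch below only reproduces that string with the standard library; it is an illustration of the path handling, not xarray's own code, and the name `LAMBDA_TASK_ROOT` is just a label for the directory that appears in the traceback.

```python
import posixpath

# Working directory of the Lambda task, taken from the path in the traceback.
LAMBDA_TASK_ROOT = "/var/task"

s3_url = "s3://netcdf-files/60e0489fcab82c714f516064b4e6b7acf724b7b9.nc"

# Treat the URL as a relative POSIX path: join it to the working directory,
# then normalize, which collapses the double slash after "s3:".
resolved = posixpath.normpath(posixpath.join(LAMBDA_TASK_ROOT, s3_url))
print(resolved)
```

The printed value matches the path in the `FileNotFoundError`, which is why the file has to be opened through an S3-aware layer rather than passed as a plain string.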
As far as I know, Xarray does not support S3 directly. You can use s3fs instead:
import xarray
import s3fs

def handler(event, context):
    fs = s3fs.S3FileSystem(anon=True)  # or anon=False to use default credentials
    with fs.open('netcdf-files/60e0489fcab82c714f516064b4e6b7acf724b7b9.nc', 'rb') as f:
        # Pass the open file object, not a path string. File-like objects are
        # read by the h5netcdf (netCDF4/HDF5) or scipy (netCDF3) backend.
        ds = xarray.open_dataset(f)
        # Work with the dataset while the S3 file is still open.
        for varname in ds:
            print(varname)
        tas0 = ds['wet_bulb_potential_temperature']
        tas0
    return {
        'statusCode': 200,
        'message': 'Hello from Python Lambda Function!'
    }
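Since the question also asks about converting the data to CSV, here is a minimal sketch of that step. It assumes pandas is available alongside xarray (as `to_dataframe` requires); the helper name `variable_to_csv` and the small in-memory dataset with made-up values are hypothetical stand-ins so the snippet runs without S3 access.

```python
import xarray

def variable_to_csv(ds, varname):
    """Return one variable of an xarray Dataset as CSV text."""
    # to_dataframe() turns the variable's dimensions into the index,
    # which to_csv() writes as leading columns.
    frame = ds[varname].to_dataframe()
    return frame.to_csv()

# Stand-in dataset with made-up values, used only to exercise the helper.
demo = xarray.Dataset(
    {"wet_bulb_potential_temperature": (("time",), [280.5, 281.0])},
    coords={"time": [0, 1]},
)

csv_text = variable_to_csv(demo, "wet_bulb_potential_temperature")
print(csv_text)
```

Inside the handler, the helper would be called within the `with` block, while the S3 file is still open, passing the dataset returned by `xarray.open_dataset(f)`.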