NetCDF4 file with Python - Filter before dataframing
Because my NetCDF4 file is large, I get a MemoryError when I try to convert it into a Pandas dataframe. I don't need everything in the NetCDF4 file, so I would like to know whether I can cut the file down first and only then convert it into a dataframe.
My file looks like this:
xr is the xarray library. The time variable contains every hour from 2019-01-01 to 2019-01-31. Unfortunately I can't filter on the Copernicus website, but I only need the time steps at 09:00:00.
Do you know how I could do it, using the xarray library or some other way?
Thanks
You can use sel to filter your dataset:
import pandas as pd
import xarray as xr
import datetime
# Load a demo dataset
ds = xr.tutorial.load_dataset('air_temperature')
# Keep only 12:00 rows
df = ds.sel(time=datetime.time(12)).to_dataframe()
Output:
>>> df
                                       air
lat  time                lon
75.0 2013-01-01 12:00:00 200.0  242.299988
                         202.5  242.199997
                         205.0  242.299988
                         207.5  242.500000
                         210.0  242.889999
...                                    ...
15.0 2014-12-31 12:00:00 320.0  296.889984
                         322.5  296.589996
                         325.0  295.690002
                         327.5  295.489990
                         330.0  295.190002

[967250 rows x 1 columns]
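The demo dataset only has 6-hourly steps, which is why the example above keeps 12:00. For hourly data like the file in the question, the same idea can be sketched with a boolean mask on the hour. The snippet below is only an illustration on synthetic data: the variable name t2m, the coordinate names and the tiny grid are assumptions, not taken from the original file. With a real file you would presumably start from xr.open_dataset(...) on your own path, which loads data lazily, and apply the selection before calling to_dataframe().

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the file in the question: hourly steps for January 2019.
# "t2m", "latitude", "longitude" and the grid values are made up for illustration.
time = pd.date_range("2019-01-01", periods=31 * 24, freq=pd.Timedelta(hours=1))
lat = [50.0, 50.25]
lon = [3.0, 3.25, 3.5]
data = np.zeros((time.size, len(lat), len(lon)))

ds = xr.Dataset(
    {"t2m": (("time", "latitude", "longitude"), data)},
    coords={"time": time, "latitude": lat, "longitude": lon},
)

# Boolean mask: True where the hour of the time step is 9
mask = ds["time"].dt.hour == 9

# Filter first, then convert only the reduced dataset to a dataframe
at_nine = ds.sel(time=mask)
df_nine = at_nine.to_dataframe()
```

Here at_nine holds one time step per day, so the dataframe is 24 times smaller than one built from the full dataset.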