NetCDF4 file with Python - Filter before dataframe

Question

I have a large NetCDF4 file, and I get a MemoryError when I try to convert it into a Pandas dataframe. I don't need everything in the file, so I would like to know whether I can subset it first and only then convert it to a dataframe.

My file looks like this:

xr is the xarray library. The time variable contains every hour from 2019-01-01 to 2019-01-31. Unfortunately I can't filter on the Copernicus website, and I only need the data at 09:00:00.

Do you know how I could do this, using the xarray library or some other way?

Thanks

Answer 1

You can use sel to filter your dataset before converting it, so that only the selected time steps end up in the dataframe:

import pandas as pd
import xarray as xr
import datetime

# Load a demo dataset
ds = xr.tutorial.load_dataset('air_temperature')

# Keep only 12:00 rows
df = ds.sel(time=datetime.time(12)).to_dataframe()

Output:

>>> df
                                       air
lat  time                lon              
75.0 2013-01-01 12:00:00 200.0  242.299988
                         202.5  242.199997
                         205.0  242.299988
                         207.5  242.500000
                         210.0  242.889999
...                                    ...
15.0 2014-12-31 12:00:00 320.0  296.889984
                         322.5  296.589996
                         325.0  295.690002
                         327.5  295.489990
                         330.0  295.190002

[967250 rows x 1 columns]
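
The demo dataset above has no 09:00 time steps, so here is a minimal sketch of the same idea applied to hourly data like that in the question. It uses a boolean mask on the hour of each timestamp instead of a datetime.time value. The synthetic dataset, the variable name t2m, and the grid sizes are assumptions made for illustration only; with a real file you would open it with xr.open_dataset and apply the same mask.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for the real file: hourly steps for January 2019
# on a tiny 2 x 3 grid. The variable name "t2m" is an assumption.
time = pd.date_range("2019-01-01", periods=31 * 24, freq=pd.Timedelta(hours=1))
ds = xr.Dataset(
    {"t2m": (("time", "lat", "lon"), np.random.rand(time.size, 2, 3))},
    coords={"time": time, "lat": [10.0, 20.0], "lon": [0.0, 1.0, 2.0]},
)

# Boolean mask that is True only for the 09:00 time steps
at_nine = (ds["time"].dt.hour == 9).values

# Subset along the time dimension first, then build the dataframe
subset = ds.isel(time=at_nine)
df = subset.to_dataframe()
```

Because the subset is taken before to_dataframe is called, the dataframe holds one time step per day (31 here) rather than all 744 hourly steps.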

Answer 1: score 4, accepted, 2023-01-12 10:20:12