
Read multiple .nc files into a 3D pandas DataFrame in Python

I would like to read multiple SST netCDF files, extract the SST data within a selected lat/lon range from each file, and store the result in a three-dimensional pandas DataFrame. Each netCDF file should be closed after it has been read, to save memory.

I would like to end up with one DataFrame holding a year's worth of daily data.

So far I have read one file with netCDF4 and stored each variable, but that is as far as I have got.

import netCDF4

my_file = 'C:/Users/lisa/Desktop/Sean/20160719000127-UoS-L2i-SSTskin-ISAR_002-D054_PtA-v01.0-fv01.5.nc'
fh = netCDF4.Dataset(my_file, mode='r')
# [:] reads each variable into memory as an array
lon = fh.variables['lon'][:]
lat = fh.variables['lat'][:]
time = fh.variables['time'][:]
sst = fh.variables['sea_surface_temperature'][:]
fh.close()  # release the file handle once the arrays are in memory

The data is for 2016 and comes from the OPeNDAP server at the following address.

http://www.ifremer.fr/opendap/cerdap1/ghrsst/l4/saf/odyssea-nrt/data/

Any help would be much appreciated!!

A pandas.DataFrame does not support 3-dimensional data in this way. This use case is exactly why xarray was developed.

To do what you're trying to do in xarray:

import xarray as xr

ds = xr.open_mfdataset(['file1.nc', 'file2.nc', 'file3.nc'])

This will concatenate your files and put everything in one xarray.Dataset. Getting 1D or 2D data into pandas is then straightforward:

ds.sel(lat=36.0, lon=42.5).to_dataframe()
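For a lat/lon range rather than a single point, the same idea can be sketched with label slices. This is a minimal illustration on synthetic data, not the original files: the make_daily_sst helper, the grid spacing, and the box limits are all assumptions made for the example, and xr.concat stands in for what open_mfdataset does across files.

```python
import numpy as np
import pandas as pd
import xarray as xr


def make_daily_sst(day, lats, lons):
    """Build a synthetic one-day SST dataset (hypothetical stand-in for one .nc file)."""
    rng = np.random.default_rng(0)
    data = 15.0 + rng.random((1, lats.size, lons.size))
    return xr.Dataset(
        {"sea_surface_temperature": (("time", "lat", "lon"), data)},
        coords={"time": [pd.Timestamp(day)], "lat": lats, "lon": lons},
    )


# Assumed grid: 0.5-degree spacing, coordinates stored in ascending order.
lats = np.arange(30.0, 40.5, 0.5)
lons = np.arange(40.0, 45.5, 0.5)
days = pd.date_range("2016-01-01", periods=3, freq="D")

# Concatenate the daily datasets along time, as open_mfdataset would for files.
ds = xr.concat([make_daily_sst(d, lats, lons) for d in days], dim="time")

# Label-based box selection; both ends of each slice are included.
box = ds.sel(lat=slice(35.0, 37.0), lon=slice(42.0, 43.0))

# Flatten the 3D box to a DataFrame indexed by (time, lat, lon).
df = box["sea_surface_temperature"].to_dataframe()
print(df.head())
```

The resulting DataFrame carries the three dimensions as a MultiIndex, which is the usual way to hold 3D data in pandas. If a file stores latitude in descending order, the slice bounds need to be reversed, e.g. slice(37.0, 35.0).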

I would suggest preprocessing with CDO, e.g.:

cdo mergetime 2016*-UoS-L2i-SSTskin-ISAR_002-D054_PtA-v01.0-fv01.5.nc merged.nc
cdo sellonlatbox,lon1,lon2,lat1,lat2 merged.nc box_2016.nc

You may hit an open-file limit (256) on your system, in which case you will need to split the mergetime command into a loop over months, extract the area from each month, and then run a final mergetime on the 12 monthly files.
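The monthly loop could be sketched in Python roughly as follows. This is only an outline under assumptions: the monthly_commands and run_all helpers and the intermediate file names (merged_YYYYMM.nc, box_YYYYMM.nc) are made up for the example, and it assumes file names begin with YYYYMMDD as in the question. Only the cdo operators mergetime and sellonlatbox come from the commands above.

```python
import glob
import subprocess

# File-name suffix taken from the question; adjust to your own files.
SUFFIX = "-UoS-L2i-SSTskin-ISAR_002-D054_PtA-v01.0-fv01.5.nc"


def monthly_commands(year, box, files_for):
    """Return the cdo argument lists for a month-by-month merge.

    box is (lon1, lon2, lat1, lat2); files_for(pattern) returns matching
    paths (glob.glob in real use).
    """
    lon1, lon2, lat1, lat2 = box
    commands, monthly_boxes = [], []
    for month in range(1, 13):
        stamp = f"{year}{month:02d}"
        files = sorted(files_for(f"{stamp}*{SUFFIX}"))
        if not files:
            continue  # no data for this month
        merged, boxed = f"merged_{stamp}.nc", f"box_{stamp}.nc"
        # Merge one month of files, then cut out the lon/lat box.
        commands.append(["cdo", "mergetime", *files, merged])
        commands.append(
            ["cdo", f"sellonlatbox,{lon1},{lon2},{lat1},{lat2}", merged, boxed]
        )
        monthly_boxes.append(boxed)
    if monthly_boxes:
        # Final merge of the (at most 12) monthly box files.
        commands.append(["cdo", "mergetime", *monthly_boxes, f"box_{year}.nc"])
    return commands


def run_all(commands):
    """Run each cdo command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Example call (box limits are placeholders, not values from the question):
# run_all(monthly_commands(2016, (-10.0, 0.0, 45.0, 55.0), glob.glob))
```

Building the command lists separately from running them makes it easy to print and check the commands before anything is executed.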
