I would like to read in multiple SST netCDF files, extract from each file the SST data within a selected lat/lon range, and store that data in a three-dimensional pandas DataFrame, closing each netCDF file after it has been read to save memory.
I would like to end up with one DataFrame holding a year's worth of daily data.
I have read one file with netCDF4 and stored each variable, but that is as far as I have got.
import netCDF4

my_file = 'C:/Users/lisa/Desktop/Sean/20160719000127-UoS-L2i-SSTskin-ISAR_002-D054_PtA-v01.0-fv01.5.nc'
fh = netCDF4.Dataset(my_file, mode='r')
# [:] copies each variable into memory, so the arrays remain usable after the file is closed
lon = fh.variables['lon'][:]
lat = fh.variables['lat'][:]
time = fh.variables['time'][:]
sst = fh.variables['sea_surface_temperature'][:]
fh.close()
The data for 2016 comes from OPeNDAP, at the following address:
http://www.ifremer.fr/opendap/cerdap1/ghrsst/l4/saf/odyssea-nrt/data/
Any help would be much appreciated!!
A pandas.DataFrame does not support 3-dimensional data in this way. This use case is exactly why xarray was developed.
To do what you're trying to do in xarray:
import xarray as xr
ds = xr.open_mfdataset(['file1.nc', 'file2.nc', 'file3.nc'])
This will concatenate your files and put everything in one xarray.Dataset. Getting 1-D or 2-D data into pandas is then pretty easy:
ds.sel(lat=36.0, lon=42.5).to_dataframe()
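Since the question asks for a lat/lon range rather than a single point, here is a minimal sketch of a box selection. It runs on a small synthetic dataset instead of the real files. The helper name subset_box, the box limits, and the grid values are all made up for illustration. It also assumes the coordinates are named lat and lon and are stored in ascending order, which may not hold for your files.

```python
import numpy as np
import pandas as pd
import xarray as xr


def subset_box(ds, lat_min, lat_max, lon_min, lon_max):
    """Return the part of ds inside the box (hypothetical helper).

    Label-based slices include both end points and assume that the
    lat and lon coordinates are sorted in ascending order.
    """
    return ds.sel(lat=slice(lat_min, lat_max), lon=slice(lon_min, lon_max))


# Synthetic stand-in for the concatenated files: 3 days on a 10 x 6 grid.
time = pd.date_range("2016-01-01", periods=3, freq="D")
lat = np.arange(30.0, 40.0, 1.0)  # 30, 31, ..., 39
lon = np.arange(40.0, 46.0, 1.0)  # 40, 41, ..., 45
sst = np.random.default_rng(0).normal(290.0, 2.0, size=(3, 10, 6))

ds = xr.Dataset(
    {"sea_surface_temperature": (("time", "lat", "lon"), sst)},
    coords={"time": time, "lat": lat, "lon": lon},
)

# Keep only the box, then flatten to a DataFrame indexed by time, lat and lon.
box = subset_box(ds, 34.0, 37.0, 41.0, 43.0)
df = box.to_dataframe()
print(df.head())
```

The result is a flat DataFrame with one row per (time, lat, lon) combination, which is how pandas represents three-dimensional data. With real files, open_mfdataset also accepts a preprocess callable that is applied to each file before concatenation, so the same kind of subsetting could be done per file to keep memory use down.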
I would suggest preprocessing with CDO, e.g.:
cdo mergetime 2016*-UoS-L2i-SSTskin-ISAR_002-D054_PtA-v01.0-fv01.5.nc merged.nc
cdo sellonlatbox,lon1,lon2,lat1,lat2 merged.nc box_2016.nc
Your system may have an open file limit (for example 256), in which case you will need to split the mergetime command into a loop over months, extract the area from each monthly file, and then do a final mergetime on the 12 monthly files at the end.
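The monthly loop described above can be sketched as a small Python script that only builds the command strings. The function name monthly_cdo_commands, the intermediate file names (merged_YYYYMM.nc, box_YYYYMM.nc), and the box limits are made up for illustration. It assumes the file names start with the date as YYYYMMDD, as in the question.

```python
def monthly_cdo_commands(year, suffix, lon1, lon2, lat1, lat2):
    """Build the CDO command lines for a month-by-month merge (hypothetical helper).

    Returns a list of strings: for each month one mergetime and one
    sellonlatbox command, followed by a final mergetime over the
    12 monthly box files.
    """
    commands = []
    monthly_boxes = []
    for month in range(1, 13):
        stamp = f"{year}{month:02d}"   # e.g. 201601
        merged = f"merged_{stamp}.nc"  # assumed intermediate file name
        box = f"box_{stamp}.nc"        # assumed intermediate file name
        # The shell expands the wildcard to that month's files.
        commands.append(f"cdo mergetime {stamp}*{suffix} {merged}")
        commands.append(
            f"cdo sellonlatbox,{lon1},{lon2},{lat1},{lat2} {merged} {box}"
        )
        monthly_boxes.append(box)
    # Final merge of the 12 monthly boxes into one file for the year.
    commands.append(f"cdo mergetime {' '.join(monthly_boxes)} box_{year}.nc")
    return commands


# The box limits below are placeholders, not values from the question.
cmds = monthly_cdo_commands(
    2016, "-UoS-L2i-SSTskin-ISAR_002-D054_PtA-v01.0-fv01.5.nc", -10, 5, 45, 60
)
print("\n".join(cmds))
```

The commands are only printed here. Because they contain wildcards, they would need to be run through a shell, for example by pasting them into a terminal.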