
Correlate gridded data sets in Python

I have two monthly global gridded data sets of liquid water equivalent thickness with dimensions (time, lat, lon). Both have the same spatial and temporal resolution. I want to correlate them, but numpy.corrcoef() accepts only 1-D or 2-D input, not 3-D arrays. What I want is to correlate the full time series at each grid point (x, y) of one variable with the time series at the same grid point of the other, and then write the resulting grid of correlation coefficients to a new netCDF file.

import numpy as np
from netCDF4 import Dataset

wdir  = '.../Data/'

# read GRACE NCs
GRACE_GFZ = Dataset(wdir+'GRACE/GRCTellus.GFZ.200204_201607.nc','r')
GRACE_JPL = Dataset(wdir+'GRACE/GRCTellus.JPL.200204_201607.nc','r')

Both variables (gfz and jpl) have the same number of missing values, at the same locations.

GRACE_GFZ.variables['lwe_thickness']
   <type 'netCDF4._netCDF4.Variable'>
   float32 lwe_thickness(time, lat, lon)
      long_name: Liquid_Water_Equivalent_Thickness
      units: cm
      _FillValue: 32767.0
      missing_value: 32767.0
   unlimited dimensions: time
   current shape = (155, 72, 144)
   filling off

GRACE_JPL.variables['lwe_thickness']
   <type 'netCDF4._netCDF4.Variable'>
   float32 lwe_thickness(time, lat, lon)
      long_name: Liquid_Water_Equivalent_Thickness
      units: cm
      _FillValue: 32767.0
      missing_value: 32767.0
   unlimited dimensions: time
   current shape = (155, 72, 144)
   filling off

As they have the same temporal and spatial resolution, time, longitude and latitude from one can be used for both.

time = GRACE_GFZ.variables['time'][:]
lons = GRACE_GFZ.variables['lon'][:]
lats = GRACE_GFZ.variables['lat'][:]
gfz = GRACE_GFZ.variables['lwe_thickness'][:]
jpl = GRACE_JPL.variables['lwe_thickness'][:]
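
Before correlating, it may be worth confirming the assumption that both products share one array shape and one missing-value pattern. The following is only a minimal sketch on small synthetic masked arrays that stand in for gfz and jpl; same_grid_and_mask is a hypothetical helper name, not part of NumPy or netCDF4.

```python
import numpy as np

def same_grid_and_mask(a, b):
    """Return True if two arrays share shape and missing-value mask."""
    if a.shape != b.shape:
        return False
    # getmaskarray always returns a full boolean array, even for unmasked input
    return np.array_equal(np.ma.getmaskarray(a), np.ma.getmaskarray(b))

# Synthetic stand-ins for gfz and jpl with shape (time, lat, lon)
rng = np.random.default_rng(0)
fill = 32767.0                      # fill value reported in the file metadata
a = rng.normal(size=(5, 3, 4))
b = rng.normal(size=(5, 3, 4))
a[:, 0, 0] = fill                   # one grid cell missing in both products
b[:, 0, 0] = fill
a = np.ma.masked_values(a, fill)
b = np.ma.masked_values(b, fill)

print(same_grid_and_mask(a, b))
```

With the real data, the same call would be same_grid_and_mask(gfz, jpl), assuming netCDF4 returned masked arrays for both variables.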

Here I loop over the grid and want to store each correlation coefficient in an array. So far the loop only prints a 2x2 correlation matrix for every grid point.

test = []
for x in range(len(lats)):
   for y in range(len(lons)):
      print(np.corrcoef(gfz[:,x,y],jpl[:,x,y]))

How can I put the corrcoef of each point into a new array at the right spot?

I solved my problem with the following:

# Pre-allocate a (lat, lon) array of zeros for the correlation coefficients
corrcoefMatrix_gfzjpl = np.zeros((len(lats), len(lons)))
for x in range(len(lats)):
    for y in range(len(lons)):
        # np.corrcoef returns a 2x2 matrix; the off-diagonal entry is r(gfz, jpl)
        temp = np.corrcoef(gfz[:, x, y], jpl[:, x, y])
        corrcoefMatrix_gfzjpl[x, y] = temp[0, 1]

Basically I made a matrix containing zeros and replaced them with the correlation coefficient taken from the 2x2 corrcoef matrix. I did this for each grid cell by going through the lats and lons with one for loop each. Afterwards I created a new netCDF file and defined all dimensions and variables.

# Open the output file for writing ('corrcoef_gfzjpl.nc' is a placeholder name)
nc_data = Dataset('corrcoef_gfzjpl.nc', 'w')

nc_data.createDimension('lat', len(lats))
nc_data.createDimension('lon', len(lons))
nc_data.createDimension('data', 1)

latitudes  = nc_data.createVariable('latitude', 'f4', ('lat',))
longitudes = nc_data.createVariable('longitude', 'f4', ('lon',))
corrcoef_gfzjpl = nc_data.createVariable('corrcoef_gfzjpl', 'f4', ('lat', 'lon', 'data'), fill_value=-999.0)

latitudes.units = 'degree_north'
longitudes.units = 'degree_east'
# Reuse the coordinates read from the GRACE file so they match the data exactly
latitudes[:]  = lats
longitudes[:] = lons
# The variable is 3-D (lat, lon, data), so write the 2-D grid into its single 'data' slice
corrcoef_gfzjpl[:, :, 0] = corrcoefMatrix_gfzjpl

nc_data.close()

Suggestions for improvements are very welcome!
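
One possible improvement, offered as a sketch rather than a definitive implementation: the double loop can be replaced by computing Pearson's r along the time axis for all grid cells at once. The function name gridwise_corr and the demo arrays below are hypothetical; the code assumes both inputs have shape (time, lat, lon) and treats masked or NaN entries as missing, using only the time steps that are valid in both series.

```python
import numpy as np

def gridwise_corr(a, b):
    """Pearson r along axis 0 for two arrays of shape (time, lat, lon)."""
    # Turn masked entries (if any) into NaN so both inputs are plain float arrays
    a = np.ma.filled(np.ma.asarray(a).astype(float), np.nan)
    b = np.ma.filled(np.ma.asarray(b).astype(float), np.nan)
    valid = np.isfinite(a) & np.isfinite(b)      # time steps usable in both
    n = valid.sum(axis=0)                        # valid pairs per grid cell
    with np.errstate(invalid='ignore', divide='ignore'):
        mean_a = np.where(valid, a, 0.0).sum(axis=0) / n
        mean_b = np.where(valid, b, 0.0).sum(axis=0) / n
        da = np.where(valid, a - mean_a, 0.0)    # anomalies, 0 where unusable
        db = np.where(valid, b - mean_b, 0.0)
        cov = (da * db).sum(axis=0)
        r = cov / np.sqrt((da ** 2).sum(axis=0) * (db ** 2).sum(axis=0))
    return r                                     # NaN where r is undefined

# Synthetic stand-ins for gfz and jpl
rng = np.random.default_rng(42)
gfz_demo = rng.normal(size=(20, 3, 4))
jpl_demo = gfz_demo + 0.5 * rng.normal(size=(20, 3, 4))
gfz_demo[:, 0, 0] = np.nan                       # one cell missing in both
jpl_demo[:, 0, 0] = np.nan

r = gridwise_corr(gfz_demo, jpl_demo)
r_masked = np.ma.masked_invalid(r)               # mask NaN cells before writing
print(r.shape)  # (3, 4)
```

Cells without any valid pair come out as NaN. Wrapping the result with np.ma.masked_invalid, as in the last lines, gives a masked array, so those cells should be stored as the variable's fill value when the array is assigned to a netCDF variable.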
