
Calculate weighted mean of 3 arrays with NaNs in data (Python)

I have three two-dimensional arrays that represent geospatial data. Each array has shape (721, 1440), i.e., 721 latitude values and 1440 longitude values. I want to compute a weighted mean of these three arrays. Normally that is simple: sum(array*weight)/sum(weights). This works great except when the data contain NaNs.

In my specific case arr1 should have a weight of 0.7, arr2 0.2, and arr3 0.1. However, wherever there is a NaN, the mean also becomes NaN. In my case the only array with NaNs is arr3.

What I want is for the weighted mean to consist of only the first two arrays wherever arr3 is NaN, which would be (arr1*0.7 + arr2*0.2)/0.9. I tried using xr.where() to accomplish this, but for some reason it goes hog wild on my RAM and crashes my kernel every single time. Are there any other ways to accomplish this task?

You can use np.nansum() and np.isnan():

import numpy as np

# Dummy example
x = np.ones((5,5))
y = np.ones((5,5))*2
x[0,0] = np.nan

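# Stack the arrays along a new leading axis: shape (2, 5, 5)
stack = np.stack((x, y))
# Per-array weights, broadcast over the grid and zeroed wherever the data is NaN
weight = np.array([0.2, 0.8])[:, None, None] * ~np.isnan(stack)
# Weighted mean that ignores NaNs: NaN cells add nothing to either sum
res = np.nansum(stack * weight, axis=0) / weight.sum(axis=0)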
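
Applied to the case in the question (three arrays of shape (721, 1440) with weights 0.7, 0.2 and 0.1), the same idea might look like the sketch below. The helper name nan_weighted_mean and the random test data are assumptions made for illustration; they are not part of the original question or answer. Cells where every input is NaN are returned as NaN rather than dividing by zero.

```python
import numpy as np

def nan_weighted_mean(arrays, weights):
    """Weighted mean across same-shaped arrays, ignoring NaN cells.

    Hypothetical helper for illustration only.
    """
    stack = np.stack(arrays)  # shape (n, lat, lon)
    # Reshape the weights to (n, 1, 1, ...) so they broadcast over the grid
    w = np.asarray(weights, dtype=float).reshape(-1, *([1] * (stack.ndim - 1)))
    # Effective weight is 0 wherever the data is NaN
    w_eff = w * ~np.isnan(stack)
    num = np.nansum(stack * w_eff, axis=0)
    den = w_eff.sum(axis=0)
    # Where no array has valid data, den is 0: return NaN there
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(den > 0, num / den, np.nan)

# Stand-in data with the shapes from the question (values are random)
rng = np.random.default_rng(0)
arr1 = rng.random((721, 1440))
arr2 = rng.random((721, 1440))
arr3 = rng.random((721, 1440))
arr3[0, 0] = np.nan  # only arr3 contains NaNs, as in the question

result = nan_weighted_mean([arr1, arr2, arr3], [0.7, 0.2, 0.1])
```

At result[0, 0] this reduces to (arr1*0.7 + arr2*0.2)/0.9, and everywhere else it is the ordinary three-way weighted mean. Everything is a single vectorized NumPy pass over a few arrays the size of the stack, so memory use stays predictable.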
