
Calculate running average of data recursively

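I have two 2-D matrices, A and B, where rows represent trials and columns represent the samples collected during a trial.

I am in a scenario where A is already available but B is being collected in real time. I want to calculate the running average of {A and the available data of B} as B is being sampled. I figured I could do this by computing a weighted average of A and B and updating the weights as trials and samples of B are collected. Specifically, I think I can update the weights and recursively reuse the value I already saved from the previous iteration. Below are my code and the output plot: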

close all;
clear all;

%define the sizes of the matrices -- exact numbers aren't important for illustration
n1 = 5;
n2 = 10;
n3 = 12;

%define a matrix that will act as the history of data already collected
A = randi(10,[n2,n1]);
A_avg = mean(A,1); %averaged across n2 trials to get n1 values

%current acts as "incoming" data
B = randi(10,[n3,n1]); %n3 trials, n1 samples per trial

%preallocate matrices for final solutions
correct_means = zeros(n3,n1);
estimated_means = zeros(n3,n1);

for k1=1:size(B,1) %loop through trials
    %get running average in the case where we already have all samples
    correct_means(k1,:) = mean([A;B([1:k1],:)],1);
    for k2=1:size(B,2) %k2 should loop through samples
        %calculate averages as samples are incoming recursively (weighted averaging)
        if k1>1
            estimated_means(k1,k2) = (n2 / (n2+k1)) * A_avg(k2)...
                + ((k1-1)/(n2+k1)) * estimated_means(k1-1,k2) + (1/(n2+k1)) * B(k1,k2);
        elseif k1==1
            estimated_means(k1,k2) = (n2 / (n2+k1)) * A_avg(k2)...
                + ((k1-1)/(n2+k1)) * estimated_means(k1,k2) + (1/(n2+k1)) * B(k1,k2);       
        end
%         if k1==2, keyboard; end
    end
end

%plot the results
figure; hold on;
plot(nan, 'b', 'displayname', 'correct solution');
plot(nan, 'k--', 'displayname', 'my solution');
leg_tmp = legend('show');
set(leg_tmp,'Location','Best');

plot(correct_means, 'b', 'displayname', 'correct solution');
plot(estimated_means, 'k--', 'displayname', 'my solution');

ylabel('running averages');
xlabel('samples');

[Plot: running averages, with the correct solution in blue and my solution in dashed black]

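The attached plot shows my attempted solution (black) together with what I believe is the correct answer (blue). Note that I only plot the averages after all samples of all trials have been acquired, but the running averages are saved while the data are being collected. As you can see, my answer appears to be slightly off.

My reasoning is that A_avg should be weighted by the fraction of the total number of trials that went into it, where the total is counted at the moment the current trial of B is collected. Likewise, the current sample of B is weighted by 1 divided by the total number of trials at that iteration, and the previous samples of B enter recursively through the saved value and are weighted accordingly. These weights sum to 1 and make sense to me, so I am having a hard time seeing where I went wrong.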
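For reference, the weighting described above can be written out explicitly. The notation is introduced here only for illustration and is not part of the original post: $\bar{A}(j)$ stands for `A_avg(j)`, $\hat{m}_k(j)$ for `estimated_means(k,j)`, and $m_k(j)$ for `correct_means(k,j)`, with $k$ the trial index in B and $j$ the sample index.

```latex
% Update implemented in the loop above (for k = 1 the middle term vanishes):
\hat{m}_k(j) = \frac{n_2}{n_2 + k}\,\bar{A}(j)
             + \frac{k-1}{n_2 + k}\,\hat{m}_{k-1}(j)
             + \frac{1}{n_2 + k}\,B(k,j)

% Target it is compared against (the batch mean over A and the first k rows of B):
m_k(j) = \frac{1}{n_2 + k}\left(\sum_{i=1}^{n_2} A(i,j) + \sum_{i=1}^{k} B(i,j)\right)
```

The three weights do sum to one, since $n_2 + (k-1) + 1 = n_2 + k$.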

Can anyone see where I went wrong?

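Keep in mind that the longer the code, the more bugs it tends to accumulate.

I apologize in advance: for me it is easier to rewrite the business logic than to find the error in your code, so while the result may not be exactly what you were after, it does provide a fix. I hope you will find it useful nonetheless.

Please see a slightly simplified version that appears to produce the correct results: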

function q60180320
rng(60180320); % For reproducibility

% define the sizes of the matrices
n1 = 5;
n2 = 10;
n3 = 12;

% define a matrix that will act as the history of data already collected
A = randi(10,[n2,n1]);
A_avg = mean(A,1); %averaged across n2 trials to get n1 values

% current acts as "incoming" data
B = randi(10,[n3,n1]); %n3 trials, n1 samples per trial
correct_means = cumsum([A;B],1)./(1:n2+n3).'; correct_means = correct_means(n2+1:end,:);

% preallocate matrices
estimated_means = zeros(n3+1,n1);
estimated_means(1,:) = A_avg; % trick to avoid an if-clause inside the loop

for k1 = 1:size(B,1) % Loop through trials
  %% Compute weights:
  totalRows = n2 + k1;
  W_old = (totalRows - 1)./totalRows;
  W_new = 1/totalRows;

  %% Incoming measurement (assuming an entire row of B becomes available at a time)
  newB = B(k1,:); 

  %% Compute running average
  estimated_means(k1+1,:) = W_old * estimated_means(k1,:) + W_new * newB;
end
estimated_means = estimated_means(2:end,:); % remove the first row;

% plot the results
figure; hold on;
plot(nan, 'k', 'displayname', 'correct solution');
plot(nan, 'w--', 'displayname', 'my solution');
leg_tmp = legend('show');
set(leg_tmp,'Location','EastOutside');

plot(correct_means, 'k', 'displayname', 'correct solution', 'LineWidth', 2);
plot(estimated_means, 'w--', 'displayname', 'my solution');

ylabel('running averages');
xlabel('samples');

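One plausible reading of why the question's update drifts (an interpretation, not something stated in the original post) is that the recursed value, `estimated_means(k1-1,k2)`, is already the combined mean of A and the earlier rows of B. Adding `A_avg` again with weight `n2/(n2+k1)` then counts A twice, even though the weights sum to 1. The sketch below compares the question's update against the batch mean and against two corrected recursions. The names `reference`, `original`, `fixedAll`, `fixedB`, `prevOriginal`, `prevAll` and `B_avg` are illustrative and do not appear in the original code, and the seed is arbitrary.

```matlab
% Sketch: compare the question's update with two corrected recursions.
% n1, n2, n3, A, B, A_avg mirror the names used above; the rest is illustrative.
rng(0);                          % arbitrary seed, for reproducibility only
n1 = 5; n2 = 10; n3 = 12;
A = randi(10, [n2, n1]);         % history: n2 trials, n1 samples each
B = randi(10, [n3, n1]);         % "incoming" data: n3 trials
A_avg = mean(A, 1);

reference = zeros(n3, n1);       % batch mean of [A; B(1:k,:)]
original  = zeros(n3, n1);       % update as written in the question
fixedAll  = zeros(n3, n1);       % recursion on the combined mean
fixedB    = zeros(n3, n1);       % question's weights, recursing on the mean of B only

prevOriginal = zeros(1, n1);     % unused at k = 1 because its weight is zero
prevAll      = A_avg;            % combined mean before any row of B arrives
B_avg        = zeros(1, n1);     % running mean of the rows of B seen so far

for k = 1:n3
    b = B(k, :);                 % newly arrived trial
    N = n2 + k;                  % total number of trials including this one

    reference(k, :) = mean([A; B(1:k, :)], 1);

    % Question's update: prevOriginal already contains A's contribution.
    original(k, :) = (n2/N)*A_avg + ((k-1)/N)*prevOriginal + (1/N)*b;
    prevOriginal = original(k, :);

    % Fix 1: give the previous combined mean the weight (N-1)/N.
    fixedAll(k, :) = ((N-1)/N)*prevAll + (1/N)*b;
    prevAll = fixedAll(k, :);

    % Fix 2: same weights as the question, but the recursed term is the
    % mean of the previous rows of B alone.
    fixedB(k, :) = (n2/N)*A_avg + ((k-1)/N)*B_avg + (1/N)*b;
    B_avg = ((k-1)/k)*B_avg + (1/k)*b;
end

errOriginal = max(abs(original(:) - reference(:)));
errFixedAll = max(abs(fixedAll(:) - reference(:)));
errFixedB   = max(abs(fixedB(:)   - reference(:)));
fprintf('max abs error: original %.3g, fix 1 %.3g, fix 2 %.3g\n', ...
        errOriginal, errFixedAll, errFixedB);
```

Fix 1 is the same recursion as in the function above. Fix 2 suggests that the question's weights are consistent on their own, provided the recursed quantity is the running mean of B rather than the combined estimate.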

