Calculate running average of data recursively
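I have two 2D matrices, A and B, where the rows represent trials and the columns represent samples collected during a trial.
I am in a scenario where A is already available but B is being collected in real time. I would like to compute the running average of {A and the data of B available so far} while B is being sampled. I think I can do this by computing a weighted average of A and B and updating the weights as trials and samples of B are collected. Specifically, I think I can update the weights and recursively reuse the value I saved from the previous iteration. Below are my code and the output figure: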
close all;
clear all;
%define the sizes of the matrices -- exact numbers aren't important for illustration
n1 = 5;
n2 = 10;
n3 = 12;
%define a matrix that will act as the history of data already collected
A = randi(10,[n2,n1]);
A_avg = mean(A,1); %averaged across n2 trials to get n1 values
%current acts as "incoming" data
B = randi(10,[n3,n1]); %n3 trials, n1 samples per trial
%preallocate matrices for final solutions
correct_means = zeros(n3,n1);
estimated_means = zeros(n3,n1);
for k1=1:size(B,1) %loop through trials
    %get running average in the case where we already have all samples
    correct_means(k1,:) = mean([A;B([1:k1],:)],1);
    for k2=1:size(B,2) %k2 should loop through samples
        %calculate averages as samples are incoming recursively (weighted averaging)
        if k1>1
            estimated_means(k1,k2) = (n2 / (n2+k1)) * A_avg(k2)...
                + ((k1-1)/(n2+k1)) * estimated_means(k1-1,k2) + (1/(n2+k1)) * B(k1,k2);
        elseif k1==1
            estimated_means(k1,k2) = (n2 / (n2+k1)) * A_avg(k2)...
                + ((k1-1)/(n2+k1)) * estimated_means(k1,k2) + (1/(n2+k1)) * B(k1,k2);
        end
        % if k1==2, keyboard; end
    end
end
%plot the results
figure; hold on;
plot(nan, 'b', 'displayname', 'correct solution');
plot(nan, 'k--', 'displayname', 'my solution');
leg_tmp = legend('show');
set(leg_tmp,'Location','Best');
plot(correct_means, 'b', 'displayname', 'correct solution');
plot(estimated_means, 'k--', 'displayname', 'my solution');
ylabel('running averages');
xlabel('samples');
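The attached plot shows my attempted solution (black) alongside what I believe is the correct answer (blue). Note that I only plot the averages after all samples from all trials have been acquired, but I saved the running averages as the data were being collected. As you can see, my answer seems to be slightly off.
My thinking is that A should be weighted by the fraction of the total number of trials that it accounts for at the point where B is being collected. Similarly, the weight of the current sample of B is just 1 divided by the current total number of trials in that iteration, and the previous samples of B are brought in recursively and weighted accordingly. These weights add up to 1 and make sense to me, so I am having a hard time seeing where I went wrong.
Can anyone see where I messed up?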
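Keep in mind that the longer the code, the more bugs it tends to accumulate.
I apologize in advance: for me, rewriting the business logic is easier than finding the bug in the code, so although the result may not be exactly what you wanted, it does provide a fix. I hope you still find it useful.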
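For reference, a rewrite of this kind can rest on the usual recursive form of the mean. The short derivation below is only a sketch: the symbols $N_k$, $m_k$, $b_k$ and $\bar{a}$ are notation introduced here (they do not appear in the question), with $\bar{a}$ standing for `A_avg`, $b_k$ for the k-th row of `B`, and $m_k$ for the running mean after k rows of `B` have arrived.

```latex
\begin{aligned}
N_k &= n_2 + k, \qquad m_0 = \bar{a} \\
m_k &= \frac{1}{N_k}\Bigl(n_2\,\bar{a} + \sum_{j=1}^{k} b_j\Bigr)
     = \frac{1}{N_k}\bigl(N_{k-1}\, m_{k-1} + b_k\bigr) \\
    &= \frac{N_k - 1}{N_k}\, m_{k-1} + \frac{1}{N_k}\, b_k
\end{aligned}
```

Because $m_{k-1}$ already contains the contribution of A, `A_avg` enters only once, as the starting value $m_0$. An update of the form $\frac{n_2}{N_k}\bar{a} + \frac{k-1}{N_k}\hat{m}_{k-1} + \frac{1}{N_k}b_k$ would be exact only if $\hat{m}_{k-1}$ were the mean of the first $k-1$ rows of B alone. If it is fed the combined estimate instead, A is counted again at every step, even though the three weights still sum to 1.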
Here is a slightly simplified version that seems to produce the correct result:
function q60180320
rng(60180320); % For reproducibility
% define the sizes of the matrices
n1 = 5;
n2 = 10;
n3 = 12;
% define a matrix that will act as the history of data already collected
A = randi(10,[n2,n1]);
A_avg = mean(A,1); %averaged across n2 trials to get n1 values
% current acts as "incoming" data
B = randi(10,[n3,n1]); %n3 trials, n1 samples per trial
correct_means = cumsum([A;B],1)./(1:n2+n3).'; correct_means = correct_means(n2+1:end,:);
% preallocate matrices
estimated_means = zeros(n3+1,n1);
estimated_means(1,:) = A_avg; % trick to avoid an if-clause inside the loop
for k1 = 1:size(B,1) % Loop through trials
    %% Compute weights:
    totalRows = n2 + k1;
    W_old = (totalRows - 1)./totalRows;
    W_new = 1/totalRows;
    %% Incoming measurement (assuming an entire row of B becomes available at a time)
    newB = B(k1,:);
    %% Compute running average
    estimated_means(k1+1,:) = W_old * estimated_means(k1,:) + W_new * newB;
end
estimated_means = estimated_means(2:end,:); % remove the first row;
% plot the results
figure; hold on;
plot(nan, 'k', 'displayname', 'correct solution');
plot(nan, 'w--', 'displayname', 'my solution');
leg_tmp = legend('show');
set(leg_tmp,'Location','EastOutside');
plot(correct_means, 'k', 'displayname', 'correct solution', 'LineWidth', 2);
plot(estimated_means, 'w--', 'displayname', 'my solution');
ylabel('running averages');
xlabel('samples');
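If the history of running averages is not needed, the same update can probably be kept in place, storing only the current mean and the number of rows seen so far. The following is a minimal sketch under that assumption, not part of the code above: the names `m`, `N` and `max_err` are made up for illustration, the matrix sizes simply mirror the example, and no random seed is set because the check does not depend on the particular values.

```matlab
% Streaming sketch: keep only the current mean and the row count.
A = randi(10, [10, 5]);        % history: 10 trials x 5 samples
B = randi(10, [12, 5]);        % "incoming" data: 12 trials x 5 samples

m = mean(A, 1);                % start from the mean of the history
N = size(A, 1);                % number of rows averaged so far

for k = 1:size(B, 1)
    N = N + 1;                 % one more trial has arrived
    % Incremental form of ((N-1)/N)*m + (1/N)*B(k,:)
    m = m + (B(k, :) - m) / N;
end

% After the loop, m should agree with the batch mean over all rows
% up to floating-point rounding.
max_err = max(abs(m - mean([A; B], 1)));
fprintf('max abs difference: %g\n', max_err);
```

The incremental form `m + (b - m)/N` is algebraically the same as the `W_old`/`W_new` weighting. It just avoids computing the two weights separately.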