
Compute mean of columns for groups of rows in Octave

I have a matrix, for example:

1 2
3 4
4 5

And I also have a rule of grouping the rows, which is defined as a vector of group IDs like this:

1
2
1

This means that the first and third rows belong to the same group (ID 1) and the second row belongs to another group (ID 2). I would like to compute the column-wise mean for each group. Here is the result for my example:

2.5 3.5
3 4

More formally, there is a matrix A of size (m, n), a number of groups k, and a vector v of size (m, 1) whose values are integers in the range from 1 to k. The result is a matrix R of size (k, n), where the row with index r holds the column-wise mean of the rows of A that belong to group r.

Here is my solution (which does what I need) using a for-loop in Octave:

R = zeros(k, n);
for r = 1:k
    R(r, :) = mean(A((v == r), :), 1);
end

I wonder whether this can be vectorized. I would like to replace the for-loop with a vectorized solution, which I expect to be much more efficient than the iterative one.

Here is one of my many attempts (which do not work) to solve the problem in a vectorized way:

R = mean(A((v == 1:k), :);

As long as your data is floating point, you can do it manually: compute the per-group sums yourself and then divide by the group sizes, using accumdim. Like so:

octave:1> A = [1 2; 3 4; 4 5];
octave:2> subs = [1; 2; 1];
octave:3> accumdim (subs, A) ./ accumdim (subs, ones (rows (subs), 1))
ans =

   2.5000   3.5000
   3.0000   4.0000
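The accumdim approach above can be checked against the original for-loop. The following is only a minimal sketch using the example data from the question; the variable names sums, counts and R_loop are introduced here for illustration and are not part of the original answer.

```octave
A = [1 2; 3 4; 4 5];   % example data from the question
v = [1; 2; 1];         % group ID of each row
k = max(v);            % assumption: group IDs run from 1 to max(v)

% Vectorized: per-group column sums divided by per-group row counts
sums   = accumdim (v, A);                    % k-by-n matrix of sums
counts = accumdim (v, ones (rows (v), 1));   % k-by-1 vector of group sizes
R = sums ./ counts;                          % broadcasting divides each row

% Reference: the for-loop from the question
R_loop = zeros (k, columns (A));
for r = 1:k
  R_loop(r, :) = mean (A(v == r, :), 1);
end

disp (R)
disp (isequal (R, R_loop))
```

If some ID between 1 and max(v) never occurs, its count is zero, so the corresponding row of R should come out as NaN, just as the mean of an empty selection would.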

You can treat this as a matrix multiplication problem. For your example, it corresponds to:

A = [1 2; 3 4; 4 5];
B = [0.5,0,0.5;0,1,0];

C = B*A

The main issue is to construct B from your list of indices efficiently. My suggestion is to use the implicit expansion (broadcasting) of ==.

A = [1 2; 3 4; 4 5]; % Input data
idx = [1;2;1]; % Input Grouping

k = 2; % number of groups, ( = max(idx) )
m = 3; % Number of "observations"
Btmp = (idx == 1:k)'; % Mark locations
B = Btmp ./sum(Btmp,2); % Normalise
C = B*A

C =

    2.5000    3.5000
    3.0000    4.0000
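The dense B above has k*m entries, most of them zero, which may become wasteful when there are many groups and many rows. A possible variant, sketched here as an assumption and not taken from the original answer, builds the same indicator matrix with sparse and normalises after the multiplication. The names S and counts are illustrative.

```octave
A   = [1 2; 3 4; 4 5];   % input data
idx = [1; 2; 1];         % input grouping
k   = max (idx);         % number of groups
m   = rows (A);          % number of observations

% S(g, i) = 1 if row i of A belongs to group g, 0 otherwise
S = sparse (idx, (1:m)', 1, k, m);

counts = full (sum (S, 2));      % k-by-1 vector of group sizes
C = full (S * A) ./ counts;      % per-group sums divided by group sizes

disp (C)
```

Whether this is actually faster than the dense version depends on the sizes involved, so it is worth timing both on your real data.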
