
How to determine the mean of the last e.g. 100 non-zero numbers in each column of a matrix in MATLAB

I would like to calculate the mean of the last (e.g. 3) non-zero numbers in each column of a matrix in MATLAB. The columns were padded with zeros at the end to create vectors of the same length.

Example matrix:

A = [5 6 3 5 6 8 9;
     1 2 3 5 4 7 6;
     0 1 2 3 4 5 6; 
     0 0 1 2 3 4 5; 
     0 0 0 1 2 3 4; 
     0 0 0 0 2 3 4;
     0 0 0 0 2 3 4; 
     0 0 0 0 0 0 3]

There may be a more efficient solution, but one way is to use sum to count the non-zero entries in each column. Because the zeros only pad the end of each column, that count is also the row index of the last non-zero entry. Then loop over the columns with arrayfun and, for each column, average the (up to) N rows that end at that row.

%// Number of elements to average
N = 3;

%// Last non-zero row in each column
lastrow = sum(A ~= 0, 1);

%// Ensure that we don't have any indices less than 1
startrow = max(lastrow - N + 1, 1);

%// Compute the mean for each column using the specified rows
means = arrayfun(@(k)mean(A(startrow(k):lastrow(k),k)), 1:size(A, 2));

Example

For your example data this would yield:

3.0000    3.0000    2.0000    2.0000    2.0000    3.0000    3.6667
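To see where these numbers come from, it can help to print the intermediate row indices. The following is a small sketch on the example matrix from the question; it relies on the stated assumption that zeros occur only as trailing padding, so the non-zero count of a column equals the row of its last non-zero entry.

```matlab
%// Example matrix from the question (columns are zero-padded at the bottom)
A = [5 6 3 5 6 8 9;
     1 2 3 5 4 7 6;
     0 1 2 3 4 5 6;
     0 0 1 2 3 4 5;
     0 0 0 1 2 3 4;
     0 0 0 0 2 3 4;
     0 0 0 0 2 3 4;
     0 0 0 0 0 0 3];
N = 3;

%// Non-zero count per column; this is the last non-zero row only
%// because the zeros appear solely as trailing padding
lastrow = sum(A ~= 0, 1)             %// 2 3 4 5 7 7 8

%// First row of each averaging window, clamped so it never drops below 1
startrow = max(lastrow - N + 1, 1)   %// 1 1 2 3 5 5 6
```

Column 1 holds only two non-zero values, so its window is rows 1 to 2 and its mean is (5 + 1) / 2 = 3 rather than an average over three rows.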

UPDATE: An Alternative

An alternative approach is to let convolution do the work. A mean can be computed with a convolution kernel. If you want the mean of every run of 3 consecutive rows of a matrix, the kernel would be:

kernel = [1; 1; 1] ./ 3;

When convolved with the matrix of interest, this computes the average of every 3-row window of the input matrix. Windows that extend past the top or bottom edge are padded with zeros.

B = [1 2 3;
     4 5 6;
     7 8 9];

conv2(B, kernel)

    0.3333    0.6667    1.0000
    1.6667    2.3333    3.0000
    4.0000    5.0000    6.0000
    3.6667    4.3333    5.0000
    2.3333    2.6667    3.0000
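The output has five rows rather than three because conv2 returns the 'full' convolution by default, which has size(B, 1) + N - 1 rows. Row r holds the sum of input rows r-N+1 through r, with rows outside the matrix counted as zero, divided by N. The sketch below checks this on B; the names C and partialMean are only illustrative.

```matlab
B = [1 2 3;
     4 5 6;
     7 8 9];
kernel = [1; 1; 1] ./ 3;

%// Default 'full' shape: size(B, 1) + numel(kernel) - 1 = 5 rows
C = conv2(B, kernel);
size(C)

%// Row 3 is the only window that overlaps all three rows of B,
%// so it matches the plain column means (up to rounding)
fullWindowError = max(abs(C(3, :) - mean(B, 1)))

%// Row 2 covers only rows 1:2 of B but still divides by 3;
%// rescaling by 3/2 recovers the mean of those two rows
partialMean = C(2, :) .* 3 ./ 2
```

This zero padding is why the convolution result has to be rescaled whenever fewer than N non-zero values are available in a column.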

In the example below, I apply this to A and then keep only the values at the rows we care about (where the average is composed solely of the last N non-zero values in each column).

%// Find the last non-zero entry in each column
lastrow = sum(A ~= 0, 1);

%// Use convolution to compute the mean for every N rows
%// This will be applied to ALL of A
convmean = conv2(A, ones(N, 1)./N);

%// Select only the means that we care about
%// Because of the padding of CONV2, these will live at the rows
%// stored in LASTROW
means = convmean(sub2ind(size(convmean), lastrow, 1:size(A, 2)));

%// Now correct for cases where fewer than N samples were averaged
means = (means * N) ./ min(lastrow, N);

And the output, again, is the same:

3.0000    3.0000    2.0000    2.0000    2.0000    3.0000    3.6667
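As a sanity check, the two approaches can be run side by side on the example matrix and compared with a small tolerance. This is only a sketch: the variable names means1, means2, raw1 and agree are illustrative, and it assumes every column contains at least one non-zero value, since a column of all zeros would give lastrow = 0, which is not a valid row index.

```matlab
A = [5 6 3 5 6 8 9;
     1 2 3 5 4 7 6;
     0 1 2 3 4 5 6;
     0 0 1 2 3 4 5;
     0 0 0 1 2 3 4;
     0 0 0 0 2 3 4;
     0 0 0 0 2 3 4;
     0 0 0 0 0 0 3];
N = 3;
lastrow = sum(A ~= 0, 1);

%// Method 1: explicit window per column
startrow = max(lastrow - N + 1, 1);
means1 = arrayfun(@(k) mean(A(startrow(k):lastrow(k), k)), 1:size(A, 2));

%// Method 2: convolution, then rescale columns with fewer than N values
convmean = conv2(A, ones(N, 1) ./ N);
idx = sub2ind(size(convmean), lastrow, 1:size(A, 2));
means2 = (convmean(idx) * N) ./ min(lastrow, N);

%// Column 1 has only two non-zero values (5 and 1): the raw convolution
%// gives (5 + 1 + 0) / 3 = 2, and rescaling by 3/2 recovers the mean 3
raw1 = convmean(lastrow(1), 1);

%// The two results should agree up to floating-point rounding
agree = all(abs(means1 - means2) < 1e-12)
```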

Comparison

I ran a quick test script to compare the performance of these two methods on square matrices of increasing size. In that test, the convolution-based approach was clearly much faster.

[Figure: execution time (seconds) versus dimension of A for the arrayfun and convolution approaches]

Here is the full test script.

function benchmark()
    dims = round(linspace(1, 1000, 100));

    times1 = zeros(size(dims));
    times2 = zeros(size(dims));

    N = 3;

    for k = 1:numel(dims)
        %// Upper-triangular random matrix: column j has j non-zero entries
        A = triu(rand(dims(k)));
        %// Time both approaches on the same input
        times1(k) = timeit(@()test_arrayfun(N, A));
        times2(k) = timeit(@()test_convolution(N, A));
    end

    figure;
    plot(dims, times1);
    hold on
    plot(dims, times2);

    legend({'arrayfun', 'convolution'})
    xlabel('Dimension of A')
    ylabel('Execution Time (seconds)')
end

function test_arrayfun(N, A)
    %// Last non-zero row in each column
    lastrow = sum(A ~= 0, 1);

    %// Ensure that we don't have any indices less than 1
    startrow = max(lastrow - N + 1, 1);

    %// Compute the mean for each column using the specified rows
    means = arrayfun(@(k)mean(A(startrow(k):lastrow(k),k)), 1:size(A, 2));
end

function test_convolution(N, A)
    %// Find the last non-zero entry in each column
    lastrow = sum(A ~= 0, 1);

    %// Use convolution to compute the mean for every N rows
    %// This will be applied to ALL of A
    convmean = conv2(A, ones(N, 1)./N);

    %// Select only the means that we care about
    %// Because of the padding of CONV2, these will live at the rows
    %// stored in LASTROW
    means = convmean(sub2ind(size(convmean), lastrow, 1:size(A, 2)));

    %// Now correct for cases where fewer than N samples were averaged
    means = (means * N) ./ min(lastrow, N);
end
