
Matlab/Octave - use cellfun to index one matrix with another

I have a cell array containing an arbitrary number of matrices, say a = {[300x20], ..., [300x20]}; . I have another cell array of the same format, call it b , that contains logical matrices marking the positions of the NaN terms in a .

I want to use cellfun to loop through the cell array and set the NaN terms to 0, i.e. a(b) = 0 .

Thanks, j

You could define a function that replaces any NaN with zero.

function a = nan2zero(a)
  a(isnan(a)) = 0;

Then you can use cellfun to apply this function to your cell array.

a0 = cellfun(@nan2zero, a, 'UniformOutput', 0)

That way, you don't even need any matrices b .
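If you would rather not create a separate function file, the same replacement can be sketched with an anonymous function. Anonymous functions cannot contain an indexed assignment, so this version calls subsasgn directly, which is the functional form of x(mask) = 0. The toy data and the name nan2zero_anon are made up for illustration.

```matlab
% Toy data (hypothetical): two small matrices containing NaNs
a = {[1 NaN; 3 4], [NaN 2; NaN 5]};

% subsasgn(x, substruct('()', {mask}), 0) performs x(mask) = 0 and returns x
nan2zero_anon = @(x) subsasgn(x, substruct('()', {isnan(x)}), 0);

% Apply it to every matrix in the cell array
a0 = cellfun(nan2zero_anon, a, 'UniformOutput', false);
```

The named function is easier to read, so this is mainly useful for quick experiments at the command line.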

First, you should probably give the tick to @s.bandara, as that was the first correct answer and it used cellfun (as you requested). Do NOT give it to this answer. The purpose of this answer is to provide some additional analysis.

I thought I'd look into the efficiency of some of the possible approaches to this problem.

The first approach is the one advocated by @s.bandara.

The second approach is similar to the one advocated by @s.bandara, but it uses b to convert nan to 0 , rather than using isnan . In theory, this method may be faster, since nothing is assigned to b inside the function, so it should be treated "By Ref".

The third approach uses an explicit loop instead of cellfun , since cellfun is often slower than an explicit loop.

The results of a quick speed test are:

Elapsed time is 3.882972 seconds. %# First approach (a, isnan, and cellfun, eg @s.bandara)
Elapsed time is 3.391190 seconds. %# Second approach (a, b, and cellfun)
Elapsed time is 3.041992 seconds. %# Third approach (loop-based solution)

In other words, there are (small) savings to be made by passing b in rather than using isnan . And there are further (small) savings to be made by using a loop rather than cellfun . But I wouldn't lose sleep over it. Remember, the results of any simulation are specific to the specified inputs.

Note that these results were consistent across several runs. I used tic and toc for the timing, albeit with many loops over each method. To be really thorough, I should use timeit from FEX. If anyone is interested, the code for the three methods follows:

%# Build some example matrices
T = 1000; N = 100; Q = 50; M = 100;
a = cell(1, Q); b = cell(1, Q);
for q = 1:Q
    a{q} = randn(T, N);
    b{q} = logical(randi(2, T, N) - 1);
    a{q}(b{q}) = nan;
end

%# Solution using a, isnan, and cellfun (@s.bandara solution)
tic
for m = 1:M
    Soln1 = cellfun(@f1, a, 'UniformOutput', 0);
end
toc

%# Solution using a, b, and cellfun
tic
for m = 1:M
    Soln2 = cellfun(@f2, a, b, 'UniformOutput', 0);
end
toc


%# Solution using a loop to avoid cellfun
tic
for m = 1:M
    Soln3 = cell(1, Q);
    for q = 1:Q
        Soln3{q} = a{q};
        Soln3{q}(b{q}) = 0;
    end
end
toc

%# Solution proposed by @EitanT
[K, N] = size(a{1});
tic
for m = 1:M
    a0 = [a{:}];       %// Concatenate matrices along the 2nd dimension
    a0(isnan(a0)) = 0; %// Replace NaNs with zeroes    
    Soln4 = mat2cell(a0, K, N * ones(size(a)));
end
toc

where:

function x1 = f1(x1)
x1(isnan(x1)) = 0;

and:

function x1 = f2(x1, x2)
x1(x2) = 0;
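As a rough sketch of what a timeit-based measurement could look like, the snippet below times only the isnan call that the first approach performs and the second approach avoids. It assumes a MATLAB release where timeit is built in (R2013b or later, as far as I know; older releases would need the FEX version, and I have not checked Octave). The sizes are arbitrary and kept small so it runs quickly.

```matlab
% Small example inputs (sizes are an assumption, chosen to keep the run short)
T = 200; N = 20; Q = 10;
a = cell(1, Q);
for q = 1:Q
    a{q} = randn(T, N);
    a{q}(rand(T, N) < 0.5) = nan;   % blank out roughly half of the entries
end

% timeit runs the handle several times and returns a typical run time in seconds
t_isnan_only = timeit(@() cellfun(@isnan, a, 'UniformOutput', false));
fprintf('isnan over all cells: %.6f s\n', t_isnan_only);
```

The same pattern applies to the full methods: wrap each one in a zero-argument function handle and pass it to timeit.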

UPDATE: A fourth approach has been suggested by @EitanT. This approach concatenates the cell array of matrices into one large matrix, performs the operation on the large matrix, then optionally converts it back to a cell array. I have added the code for this procedure to my testing routine above. For the inputs specified in my testing code, ie T = 1000 , N = 100 , Q = 50 , and M = 100 , the timed run is as follows:

Elapsed time is 3.916690 seconds. %# @s.bandara
Elapsed time is 3.362319 seconds. %# a, b, and cellfun
Elapsed time is 2.906029 seconds. %# loop-based solution
Elapsed time is 4.986837 seconds. %# @EitanT

I was somewhat surprised by this as I thought the approach of @EitanT would yield the best results. On paper, it seems extremely sensible. Note, we can of course mess around with the input parameters to find specific settings that advantage different solutions. For example, if the matrices are small, but the number of them is large, then the approach of @EitanT does well, eg T = 10 , N = 5 , Q = 500 , and M = 100 yields:

Elapsed time is 0.362377 seconds. %# @s.bandara
Elapsed time is 0.299595 seconds. %# a, b, and cellfun
Elapsed time is 0.352112 seconds. %# loop-based solution
Elapsed time is 0.030150 seconds. %# @EitanT

Here the approach of @EitanT dominates.

For the scale of the problem indicated by the OP, I found that the loop-based solution usually had the best performance. However, for some Q , e.g. Q = 5 , the solution of @EitanT managed to edge ahead.
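Before reading too much into the timings, it is worth confirming that the approaches return the same thing. A minimal sketch on toy data, comparing the loop-based solution with the concatenation approach (the variable names loopSoln and catSoln are made up for this check):

```matlab
% Toy inputs: Q matrices of equal size with NaNs placed by logical masks
T = 4; N = 3; Q = 5;
a = cell(1, Q); b = cell(1, Q);
for q = 1:Q
    a{q} = randn(T, N);
    b{q} = rand(T, N) < 0.5;
    a{q}(b{q}) = nan;
end

% Loop-based solution using the masks in b
loopSoln = cell(1, Q);
for q = 1:Q
    loopSoln{q} = a{q};
    loopSoln{q}(b{q}) = 0;
end

% Concatenation approach using isnan
a0 = [a{:}];
a0(isnan(a0)) = 0;
catSoln = mat2cell(a0, T, N * ones(1, Q));

% True when both approaches produce identical cell arrays
same = isequal(loopSoln, catSoln);
```

This only holds when b marks exactly the NaN positions in a , which is how the test data are built.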

Hmm.

Given the nature of the contents of your cell array, there may exist an even faster solution: you can convert your cell data to a single matrix and use vector indexing to replace all NaN values in it at once, without the need of cellfun or loops:

a0 = [a{:}];       %// Concatenate matrices along the 2nd dimension
a0(isnan(a0)) = 0; %// Replace NaNs with zeroes

If you want to convert it back to a cell array, that's fine:

[M, N] = size(a{1});
mat2cell(a0, M, N * ones(size(a)))

PS
Work with a 3-D matrix instead of a cell array, if possible. Vectorized operations are usually much faster in MATLAB.
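A minimal sketch of the 3-D layout, assuming all matrices have the same size (the toy data are made up for illustration):

```matlab
% Toy data (hypothetical): three 2-by-2 matrices of equal size
a = {[1 NaN; 3 4], [NaN 2; NaN 5], [6 7; 8 NaN]};

A = cat(3, a{:});     % 2-by-2-by-3 array, one page per original matrix
A(isnan(A)) = 0;      % single vectorized replacement over all pages

page2 = A(:, :, 2);   % the second matrix, with NaNs replaced by zeros
```

If the matrices differ in size, they cannot be stacked this way and the cell array is the more natural container.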
