Randomly select samples from a cell array in MATLAB

I have a cell array in MATLAB as follows; the first column is a list of user IDs:

A = { 'U2', 'T13', 'A52';  
      'U2', 'T15', 'A52';  
      'U2', 'T18', 'A52';  
      'U2', 'T17', 'A995'; 
      'U4', 'T18', 'A53';  
      'U4', 'T13', 'A64';  
      'U4', 'T18', 'A64';
      ....
     }

I also have a cell array B that contains the unique user IDs, as follows:

B = {'U2', 'U4'}

My goal is to randomly select two samples for each user. Assume each user has at least two samples in A.

One example is C, as follows:

C = { 'U2', 'T13', 'A52';  
      'U2', 'T18', 'A52';   
      'U4', 'T13', 'A64';  
      'U4', 'T18', 'A64';
        ...
     }

How can I generate those samples?

A = { 'U2', 'T13', 'A52';  
      'U2', 'T15', 'A52';  
      'U2', 'T18', 'A52';  
      'U2', 'T17', 'A995'; 
      'U4', 'T18', 'A53';  
      'U4', 'T13', 'A64';  
      'U4', 'T18', 'A64'
     };
B = {'U2', 'U4'};


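% Map each row of A to its user: every row of userRep is [index in B, row in A].
userRep = [];
 for i = 1:size(A,1)
     for j = 1:size(B,2)
        % strcmp compares whole strings; == errors when the IDs differ in length.
        if strcmp(A{i,1}, B{j})
            userRep(end+1,:) = [j,i];
        end
     end
 end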


 numberOfSamp = 2;
 samples = {};
 for i = 1:size(B,2)
     % Rows of userRep that belong to the i-th user.
     cellPos = userRep(userRep(:,1) == i,:);
     % randperm draws without replacement, so no row is picked twice
     % (randi could return the same row more than once).
     k = min(numberOfSamp,size(cellPos,1));
     cellPos = cellPos(randperm(size(cellPos,1),k),:);
     for j = 1:size(cellPos,1)
        % Copy the whole selected row of A into the result.
        samples(end+1,:) = A(cellPos(j,2),:);
     end
end

samples

The following code should produce what you are looking for:

A = {
  'U2', 'T13', 'A52';  
  'U2', 'T15', 'A52';  
  'U2', 'T18', 'A52';  
  'U2', 'T17', 'A995'; 
  'U4', 'T18', 'A53';  
  'U4', 'T13', 'A64';  
  'U4', 'T18', 'A64';
  'U7', 'T14', 'A44';  
  'U7', 'T14', 'A27';  
  'U7', 'T18', 'A27';  
  'U7', 'T13', 'A341';  
  'U7', 'T11', 'A111';
  'U8', 'T17', 'A39';  
  'U8', 'T15', 'A58'
};

% Find the unique user identifiers...
B = unique(A(:,1));
B_len = numel(B);

% Preallocate a cell array to store the results...
R = cell(B_len*2,size(A,2));
R_off = 1;

% Iterate over the unique user identifiers...
for i = 1:B_len

    % Pick all the entries of A belonging to the current user identifier...
    D = A(ismember(A(:,1),B(i)),:);

    % Pick two random non-repeating entries and add them to the results...
    idx = datasample(1:size(D,1),2,'Replace',false);
    R([R_off (R_off+1)],:) = D(idx,:); 

    % Properly increase the offset to the results array...
    R_off = R_off + 2;

end

Here is one of the possible outcomes for the code snippet above:

>> disp(R)

    'U2'    'T13'    'A52' 
    'U2'    'T18'    'A52' 
    'U4'    'T13'    'A64' 
    'U4'    'T18'    'A64' 
    'U7'    'T14'    'A44' 
    'U7'    'T13'    'A341'
    'U8'    'T17'    'A39' 
    'U8'    'T15'    'A58' 
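
As a quick sanity check (not part of the original answer), the number of rows per user in a result like the one above can be counted with `unique` and `accumarray`. The sketch below re-enters the sample outcome by hand; the variable names `ids`, `g`, and `counts` are illustrative only.

```matlab
% Sample outcome from above, re-entered by hand for the check.
R = { 'U2', 'T13', 'A52';
      'U2', 'T18', 'A52';
      'U4', 'T13', 'A64';
      'U4', 'T18', 'A64';
      'U7', 'T14', 'A44';
      'U7', 'T13', 'A341';
      'U8', 'T17', 'A39';
      'U8', 'T15', 'A58' };

% ids holds the sorted unique user IDs; g maps each row of R to its ID.
[ids, ~, g] = unique(R(:,1));

% Count how many rows each user contributed.
counts = accumarray(g, 1);
```

For this outcome every entry of `counts` is 2, that is, two rows per user.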

For more information about the functions I used (unique, ismember, and datasample), refer to the official MATLAB documentation.

Let the input variables be defined as

A = { 'U2', 'T13', 'A52';  
      'U2', 'T15', 'A52';  
      'U2', 'T18', 'A52';  
      'U2', 'T17', 'A995'; 
      'U4', 'T18', 'A53';  
      'U4', 'T13', 'A64';  
      'U4', 'T18', 'A64';
     };                     % data
B = {'U2', 'U4'};           % unique identifiers
n = 2;                      % number of results per group

You can achieve what you want as follows:

  1. Create a grouping variable, so that each ID corresponds to an integer;
  2. Pick n random values from the set of row indices corresponding to each group;
  3. Use the set of all such indices to index into A.

Code:

[~, m] = ismember(A(:,1), B);                                    % step 1
s = accumarray(m, (1:size(A,1)).', [], @(x){randsample(x, n)});  % step 2
C = A(vertcat(s{:}),:);                                          % step 3
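
Note that `randsample` belongs to the Statistics and Machine Learning Toolbox. Where that toolbox is not available, the same three steps can be sketched with base MATLAB only, using `randperm` to draw indices without replacement. This is a minimal sketch, not the answer's own method: the helper name `pickRows` is hypothetical, and the code assumes that every ID in `A` appears in `B` and that each group has at least `n` rows.

```matlab
A = { 'U2', 'T13', 'A52';
      'U2', 'T15', 'A52';
      'U2', 'T18', 'A52';
      'U2', 'T17', 'A995';
      'U4', 'T18', 'A53';
      'U4', 'T13', 'A64';
      'U4', 'T18', 'A64' };     % data
B = {'U2', 'U4'};               % unique identifiers
n = 2;                          % number of results per group

% Hypothetical helper: pick n distinct elements of the index vector x
% and return them as a column (assumes numel(x) >= n).
pickRows = @(x) reshape(x(randperm(numel(x), n)), [], 1);

% Step 1: integer group number for every row of A.
[~, m] = ismember(A(:,1), B);

% Step 2: n randomly chosen row indices per group, one cell per group.
s = accumarray(m, (1:size(A,1)).', [], @(x){pickRows(x)});

% Step 3: stack the chosen indices and use them to index into A.
C = A(vertcat(s{:}), :);
```

Unlike `randsample(x, n)`, which treats a scalar `x` as the range `1:x`, the helper always samples from the elements of `x` itself, so it would also behave as expected with `n = 1` for a group that has a single row.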
