Comparing two arrays of strings

Question

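I have two lists of strings, each stored as a column in a table (PM25_spr{i}.MonitorID and O3_spr{i}.MonitorID). The lists have different lengths. I want to compare the first 11 characters of each entry and extract, for each list, the indices of the entries that match.

Example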

List 1:
    '01-003-0010-44201'
    '01-027-0001-44201'
    '01-051-0001-44201'
    '01-073-0023-44201'
    '01-073-1003-44201'
    '01-073-1005-44201'
    '01-073-1009-44201'
    '01-073-1010-44201'
    '01-073-2006-44201'
    '01-073-5002-44201'
    '01-073-5003-44201'
    '01-073-6002-44201'

List 2:
    '01-073-0023-88101'
    '01-073-2003-88101'
    '04-013-0019-88101'
    '04-013-9992-88101'
    '04-013-9997-88101'
    '05-119-0007-88101'
    '05-119-1008-88101'
    '06-019-0008-88101'
    '06-029-0014-88101'
    '06-037-0002-88101'
    '06-037-1103-88101'
    '06-037-4002-88101'
    '06-059-0001-88101'
    '06-065-8001-88101'
    '06-067-0010-88101'
    '06-073-0003-88101'
    '06-073-1002-88101'
    '06-073-1007-88101'
    '08-001-0006-88101'
    '08-031-0002-88101'

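I tried intersect, but that is not the right way to do what I want. Since I only want to look at the first 11 characters, I am not sure how to use ismember.

I also tried strncmp, but it fails with the error: Inputs must be the same size or either one can be a scalar.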

chars2compare = length('18-097-0083'); 
strncmp(O3_spr{i}.MonitorID, PM25_spr{i}.MonitorID,chars2compare)

Answer 1

PM25_spr_MID = cell(length(years),1); % Preallocate cell array
for n = 1:length(PM25_spr{i}.MonitorID) 
    s = char(PM25_spr{i}.MonitorID(n)); % Convert string to char
    PM25_spr_MID{i}(n) = cellstr(s(1:11)); % Pull out 1-11 characters and convert to cell
end

O3_spr_MID = cell(length(years),1); % Preallocate cell array
for n = 1:length(O3_spr{i}.MonitorID)
    s = char(O3_spr{i}.MonitorID(n));
    O3_spr_MID{i}(n) = cellstr(s(1:11));
end

% Match on the clipped 11-character IDs; ia indexes the O3 table, ib the PM2.5 table
[C, ia, ib] = intersect(O3_spr_MID{i}, PM25_spr_MID{i});
PerCap_spr_O3{i} = O3_spr{i}(ia,:);
PerCap_spr_PM25{i} = PM25_spr{i}(ib,:);

Answer 2

Assuming list1 and list2 are the two input cell arrays, there are a few approaches you could use.

I. Operating on cell arrays

With intersect -

%// Clip off after first 11 characters in each cell of the input cell arrays
list1_f11 = arrayfun(@(n) list1{n}(1:11),1:numel(list1),'uni',0)
list2_f11 = arrayfun(@(n) list2{n}(1:11),1:numel(list2),'uni',0)

%// Use intersect to find common indices in the input cell arrays
[~,idx_list1,idx_list2] = intersect(list1_f11,list2_f11)

With ismember -

%// Clip off after first 11 characters in each cell of the input cell arrays
list1_f11 = arrayfun(@(n) list1{n}(1:11),1:numel(list1),'uni',0)
list2_f11 = arrayfun(@(n) list2{n}(1:11),1:numel(list2),'uni',0)

%// Use ismember to find common indices in the input cell arrays
[LocA,LocB] = ismember(list1_f11,list2_f11);
idx_list1 = find(LocA)
idx_list2 = LocB(LocA)
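
As a quick check of the ismember approach, here is a minimal sketch on a shortened copy of the lists from the question. The names key1, key2, tf and loc are only illustrative, and cellfun is used for the clipping instead of arrayfun, which should be equivalent here.

```matlab
% Shortened copies of the two lists from the question (illustrative subset)
list1 = {'01-003-0010-44201'; '01-027-0001-44201'; ...
         '01-051-0001-44201'; '01-073-0023-44201'};
list2 = {'01-073-0023-88101'; '01-073-2003-88101'; '04-013-0019-88101'};

% Keep only the first 11 characters of every entry
key1 = cellfun(@(s) s(1:11), list1, 'UniformOutput', false);
key2 = cellfun(@(s) s(1:11), list2, 'UniformOutput', false);

% For every key in list1, ismember reports whether and where it occurs in list2
[tf, loc] = ismember(key1, key2);
idx_list1 = find(tf);   % positions in list1 that have a match
idx_list2 = loc(tf);    % corresponding positions in list2

disp([idx_list1, idx_list2])   % 4 1
```

Only '01-073-0023' is common to both subsets, so the match is entry 4 of list1 and entry 1 of list2. One difference worth keeping in mind: if list1 contains repeated keys, ismember returns every matching position, whereas intersect returns one position per distinct key.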

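II. Operating on char arrays

We can use char directly on the input cell arrays to get 2D char arrays, as working with them could be faster than working with cells.

With intersect + 'rows' -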

%// Convert to char arrays
list1c = char(list1)
list2c = char(list2)

%// Clip char arrays after first 11 columns
list1c_f11 = list1c(:,1:11)
list2c_f11 = list2c(:,1:11)

%// Use intersect with 'rows' option
[~,idx_list1,idx_list2] = intersect(list1c_f11,list2c_f11,'rows')

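III. Operating on numeric arrays

We can go one step further and convert the char arrays to numeric arrays with a single column, as this could lead to faster solutions.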

%// Convert to char arrays
list1c = char(list1)
list2c = char(list2)

%// Clip char arrays after first 11 columns
list1c_f11 = list1c(:,1:11)
list2c_f11 = list2c(:,1:11)

%// Remove char columns of hyphens (3 and 7 for the given input)
list1c_f11(:,[3 7])=[];
list2c_f11(:,[3 7])=[];

%// Convert char arrays to numeric arrays
ncols = size(list1c_f11,2);
list1c_f11num = (list1c_f11 - '0')*(10.^(ncols-1:-1:0))'
list2c_f11num = (list2c_f11 - '0')*(10.^(ncols-1:-1:0))'
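
To see what the conversion above does, here is a small worked sketch for a single key. The variable names d, vals, weights and num are only illustrative.

```matlab
% One clipped key, as a 1-by-11 char row
key = '01-073-0023';

% Drop the hyphens at columns 3 and 7, leaving nine digit characters
d = key;
d(:, [3 7]) = [];               % '010730023'

% Subtracting '0' turns each digit character into its numeric value
vals = d - '0';                 % [0 1 0 7 3 0 0 2 3]

% Weight each digit by its power of ten and sum via a matrix product
ncols = size(d, 2);             % 9
weights = 10.^(ncols-1:-1:0);   % [1e8 1e7 ... 10 1]
num = vals * weights';          % 10730023
```

The leading zero disappears in the numeric value, but since every key has the same fixed width, distinct keys should still map to distinct numbers.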

From this point on, there are three more approaches you could use, listed next.

With ismember (would be memory efficient, but might not be fast across all data sizes) -

[LocA,LocB] = ismember(list1c_f11num,list2c_f11num);
idx_list1 = find(LocA)
idx_list2 = LocB(LocA)

With intersect (could be slow) -

[~,idx_list1,idx_list2] = intersect(list1c_f11num,list2c_f11num)

With bsxfun (would be memory inefficient, but could be fast for small to decent-sized inputs) -

[idx_list1,idx_list2] = find(bsxfun(@eq,list1c_f11num,list2c_f11num'))
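
A minimal sketch of how the bsxfun comparison works, using a few hypothetical numeric keys (a and b stand in for list1c_f11num and list2c_f11num; eqmat is an illustrative name):

```matlab
a = [10030010; 10270001; 10730023];   % hypothetical keys for list 1 (column)
b = [10730023; 10732003];             % hypothetical keys for list 2 (column)

% N1-by-N2 logical matrix: entry (r,c) is true when a(r) equals b(c)
eqmat = bsxfun(@eq, a, b');

% Row subscripts index list 1, column subscripts index list 2
[idx_list1, idx_list2] = find(eqmat);

disp([idx_list1, idx_list2])   % 3 1
```

The intermediate matrix has numel(a)*numel(b) entries, which is why this route costs more memory as the lists grow. On MATLAB R2016b and later, a == b' should give the same matrix through implicit expansion.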

Answer 1
1 accepted 2015-01-30 19:29:55

Answer 2
1 2015-01-30 19:39:50
