
How can I make my index function faster?

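So, given a struct array Index with the fields Word, Documents, and Locations, the function takes a cell array of char arrays and indexes each word into Index, also recording the DocNums of the documents in which the word appears.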

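function Index = InsertDoc(Index, newDoc, DocNum)
    % Index  : struct array with fields Word, Documents, Locations
    % newDoc : cell array of char arrays (the words of one document)
    % DocNum : number of the document being inserted
    for i = 1:numel(newDoc)
        % Case-insensitive check: is this word already in the index?
        found = any(strcmpi(newDoc(i), [Index.Word]));
        if found
            % Position of the first matching entry
            curr = find(strcmpi(newDoc(i), [Index.Word]), 1);
            Index(curr).Documents{1} = unique([Index(curr).Documents{1}, DocNum]);
            if numel(Index(curr).Documents{1}) ~= numel(Index(curr).Locations)
                % First occurrence in this document: start a new location list
                Index(curr).Locations{end+1} = i;
            else
                % Repeated occurrence: extend this document's location list
                Index(curr).Locations{end} = [Index(curr).Locations{end}, i];
            end
        else
            % New word: append a new entry to the index
            curr = numel(Index) + 1;
            Index(curr).Word = newDoc(i);
            Index(curr).Documents = {DocNum};
            Index(curr).Locations = {i};
        end
    end
end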

For example:

Doc1 = {'Matlab', 'is', 'awesome'};
Doc2 = {'Programming', 'is', 'very', 'very', 'fun'};
Doc3 = {'I', 'love', 'Matlab', 'very', 'much'};

someIndex = InitializeIndex;
% InitializeIndex just creates a struct array with the given fields and empty cell arrays
someIndex = InsertDoc(someIndex, Doc1, 1);
someIndex = InsertDoc(someIndex, Doc2, 2);
someIndex = InsertDoc(someIndex, Doc3, 3);
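
InitializeIndex itself is not shown in the question. A minimal sketch of what it might look like, assuming it does exactly what the comment above describes (the body is a guess, not the asker's actual code):

```matlab
function Index = InitializeIndex()
    % Assumed implementation (not shown in the question): passing empty cell
    % arrays to struct() yields a 0x0 struct array that already has the three
    % fields but no elements yet.
    Index = struct('Word', {}, 'Documents', {}, 'Locations', {});
end
```

With an empty struct array as the starting point, numel(Index) is 0, so the first new word lands in Index(1).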

The result of the three InsertDoc calls would be:

someIndex(1)

Word: 'Matlab'
Documents: [1 3]
Locations: {[1] [3]}

someIndex(2)

Word: 'is'
Documents: [1 2]
Locations: {[2] [2]}

someIndex(5)

Word: 'very'
Documents: [2 3]
Locations: {[3 4] [4]}

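I need to be able to run this with a struct array of 20000 elements containing multiple words, and right now the indexing takes a very long time to finish. How can I improve this algorithm?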

Try preallocating memory for Index before the loop starts.
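
The suggestion could be sketched roughly as follows. InsertDocPrealloc is a hypothetical name, and the sketch assumes the same data layout as the question's InsertDoc. It reserves enough slots for the worst case (every word in newDoc is new) once per call, fills them, and trims the unused slots at the end. It also searches the index only once per word. How much this helps in practice has not been measured here; the linear strcmpi scan over all indexed words is still there.

```matlab
function Index = InsertDocPrealloc(Index, newDoc, DocNum)
    % InsertDocPrealloc is a hypothetical name. The data layout (Word as a
    % 1x1 cell, Documents as {vector}, Locations as a cell of vectors)
    % follows the question's InsertDoc.
    used = numel(Index);                  % entries that already hold a word
    if ~isempty(newDoc)
        % Worst case: every word of newDoc is new, so reserve that many slots
        % with a single resize instead of growing Index one word at a time.
        Index(used + numel(newDoc)).Word = [];
    end
    for i = 1:numel(newDoc)
        % Search only the filled entries, and only once per word.
        curr = find(strcmpi(newDoc{i}, [Index(1:used).Word]), 1);
        if isempty(curr)
            % New word: fill the next reserved slot.
            used = used + 1;
            Index(used).Word = newDoc(i);
            Index(used).Documents = {DocNum};
            Index(used).Locations = {i};
        else
            % Known word: same bookkeeping as in the question's InsertDoc.
            Index(curr).Documents{1} = unique([Index(curr).Documents{1}, DocNum]);
            if numel(Index(curr).Documents{1}) ~= numel(Index(curr).Locations)
                Index(curr).Locations{end+1} = i;
            else
                Index(curr).Locations{end} = [Index(curr).Locations{end}, i];
            end
        end
    end
    Index = Index(1:used);                % drop the unused reserved slots
end
```

Saved as InsertDocPrealloc.m, it is called the same way as InsertDoc, for example someIndex = InsertDocPrealloc(someIndex, Doc1, 1).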

