
Any way to speed up MATLAB code that has to search through 20000+ rows for a string 16 times?

Basically, the way the code works is that the user types in a string (in my case a timestamp range like 7:29:29 AM - 2:33:33 PM), and the code reads in a data file from Excel that contains those strings along with all the data:

filename = get(handles.File,'String');
[Data,Text] = xlsread(filename,2);

IndexStart=strmatch(get(handles.StartTime,'String'),Text,'exact'); %start time
IndexEnd=strmatch(get(handles.EndTime,'String'),Text,'exact'); %end time

seconds = IndexEnd-IndexStart;
PlotData = Data(IndexStart:IndexEnd,:);

It then searches Text for the row number of each timestamp and copies the corresponding rows from Data for this time range so that I can plot it. The data is collected for 8+ hours at 1 sample/sec, so there are easily 30000 rows in the Excel file to search through. This large chunk of data is going to be plotted with labels on the plot for different events (assuming the user enters an event in every box, although I account for empty boxes with the if statement). The way I have this set up now is in a GUI where the user enters timestamp values as strings and the code searches for them:

if isempty(get(handles.Task16End,'String'))
    IndexTextTask16End = IndexStart;
else
    IndexTextTask16End = strmatch(get(handles.Task16End,'String'),Text,'exact'); %row location for timestamp
end
Task16Span = IndexTextTask16End-IndexTextTask15End; %timespan of this event
Task16LineLocation = Task15LineLocation + Task16Span/3600; %location for vertical line on graph

So I have up to 16 tasks that can be entered, which means the program has to search through every single available cell in a MATLAB cell array for these strings as it runs through the code. How can I do this more efficiently? Maybe set it to search until it finds a truly empty cell? That would at least limit my search to the given data instead of the entire possible array.

If you're reading an Excel spreadsheet for start/end times and doing all this other work in MATLAB, consider converting the times to their datenum representation after your xlsread. This way you can compare numbers, not strings, which is much faster. Given this, you could use logical indexing to build your desired data:

times   = datenum(Text); % assumed Text is just a cell array of times
t_start = datenum(get(handles.StartTime,'String'));
t_end   = datenum(get(handles.EndTime, 'String'));

plotData = Data(times >= t_start & times <= t_end, :); % logical row index keeps all columns; note the single & (element-wise), which is different from &&
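
To make the logical-indexing idea concrete, here is a minimal sketch on made-up data. The five-row Text and Data arrays are hypothetical stand-ins for the xlsread output, and the explicit 'HH:MM:SS PM' format string is an assumption about how the timestamps are written in the spreadsheet; adjust it if yours differ.

```matlab
% Hypothetical stand-in for the xlsread output: 5 samples at 1 sample/sec.
Text = {'7:29:29 AM'; '7:29:30 AM'; '7:29:31 AM'; '7:29:32 AM'; '7:29:33 AM'};
Data = [10 1; 20 2; 30 3; 40 4; 50 5];

% Assumed timestamp layout; passing the format explicitly avoids guessing.
fmt = 'HH:MM:SS PM';

% Convert every timestamp once, up front, instead of searching strings later.
times   = datenum(Text, fmt);
t_start = datenum('7:29:30 AM', fmt);   % would come from handles.StartTime
t_end   = datenum('7:29:32 AM', fmt);   % would come from handles.EndTime

% Logical mask of rows inside the requested range (inclusive at both ends).
inRange  = times >= t_start & times <= t_end;
plotData = Data(inRange, :);            % keep every column of the matching rows
```

Because the conversion happens once, each additional range lookup is only a numeric comparison over the times vector. This assumes Text and Data have one row per sample and line up row for row; if the sheet has header rows, they would need to be stripped from Text first.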

strcmp is often much faster than strmatch; I tried it out and it was much faster on my system (something like 1000x faster, which was far more of a difference than I expected). I don't know exactly why.

It returns slightly different information: strcmp returns a logical array with 1 wherever there's a match. To get the same output as with strmatch, just wrap it in a find:

IndexStart=find(strcmp(get(handles.StartTime,'String'),Text)); %start time
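
Since the question repeats the same lookup for up to 16 task boxes, one possible variation (not something either answer proposes) is to collect the task strings into a cell array and look them all up with a single ismember call. The sketch below uses made-up data; taskTimes is a hypothetical stand-in for the strings read from handles.Task1End through handles.Task16End, and it assumes Text is a single column of timestamps.

```matlab
% Hypothetical stand-ins: a timestamp column and three task boxes,
% the second of which the user left blank.
Text       = {'7:29:29 AM'; '7:29:30 AM'; '7:29:31 AM'; '7:29:32 AM'};
taskTimes  = {'7:29:30 AM'; ''; '7:29:32 AM'};
IndexStart = 1;

% One call finds the row of every task timestamp (0 where nothing matched).
[found, rows] = ismember(taskTimes, Text);

% Mirror the question's isempty branch: blank boxes fall back to IndexStart.
blank = cellfun(@isempty, taskTimes);
rows(blank) = IndexStart;

% Non-blank entries that were not found keep row 0, so they can be reported.
missing = ~found & ~blank;
```

With these inputs rows comes out as [2; 1; 4]. If Text is the full 2-D cell array from xlsread, passing only the timestamp column (for example Text(:,1), assuming the times are in the first column) keeps the search away from unrelated cells.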

