Matlab, read multiple 2d arrays from a csv file

I have a csv file which contains 2d arrays of 4 columns but a varying number of rows. E.g.:

2, 354, 23, 101
3, 1023, 43, 454
1, 5463, 45, 7657

4, 543, 543, 654
3, 56, 7654, 344

...

I need to be able to import the data so that I can run operations on each block of data; however, csvread, dlmread and textscan all ignore the blank lines.

I can't seem to find a solution anywhere. How can this be done?

PS:

It may be worth noting that files in the format above are actually the concatenation of many files, each containing only one block of data (I don't want to have to read from thousands of files every time). The blank line between blocks can therefore be changed to any other delimiter / marker. The concatenation is just done with a python script.
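For anyone wanting to reproduce the concatenation step, here is a minimal sketch of what such a script could look like. It is not the original script: the function name concatenate_blocks, the default marker line (one nan per column) and the choice to write the marker after every block, including the last, are all assumptions made for illustration.

```python
from pathlib import Path

# Assumed marker: one nan per column, so a numeric reader sees a full row of NaNs.
SEPARATOR = "nan,nan,nan,nan"


def concatenate_blocks(input_paths, output_path, separator=SEPARATOR):
    """Copy each single-block file into output_path, closing every block with `separator`."""
    with open(output_path, "w", newline="\n") as out:
        for path in input_paths:
            text = Path(path).read_text()
            # Drop blank lines so a stray empty line can never split a block.
            rows = [line.strip() for line in text.splitlines() if line.strip()]
            if not rows:
                continue  # skip empty files rather than emit an empty block
            out.write("\n".join(rows) + "\n")
            out.write(separator + "\n")
```

It could be called as concatenate_blocks(sorted(Path("blocks").glob("*.csv")), "data.csv"), where the blocks directory is hypothetical. Passing separator="" would give back the blank-line format shown above. Writing the marker after every block means each marker line closes exactly one block, which keeps the bookkeeping on the reading side simple.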

EDIT: My Solution - based upon / inspired by petrichor's answer below

I replaced the csvread with textscan, which is faster. Then I realised that if I replaced the blank lines with lines of nan instead (modifying my python script), I could remove the need for a second textscan, which was the slow point. My code is:

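filename = 'data.csv';

%# Read all rows, including the nan separator rows, into one matrix
fid = fopen(filename);
allData = cell2mat(textscan(fid,'%f %f %f %f','delimiter',','));
fclose(fid);

%# Row indices of the nan separator lines (one is expected after each block)
nanLines = find(isnan(allData(:,1)))';

%# First and last row of each block, counted as if the nan rows were removed
iEnd = (nanLines - (1:length(nanLines)));
iStart = [1 (nanLines(1:end-1) - (0:length(nanLines)-2))];
nRows = iEnd - iStart + 1;

%# Remove the separator rows
allData(nanLines,:)=[];

%# Put the data into cells
data = mat2cell(allData, nRows);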

This runs in 0.28 s on a file of roughly 103000 lines. I've accepted petrichor's solution, as it best solves my initial problem.

filename = 'data.txt';

%# Read all the data
allData = csvread(filename);

%# Compute the empty line indices
fid = fopen(filename);
lines = textscan(fid, '%s', 'Delimiter', '\n');
fclose(fid);
blankLines = find(cellfun('isempty', lines{1}))';

%# Find the indices to separate data into cells from the whole matrix
iEnd = [blankLines - (1:length(blankLines)) size(allData,1)];
iStart = [1 (blankLines - (0:length(blankLines)-1))];
nRows = iEnd - iStart + 1;

%# Put the data into cells
data = mat2cell(allData, nRows)

That gives the following for your data:

data = 

    [3x4 double]
    [2x4 double]
