Load CSV with text qualifier into MATLAB/Octave
Assume the following data format (with a header line in the first row, 500+ rows):
1, "<LastName> ,<Title>. <FirstName>", <Gender>, 99.9
I've tried this (IGNORE: see edit below):
[flag, name, gender, age] = textread('file.csv', '%d %q %s %f', 'headerlines', 1);
...and get the following error message:
error: textread: A(I): index out of bounds; value 1 out of bound 0
error: called from:
error: C:\Program Files\Octave\Octave3.6.2_gcc4.6.2\share\octave\3.6.2\m\io\textread.m at line 75, column 3
Edit: I forgot the delimiter. With it added, the error message reports the failure on a different line, in strread.m:
[flag, name, gender, age] = textread('file.csv', '%d %q %s %f', 'headerlines', 1, 'delimiter', ',');
I went with the code below. However, it splits the text-qualified string for the name field into two separate fields, so any text-qualified field that contains the field delimiter will create an extra output column. (I'm still interested to know why the %q format didn't work for this field. Whitespace, perhaps?)
% Begin CSV Import ============================================================================
% strrep is used to strip the text qualifier out of each row. This is wrapped around the
% call to textread, which brings the comma delimited data in row-by-row, and skips the 1st row,
% which holds column field names.
tic;
data = strrep(
    textread(
        'file.csv'            % File name within current working directory
        ,'%s'                 % Each row is a single string
        ,'delimiter', '\n'    % Each new row is delimited by the newline character
        ,'headerlines', 1     % Skip importing the first n rows
    )
    ,'"'
    ,''
);
for i = 1:length(data)
    delimpos = findstr(data{i}, ",");
    start = 1;
    for j = 1:length(delimpos) + 1,
        if j < length(delimpos) + 1,
            csvfile{i,j} = data{i}(start:delimpos(j) - 1);
            start = delimpos(j) + 1;
        else
            csvfile{i,j} = data{i}(start:end);
        end
    end
end
% Return summary information to user
printf('\nCSV load completed in -> %f seconds\nm rows returned = %d\nn columns = %d\n', toc, size(csvfile)(1), size(csvfile)(2));
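For reference, the extra column comes from splitting on every comma after the qualifiers have already been stripped. One possible way around it is to track whether the scan is currently inside a pair of double quotes and only split on commas outside them. The sketch below is just that, a sketch: the function name split_quoted_line and the sample row are made up for illustration and are not part of the code above, and doubled quotes ("") inside a field are simply dropped rather than unescaped.

```octave
% Hypothetical helper (not from the original post): split one CSV line on
% commas, ignoring any comma that falls inside a double-quoted field.
function fields = split_quoted_line(txt)
  in_quotes = false;   % true while the scan is inside "..."
  fields = {};
  start = 1;           % index of the first character of the current field
  for k = 1:length(txt)
    if txt(k) == '"'
      in_quotes = ~in_quotes;            % every qualifier toggles the state
    elseif txt(k) == ',' && ~in_quotes
      fields{end+1} = txt(start:k-1);    % a comma outside quotes ends a field
      start = k + 1;
    end
  end
  fields{end+1} = txt(start:end);        % last field has no trailing comma
  % Trim surrounding whitespace, then remove the qualifiers themselves
  fields = strrep(strtrim(fields), '"', '');
end
```

Called on a made-up row in the format described at the top, e.g. split_quoted_line('1, "Smith ,Mr. John", M, 99.9'), this should give four cells, with the name kept together in the second one. In the loop above it could stand in for the findstr-based split, provided the outer strrep that strips the quotes up front is removed so the qualifiers are still present when each row is scanned.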