
Load CSV with text qualifier into MATLAB/Octave

Data

Assume the following data format (with a header line in the first row, 500+ rows):

1, "<LastName> ,<Title>. <FirstName>", <Gender>, 99.9

My Code

I've tried this (IGNORE: see edit below):

[flag, name, gender, age] = textread('file.csv', '%d %q %s %f', 'headerlines', 1);

The Error

...and get the following error message:

error: textread: A(I): index out of bounds; value 1 out of bound 0
error: called from: 
error:   C:\Program Files\Octave\Octave3.6.2_gcc4.6.2\share\octave\3.6.2\m\io\textread.m at line 75, column 3

Questions:

  • Is my format string incorrect given the text qualifier (and the comma embedded in the "name" string)?
  • Am I even using the correct method of loading a CSV into MATLAB/Octave?

EDIT

I forgot the delimiter (with it, the error message reports the failure on a different line, in strread.m):

[flag, name, gender, age] = textread('file.csv', '%d %q %s %f', 'headerlines', 1, 'delimiter', ',');

I went with the code below. However, it splits the text-qualified string for the name field into two separate fields, so any text-qualified field that contains the field delimiter creates an extra output column. (I'm still interested to know why the %q format didn't work for this field. Whitespace, perhaps?)

% Begin CSV Import ============================================================================

    % strrep is used to strip the text qualifier out of each row. This is wrapped around the
    % call to textread, which brings the comma delimited data in row-by-row, and skips the 1st row,
    % which holds column field names.
    tic;
    data = strrep(
                    textread(
                                'file.csv'          % File name within current working directory
                                ,'%s'               % Each row is a single string
                                ,'delimiter', '\n'  % Each new row is delimited by the newline character
                                ,'headerlines', 1   % Skip importing the first n rows
                            )
                    ,'"'
                    ,''
                );

    for i = 1:length(data)
        delimpos = strfind(data{i}, ',');   % positions of every comma in the row (findstr is deprecated)

        start = 1;
        for j = 1:length(delimpos) + 1,

            if j < length(delimpos) + 1,
                csvfile{i,j} = data{i}(start:delimpos(j) - 1);
                start = delimpos(j) + 1;
            else
                csvfile{i,j} = data{i}(start:end);
            end

        end
    end

    % Return summary information to user
    printf('\nCSV load completed in -> %f seconds\nm rows returned = %d\nn columns = %d\n', toc, size(csvfile, 1), size(csvfile, 2));
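
The extra column comes from treating every comma as a delimiter, including the one inside the qualified name field. A minimal sketch of one possible workaround, assuming qualifiers are always balanced double quotes: count the quotes seen so far and ignore any comma that falls after an odd number of them. The sample row and the variable names (`row`, `in_quotes`, `fields`) are made up for illustration and are not taken from the real file.

```octave
% Hypothetical sample row in the format described above (not real data).
row = '1, "Smith ,Dr. Jane", F, 99.9';

% A character is inside a qualified field when an odd number of
% double quotes has been seen up to and including that position.
in_quotes = mod(cumsum(double(row == '"')), 2) == 1;

% Keep only the commas that sit outside a qualified field.
delimpos = find(row == ',' & ~in_quotes);

% Cut the row at those positions, trim whitespace, drop the qualifier.
starts = [1, delimpos + 1];
stops  = [delimpos - 1, length(row)];
fields = cell(1, numel(starts));
for k = 1:numel(starts)
    field = strtrim(row(starts(k):stops(k)));
    fields{k} = strrep(field, '"', '');
end
```

Under these assumptions the row yields four fields rather than five, with the name kept whole. The same mask could stand in for the plain comma search inside the loop above, provided the qualifiers are not stripped from `data` beforehand. Note that the final `strrep` would also remove any doubled quotes used as escapes inside a field.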
