
How to import data from file into variable arrays? (Matlab)

I've used MATLAB to create .txt files that each have 3 tab-separated columns (string, float, float) and a varying number of rows.

I am trying to read each of those 3 columns of data into 3 different variables. Here is my code:

fileId = fopen('file.txt');

% Storing columns from txt file into appropriate compartment data arrays
compartment_name = textscan(fileId,'%s%*f%*f','Delimiter','\t'); % column of strings
compartment_length = textscan(fileId,'%*s%f%*f','Delimiter','\t'); % column of doubles
compartment_diameter = textscan(fileId,'%*s%*f%f','Delimiter','\t'); % column of doubles

fclose('file.txt');

I receive the correct data for compartment_name (1x1 cell containing 106x1 cells (each of which are a string)), however both compartment_length and compartment_diameter return an empty 1x1 cell that contains a 0x1 double.

Any thoughts?

Also - is there any easy way for me to convert the 1x1 cells into an array? ie for compartment_name, it would be an array of 1x106 strings ?

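As @jgrant noted in a comment, the problem is that the first textscan call reads all the way to the end of the file, so the second and third calls start at end-of-file and find nothing to read. If you want to re-read parts of your file, you have to reset the file position indicator to the beginning of the file first (for instance with frewind(fileId)).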
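A minimal sketch of that fix, keeping the three separate calls from the question. The sample file and its contents (soma, axon and the numbers) are made up for illustration; only the textscan format strings come from the question.

```matlab
% Write a small tab-separated sample file (hypothetical data, for illustration only)
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'soma\t10.5\t2.0\n');
fprintf(fid, 'axon\t100.0\t0.5\n');
fclose(fid);

fileId = fopen(fname, 'r');
names = textscan(fileId, '%s%*f%*f', 'Delimiter', '\t');     % reads up to end of file
frewind(fileId);                                             % move back to the start
lengths = textscan(fileId, '%*s%f%*f', 'Delimiter', '\t');
frewind(fileId);                                             % and once more
diameters = textscan(fileId, '%*s%*f%f', 'Delimiter', '\t');
fclose(fileId);   % fclose takes the file identifier, not the file name

delete(fname);    % clean up the sample file
```

This reads the file three times, so it mainly shows why the original code returned empty results rather than how the file is best read.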

I can't really see why you're trying to call textscan three times, though. The reason that the output of textscan is a cell is exactly so that you can make a single call to it and then separate the output columns:

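fileId = fopen('file.txt');
tmpcell = textscan(fileId,'%s%f%f','Delimiter','\t'); % read all three columns in one pass
fclose(fileId); % fclose takes the file identifier, not the file name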

compartment_name = tmpcell{1};
compartment_length = tmpcell{2};
compartment_diameter = tmpcell{3};

% or if you want to be fancy about it:
%[compartment_name, compartment_length, compartment_diameter] = tmpcell{:};

The reason I'm writing up this answer is your final note:

Also - is there any easy way for me to convert the 1x1 cells into an array? ie for compartment_name, it would be an array of 1x106 strings ?

This hints at a confusion regarding strings in MATLAB. In MATLAB, strings (character arrays, written in single quotes) are essentially arrays of integers. You can see this for yourself by performing any kind of arithmetic operation on a string:

>> tmpstring = 'asdf'

tmpstring =

asdf

>> tmpstring*1

ans =

    97   115   100   102

The numbers you see are the character codes (here, the ASCII values) of the characters in the string. This works the other way around too: you can build up a string by putting integers into an array. As a matter of fact, a string compares equal to the corresponding integer array:

>> isequal([97   115   100   102],'asdf')

ans =

     1

This also implies several limitations for strings in MATLAB. What concerns your question is that you can't simply create an array of such strings, because the natural syntax already means string concatenation: if both string1 and string2 are merely integer arrays, then [string1, string2] is the concatenation of the two strings. (The separate string class introduced in R2016b does allow true string arrays, but everything below concerns character arrays.)

You can then think of stacking strings vertically, using [string1; string2]. Now, this works exactly as much as it would work for two integer arrays: you can only do this if the strings have equal length (by length I now mean size(string1,2)). So in the general case, you can only store strings together in an inhomogeneous container, i.e. cells in MATLAB. Once you have cells, your elements can have any type and shape, so you can easily shove in strings of arbitrary length together, stacked vertically or horizontally, however you like them.
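The three cases can be tried out directly. This is only a sketch; the variable names s1, s2, joined, stackOk and names are chosen here for illustration.

```matlab
s1 = 'asf';     % 1x3 char
s2 = 'asdg';    % 1x4 char

joined = [s1, s2];         % horizontal concatenation gives one 1x7 char, 'asfasdg'

try
    stacked = [s1; s2];    % vertical concatenation needs rows of equal length
    stackOk = true;
catch
    stackOk = false;       % this branch runs: 3 and 4 columns do not match
end

names = {s1; s2};          % a 2x1 cell holds strings of any length
```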

So consider textscan. Suppose you had to implement this function, which returns data read from a file. The data can be either numeric or strings. What do you do? Exactly what textscan is doing: return numeric columns as arrays (since each row has a single scalar value), and return strings as cells (since each row contains a string, i.e. a vector in itself!). You could stack the strings vertically, but this only works if each row in the given column contains the same number of characters, which should obviously be neither assumed nor prescribed. You could still pad the strings to the longest element and return a stacked character array, but this would introduce unnecessary overhead in most practical applications. (Side note: textscan returns a cell row vector as its output, with each cell element containing the full data in the given column. For numeric columns this "full data" is a numeric column vector, and for string columns it is a cell column vector.)

So it's reasonable that textscan returns its string columns as cells. You yourself can still stack your strings into a 2D character array if you wish to, but in most cases this is not really practical. It really depends on your application.

A minimal example: consider that tmp.inp contains

asf 3 4
asdg 2 3
asd 1 4

Now

>> fid=fopen('tmp.inp','r'); outcell=textscan(fid,'%s%f%f'), fclose(fid);

outcell = 

    {3x1 cell}    [3x1 double]    [3x1 double]

This demonstrates the fact that the output, outcell, is a cell row vector, each element corresponding to a column read in from the file. The square brackets around columns 2 and 3 indicate that those cell elements (namely outcell{2} and outcell{3}, not to be confused with outcell(2) and outcell(3)) are numeric arrays. The first element, however, is a cell column vector:

>> outcell{1}

ans = 

    'asf'
    'asdg'
    'asd'

The fact that the output is printed with quotation marks on each line indicates that these are separate strings contained in a cell, but you can also tell this from

>> whos ans
  Name      Size            Bytes  Class    Attributes

  ans       3x1               356  cell               

Now, as I said, you can decide to stack your strings on top of each other; you only need to call char() on your cell:

>> char(outcell{1})

ans =

asf 
asdg
asd 

>> whos ans
  Name      Size            Bytes  Class    Attributes

  ans       3x4                24  char               

Note the lack of quotation marks in the automatic output, and the class/size of the output itself. The 3x4 size was made possible by padding all the rows to the length of the longest string, i.e. 4. Consequently the first and third rows of the output end with a space (this is what we mean by the strings getting padded).

If you don't perform this padding, you can simply reference your strings read in as the cell elements they are:

>> outcell{1}{3}

ans =

asd

Or, by storing the variable as you wanted to originally:

>> compartment_name=outcell{1}

compartment_name = 

    'asf'
    'asdg'
    'asd'

>> compartment_name{3}

ans =

asd
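If you specifically want the 1xN layout mentioned in the question, transposing the cell is enough. A short sketch, with the cell contents typed in by hand to match the example above and the names names_row, third and name_lengths made up here:

```matlab
compartment_name = {'asf'; 'asdg'; 'asd'};    % same shape as outcell{1}: 3x1 cell

names_row = compartment_name.';               % 1x3 cell row vector
third = names_row{3};                         % curly braces give the char array 'asd'
name_lengths = cellfun(@numel, names_row);    % number of characters in each string
```

The strings themselves are untouched by the transpose, so no padding is involved.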
