
Headers in dataset (Matlab)

I can't find any good documentation on dataset(), so I'm asking here. I'll keep the question short:

Can I set headers (column titles) in a dataset without entering any data yet? I guess not, so the second part of the question is:
Can I create a one-row dataset with named headers and empty data, and overwrite it later?

Here is what I tried, which did not work:

dmsdb = dataset({ 'John','Name'},{'Amsterdam','City'},{10,'number' });  
produces:  
    Name    City         number  
    John    Amsterdam    10 --> Headers are good!  

The problem is that when I add more data to the dataset, it expects all strings to be of the same length. So I use cellstr():

dmsdb(1,1:3) = dataset({ cellstr('John'),'Name'},{cellstr('Amsterdam'),'City'},{10,'number' });  
Produces:  
    Var1          Var2               Var3  
    'John'        'Amsterdam'        10  

Where did my headers go? How do I solve this issue, and what is causing this?

You can set up an empty dataset in either of two ways:

data = dataset({[], 'Name'}, {[], 'City'}, {[], 'number'});

or

data = dataset([], [], [], 'VarNames', {'Name', 'City', 'number'});

Both will give you:

>> data

data = 

[empty 0-by-3 dataset]

The display does not show them, but the column names are set, as you can check with:

>> get(data, 'VarNames')                                             

ans = 

    'Name'    'City'    'number'

Now we can add rows to the dataset:

>> data = [data; dataset({'John'}, {'Amsterdam'}, 10, 'VarNames', get(data, 'VarNames'))]

data = 

    Name          City               number
    'John'        'Amsterdam'        10    

You had the basic idea, but just needed to put your string data in cells. This replacement for your first line works:

>> dmsdb = dataset({ {'John'},'Name'},{{'Amsterdam'},'City'},{10,'number' }); 

dmsdb = 

    Name          City               number
    'John'        'Amsterdam'        10    
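For context on why the cells matter: a char array is a matrix, so stacking 'John' on top of 'Amsterdam' fails unless both are padded to the same width, whereas a cell array of strings stores each string separately. A minimal sketch in plain MATLAB, with no dataset involved (the variable name names is only for illustration):

```matlab
% Char arrays are matrices, so every row must have the same number of columns.
try
    names = ['John'; 'Amsterdam'];   % 4 vs. 9 characters: concatenation error
catch err
    disp(err.message)                % exact wording depends on the MATLAB version
end

% A cell array of strings has no such restriction.
names = {'John'; 'Amsterdam'};
disp(iscellstr(names))
```

The same reasoning presumably applies to a dataset column: a char column forces equal-length strings in every row, while a cell column does not.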

The built-in help for dataset() is actually really good at laying out the details of these and other ways of constructing datasets. Also check out the online documentation with examples at:

http://www.mathworks.com/help/toolbox/stats/dataset.html

One of the Mathworks blogs has a nice post too:

http://blogs.mathworks.com/loren/2009/05/20/from-struct-to-dataset/

Good luck!

Here is an example:

%# create dataset with no rows
ds = dataset(cell(0,1),cell(0,1),zeros(0,1));
ds.Properties.VarNames = {'Name', 'City', 'number'};

%# adding one row at a time
for i=1:3
    row = {{'John'}, {'Amsterdam'}, 10};  %# construct new row each iteration
    ds(i,:) = dataset(row{:});
end

%# adding a batch of rows all at once
rows = {{'Bob';'Alice'}, {'Paris';'Boston'}, [20;30]};
ds(4:5,:) = dataset(rows{:});

The dataset at the end looks like:

>> ds
ds = 
    Name           City               number
    'John'         'Amsterdam'        10    
    'John'         'Amsterdam'        10    
    'John'         'Amsterdam'        10    
    'Bob'          'Paris'            20    
    'Alice'        'Boston'           30    

Note: if you want to use concatenation instead of indexing, you have to specify the variable names:

vars = {'Name', 'City', 'number'};
ds = [ds ; dataset(rows{:}, 'VarNames',vars)]
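If you append rows in several places, the concatenation pattern can be wrapped once so the variable names are always taken from the existing dataset. This is only a sketch: appendRows is a hypothetical helper name, not a toolbox function, and it assumes the Statistics Toolbox dataset class and columns passed in the same order as the existing variables.

```matlab
% Start from an empty dataset that already has its column names.
ds = dataset(cell(0,1), cell(0,1), zeros(0,1));
ds.Properties.VarNames = {'Name', 'City', 'number'};

% Hypothetical helper: build a dataset from the given columns, reusing the
% variable names already stored in d, and append it below d.
appendRows = @(d, varargin) [d; dataset(varargin{:}, 'VarNames', get(d, 'VarNames'))];

ds = appendRows(ds, {'John'}, {'Amsterdam'}, 10);                    % one row
ds = appendRows(ds, {'Bob';'Alice'}, {'Paris';'Boston'}, [20;30]);   % two rows
```

Since the names come from get(d, 'VarNames'), the headers cannot drift to Var1, Var2, ... when rows are added this way.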

I agree, the help for dataset is hard to understand, mainly because there are so many ways to create a dataset and most methods involve a lot of cell arrays. Here are my two favorite ways to do it:

% 1) Create the 3 variables of interest, then make the dataset.  
% Make sure they are column vectors!
>> Name = {'John' 'Joe'}';  City = {'Amsterdam' 'NYC'}'; number = [10 1]';
>> dataset(Name, City, number)

ans = 

    Name          City               number
    'John'        'Amsterdam'        10    
    'Joe'         'NYC'               1    

% 2) More compact than doing 3 separate cell arrays
>> dataset({{'John' 'Amsterdam' 10} 'Name' 'City' 'number'})

ans = 

    Name          City               number  
    'John'        'Amsterdam'        [10]    
