How can I use `textscan` to convert a string into a table?

Question

I'm using MATLAB to read COVID-19 data provided by Johns Hopkins as a .csv file with `urlread`, but I'm not sure how to use `textscan` in the next step to convert the returned string into a table. The first two columns of the .csv file are strings specifying the region, followed by a large number of columns containing the registered number of infections by date.

Currently, I just save the string returned by `urlread` locally and then open that file with `importdata`, but surely there is a more elegant solution.

Answer 1

You have mixed up two things: either you read from the downloaded csv file using `textscan` (together with `fopen` and `fclose`, of course), or you use `urlread` (or rather `webread`, since MATLAB recommends not using `urlread` anymore). I go with the latter, since I have never done this myself ^^

So, first we read in the data and split it into rows:

url = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Confirmed.csv";
% read raw data as single character array
web = webread(url);
% split the array into a cell array representing each row of the table
row = strsplit(web,'\n');
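
As a side note on the splitting step, here is a minimal sketch of how line endings can affect the row count. It assumes a made-up sample string (`txt`) instead of the downloaded data, and uses `splitlines` (available from R2016b) as an alternative to `strsplit`; it is not part of the original answer.

```matlab
% Made-up sample text with Windows line endings and a trailing newline
txt = sprintf('a,b\r\n1,2\r\n3,4\r\n');

% Splitting on '\n' alone leaves '\r' at the end of each row and
% produces an extra empty element after the last newline
naive = strsplit(txt, '\n');

% strtrim drops the trailing line break; splitlines treats '\r\n' as one break
rows = splitlines(strtrim(txt));

fprintf('strsplit: %d rows, splitlines: %d rows\n', numel(naive), numel(rows));
```

If the downloaded file ends with a newline, the last element returned by `strsplit` is empty, which is worth keeping in mind when looping over the rows later.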

Then we allocate a table. Pre-allocation is good practice in MATLAB: arrays are stored in contiguous memory, so telling MATLAB beforehand how much space you need avoids repeated reallocation while the table is filled:

len = length(row);
% get the CSV-header as information about the number of columns
Head = strsplit(row{1},',');
% allocate the table: two string columns followed by numeric columns of NaN
T = [table(strings(len,1),strings(len,1),'VariableNames',Head(1:2)),...
    repmat(table(NaN(len,1)),1,length(Head)-2)];
% rename columns of table
T.Properties.VariableNames = Head;

Note that I used a little trick to allocate so many separate columns of `NaN`s: I repeat a single one-column table. However, concatenating this table with the table of strings is difficult, because both would contain default column names such as `Var1` and `Var2`. That is why I named the columns of the first table right away.

Now we can actually fill the table (which is a bit nasty thanks to whoever thought it was a good idea to write "Korea, South" into a comma-separated file):
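
An alternative way to pre-allocate, sketched here under the assumption that R2018a or later is available, is the `'Size'`/`'VariableTypes'` syntax of `table`. The header below is a shortened, made-up sample, and `matlab.lang.makeValidName` is used only so that the names are valid in releases that do not accept arbitrary variable names.

```matlab
% Shortened sample header (assumed for illustration)
header = {'Province/State','Country/Region','Lat','Long','1/22/20'};
nRows  = 3;   % hypothetical number of data rows

% Two string columns followed by double columns
varTypes = [{'string','string'}, repmat({'double'}, 1, numel(header)-2)];

% Turn the header entries into valid MATLAB identifiers
varNames = matlab.lang.makeValidName(header);

% Pre-allocate the whole table in one call
T = table('Size', [nRows, numel(header)], ...
          'VariableTypes', varTypes, ...
          'VariableNames', varNames);
disp(size(T));
```

With this syntax the numeric columns are filled with zeros rather than `NaN`, so missing values are not marked automatically; whether that matters depends on how the table is filled afterwards.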

for i = 2:len
    % skip empty rows, e.g. the one produced by a trailing newline
    if isempty(strtrim(row{i})), continue; end
    % split this row into columns
    col = strsplit(row{i},',');
    % quick conversion: non-numeric entries become NaN
    num = str2double(col);

    % keep strings where the result is NaN
    lg = isnan(num);
    str = string(col(lg));
    T{i,1} = str(1);
    % nasty workaround necessary due to "Korea, South": re-join the split pieces
    T{i,2} = strjoin(str(2:end),',');
    T{i,3:end} = num(~lg);
end
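
To see why the workaround is needed, the loop body can be traced on a single row. This is a small sketch with a made-up row (the numbers are invented); it is a variant that re-joins the pieces with a comma and strips the quotes, which goes slightly beyond the loop above.

```matlab
% One made-up row: empty province, quoted country that contains a comma
line = ',"Korea, South",36,128,1,1';

col = strsplit(line, ',');   % the quoted name is split into two pieces
num = str2double(col);       % text pieces (and the empty field) become NaN
lg  = isnan(num);            % mask of the non-numeric entries

str      = string(col(lg));            % first entry is the (empty) province
province = str(1);
country  = strjoin(str(2:end), ',');   % glue the pieces back together
country  = erase(country, '"');        % drop the surrounding quotes

values = num(~lg);                     % the remaining numeric columns
disp(country);
```

Note that this only works when the comma appears in the second column; a quoted comma in the first column would end up being assigned to the wrong field.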

This should also work for the days to come, since the number of columns is taken from the header. Let me know what you are actually going to do with the data.
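
For completeness, since the question asked about `textscan`: it also accepts text directly instead of a file identifier. The following is a rough sketch, not the approach taken above. It assumes a small inline sample (`csv`, with invented numbers) in place of the text returned by `webread`, and relies on the `%q` conversion to keep quoted fields such as "Korea, South" in one piece.

```matlab
% Small inline sample standing in for the downloaded text (made-up numbers)
csv = sprintf('%s\n', ...
    'Province/State,Country/Region,Lat,Long,1/22/20,1/23/20', ...
    ',Thailand,15,101,2,3', ...
    ',"Korea, South",36,128,1,1', ...
    'Hubei,China,30.9,112.2,444,444');

% Separate the header line from the data lines
lines  = strsplit(strtrim(csv), newline);
header = strsplit(lines{1}, ',');
body   = strjoin(lines(2:end), newline);

% Two (possibly quoted) text columns, then one numeric column per remaining header entry
fmt = ['%q %q', repmat(' %f', 1, numel(header)-2)];
C   = textscan(body, fmt, 'Delimiter', ',');

% Assemble the table; header entries are converted to valid variable names
names = matlab.lang.makeValidName(header);
T = table(string(C{1}), string(C{2}), C{3:end}, 'VariableNames', names);
disp(T);
```

Because the format string is built from the header, this sketch should adapt to additional date columns in the same way as the loop-based version.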

Answer 1 was accepted on 2020-03-19 17:53:24.
