Read csv data directly from url in Matlab

I am having some problems reading data from a database in Matlab. The url I use to download the data returns a semicolon-delimited text file, and I need Matlab to parse this data and arrange it accordingly, for example in a struct (since the columns have different classes). I have already used urlread and could download the data successfully. The only problem is that I get all the data as a single character string inside one cell, and I need it as a well-organized table.

Basically I would like to know if it is possible to load data from a url into Matlab the same way the read.csv function in R does, where you just put the url where the file name would go, define how the data is delimited and voilà, you get your data.frame with all your data organized as it should be.

I suppose there are ways to interpret the character string after using urlread and convert it somehow into an organized struct variable, but there has to be a way to read it directly from the url as R does.

Here is a piece of code that reads the csv data from the web (urlread), uses textscan to scan and format the data into cells (strings and scalars allowed), then converts the cells into a structure with cell2struct. The structure keeps the column types set by the textscan format.

Note that you have to define the textscan format and the cell2struct field names to suit your data.

% Download the csv text, parse it column by column, then name the columns.
block = urlread('http://hci.stanford.edu/jheer/workshop/data/florida2000/Florida2000.csv');
C = textscan(block,'%s%s%f%s%f','Delimiter',',','HeaderLines',1,'EndOfLine','\n');
S = cell2struct(C,{'county','technology','columns','category','ballots'},2)

Here are the Florida 2000 Presidential Election Results (.csv, 938 data points):

county,technology,columns,category,ballots
Alachua,Optical,1,under,217
Alachua,Optical,1,over,105
Alachua,Optical,1,Bush,34124
Alachua,Optical,1,Gore,47365
Alachua,Optical,1,Browne,658
Alachua,Optical,1,Nader,3226
Alachua,Optical,1,Harris,6
...

which will produce

S = 

    county: {938x1 cell}    %string
technology: {938x1 cell}    %string
   columns: [938x1 double]  %double
  category: {938x1 cell}    %string
   ballots: [938x1 double]  %double
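For the semicolon-delimited case from the question, the same approach should carry over by changing the 'Delimiter' option. The following is only a minimal sketch: the sample text, the column names and the format string are made up for illustration and stand in for whatever urlread returns for your url. It also assumes a release that has struct2table, for the final conversion to a table.

```matlab
% Hypothetical sample standing in for the text returned by urlread.
block = sprintf('name;value;unit\nalpha;1.5;m\nbeta;2.25;s');

% One conversion per column: %s for text, %f for numbers.
C = textscan(block, '%s%f%s', 'Delimiter', ';', 'HeaderLines', 1);

% Name each column; dimension 2 because C is a 1-by-3 cell row.
S = cell2struct(C, {'name', 'value', 'unit'}, 2);

% Optional: turn the scalar struct into a table, one row per record.
T = struct2table(S);
```

Here S.value is a numeric column and S.name is a cell array of strings, so each column keeps its own class.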

Edit

For double-quoted text, you can use %q instead of %s when calling textscan (see the FormatSpec options), like this:

C = textscan(fileID,'%q%f');
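As a small illustration of why %q helps, here is a sketch with made-up data in which the quoted field itself contains the delimiter. The sample text is hypothetical, and a character string is passed to textscan in place of fileID.

```matlab
% Hypothetical sample: the first field is double quoted and contains a comma.
block = sprintf('"Smith, John",42\n"Doe, Jane",37');

% %q reads a quoted field up to the closing quote and strips the quotes,
% so the comma inside the quotes is not treated as a delimiter.
C = textscan(block, '%q%f', 'Delimiter', ',');

names = C{1};   % cell array of strings
ages  = C{2};   % numeric column
```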

Look into a function called dlmread. You pass in your data, tell it what the delimiter is, and it should return what you need.

results = dlmread('http://someurl.com/somefile.txt',';')
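As far as I know, dlmread expects a file name and numeric-only content, so data behind a url would probably have to be saved to disk first (for instance with urlwrite or websave). The sketch below stands in for that download step by writing a small hypothetical file to a temporary location and then reading it back. The file contents are invented for illustration.

```matlab
% Hypothetical numeric-only, semicolon-delimited file standing in for a download.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1;2;3\n4;5;6\n');
fclose(fid);

% dlmread takes a file name and a delimiter and returns a numeric matrix.
results = dlmread(fname, ';');

% Clean up the temporary file.
delete(fname);
```

If the file mixes text and numbers, as in the question, textscan is likely the better fit.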
