Data analysis Matlab/Excel

I have a data clustering problem. I have a series of events labeled by timestamps, and I am trying to count how many events fall in each 15-minute (and also 30-minute) block. I am doing this in Excel with a pivot table. I can manage the 15-minute blocks, but the problem is that if a block is empty, I need a zero in that block. Instead, Excel doesn't show that block at all.

So how do I make the empty blocks appear?

Related question: I am using these blocks to create a vector in Matlab, and so far I haven't found an easy way to do that. I am struggling a bit with how to easily import the pivot table results from Excel into Matlab.

Sample input:

30/11/12 12:42 AM
30/11/12 12:47 AM
30/11/12 12:56 AM
30/11/12 1:01 AM
30/11/12 1:52 AM
30/11/12 1:57 AM
30/11/12 2:38 AM
30/11/12 2:39 AM
30/11/12 6:00 AM
30/11/12 6:09 AM
30/11/12 6:16 AM
30/11/12 6:23 AM
30/11/12 6:31 AM

The pivot table will give:

12:30 1
12:45 2
1:00 1
1:45 2
2:30 2
6:00 2
6:15 2
6:30 1

The problem is that from this I want to create a vector (one element per 15 minutes for the whole day, meaning 24*4 = 96 elements) that has "1" if there was an event in that block and "0" if there was none.

So the output for 00:00 to 6:30 would look like this.

Output:

 vector = (0,0,1,1;1,0,0,1;0,0,1,0;0,0,0,0;0,0,0,0;0,0,0,0;1,1,1) 

where the semicolons divide the hours, just to make it easier to read.

How to tackle this? Any hints? Would this be easier to tackle in Matlab? The timestamps there are not as easy to work with as in Excel.

I'm not sure how to fix your Excel problem. But here is how to do this in Matlab:

%Data
dateStrings = {...
    '30/11/12 12:42 AM' ...
    '30/11/12 12:47 AM' ...
    '30/11/12 12:56 AM' ...
    '30/11/12 1:01 AM' ...
    '30/11/12 1:52 AM' ...
    '30/11/12 1:57 AM' ...
    '30/11/12 2:38 AM' ...
    '30/11/12 2:39 AM' ...
    '30/11/12 6:00 AM' ...
    '30/11/12 6:09 AM' ...
    '30/11/12 6:16 AM' ...
    '30/11/12 6:23 AM' ...
    '30/11/12 6:31 AM' ...
    };
%Convert data into datenums.  This is Matlab's standard numeric date encoding.
%    in units of days, starting at year 0000.
dateNumbers = datenum(dateStrings, 'dd/mm/yy HH:MM PM');

%Parametrically define the boundaries where you want to count
aggregationInterval = 1/24/4;  %15 minutes, in days
aggregationStart = datenum('2012-11-30 00:00','yyyy-mm-dd HH:MM');
aggregationStop = datenum('2012-11-30 03:00','yyyy-mm-dd HH:MM');

%Use parameters to construct a vector of counting boundaries
aggregationBoundaries = aggregationStart:aggregationInterval:aggregationStop;

%The function histc does all the work, and returns a vector of counts
counts = histc(dateNumbers, aggregationBoundaries);

%This creates a cell to give you something to look at. Instead of "disp" you could use "xlswrite" to put this back into Excel.
disp([...
    cellstr(datestr(aggregationBoundaries','yyyy-mm-dd HH:MM')) ...
    num2cell(counts)])

This displays:

'2012-11-30 00:00'    [0]
'2012-11-30 00:15'    [0]
'2012-11-30 00:30'    [1]
'2012-11-30 00:45'    [2]
'2012-11-30 01:00'    [1]
'2012-11-30 01:15'    [0]
'2012-11-30 01:30'    [0]
'2012-11-30 01:45'    [2]
'2012-11-30 02:00'    [0]
'2012-11-30 02:15'    [0]
'2012-11-30 02:30'    [2]
'2012-11-30 02:45'    [0]
'2012-11-30 03:00'    [0]
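
The question asks for a 96-element 0/1 vector covering the whole day rather than counts. One possible way to get there is sketched below, assuming all timestamps fall on a single calendar day. It swaps histc for whole-minute arithmetic plus accumarray, so an event that lands exactly on a block boundary does not depend on floating-point rounding of the bin edges. The names dayStart, blockMinutes, blockIndex and eventFlags are illustrative and not part of the answer above.

```matlab
% Sketch: 0/1 vector with one element per 15-minute block of a single day.
% Assumption: every timestamp falls on the same calendar day.
dateStrings = { ...
    '30/11/12 12:42 AM' '30/11/12 12:47 AM' '30/11/12 12:56 AM' ...
    '30/11/12 1:01 AM'  '30/11/12 1:52 AM'  '30/11/12 1:57 AM' ...
    '30/11/12 2:38 AM'  '30/11/12 2:39 AM'  '30/11/12 6:00 AM' ...
    '30/11/12 6:09 AM'  '30/11/12 6:16 AM'  '30/11/12 6:23 AM' ...
    '30/11/12 6:31 AM'};
dateNumbers = datenum(dateStrings, 'dd/mm/yy HH:MM PM');

blockMinutes = 15;                      % set to 30 for half-hour blocks
blocksPerDay = 24*60 / blockMinutes;    % 96 blocks when blockMinutes is 15

% Whole minutes since midnight of the first event's day.
dayStart = floor(dateNumbers(1));
minutesSinceMidnight = round((dateNumbers - dayStart) * 24 * 60);

% 1-based block index: 00:00-00:14 -> 1, 00:15-00:29 -> 2, and so on.
blockIndex = floor(minutesSinceMidnight / blockMinutes) + 1;

% Count events per block; blocks without events stay at zero.
counts = accumarray(blockIndex(:), 1, [blocksPerDay 1]);

% Row vector: 1 where at least one event occurred, 0 otherwise.
eventFlags = double(counts > 0)';

% Show one row per hour to make the result easier to read.
disp(reshape(eventFlags, 60/blockMinutes, [])')
```

With the sample data, the first 27 elements of eventFlags match the vector given in the question. Setting blockMinutes = 30 gives the 48-element half-hour version.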

If your dates are already in Excel, you can also look at xlsread to read the values into Matlab without any text formatting.
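
A rough sketch of that route follows. Depending on the platform and on whether xlsread runs in 'basic' mode, date cells may come back as text or as Excel serial date numbers. If they arrive as serial numbers, they count days from a different origin than Matlab's datenum, and for workbooks using Excel's default 1900 date system, adding 693960 converts them. The file name events.xlsx is hypothetical, and the serial values are hard-coded here so the snippet runs without a workbook.

```matlab
% Sketch: convert Excel serial dates to Matlab datenums.
% Hypothetical read (file name and layout are assumptions, not from the post):
%   excelSerial = xlsread('events.xlsx');
% Hard-coded stand-ins for cells holding 30/11/12 12:42 AM and 6:00 AM:
excelSerial = [41243 + 42/1440; 41243.25];

% Offset between Excel's 1900 date system and Matlab's datenum origin.
excelToDatenumOffset = 693960;
dateNumbers = excelSerial + excelToDatenumOffset;

% Print the converted values as readable timestamps for a quick check.
disp(datestr(dateNumbers, 'yyyy-mm-dd HH:MM'))
```

If the values come back as text instead, the datenum call with an explicit format string shown in the answer applies directly.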
