
R: convert a data.frame to a daily time series object

I have a data.frame that looks something like this:

  name units_sold order_date
1 obj1         10 2013-09-21
2 obj1         10 2013-09-22
3 obj1         10 2013-09-23
4 obj2        100 2013-09-21
5 obj2        200 2013-09-22
6 obj2        300 2013-09-23
7 obj3         70 2013-09-21
8 obj3        200 2013-09-22
9 obj3         50 2013-09-23

I want to convert it to a time series object with the values laid out in the format below:

       2013-09-21  2013-09-22 2013-09-23
obj1      10            10         10
obj2      100           200        300
obj3      70            200        50

... and so on, for a week.

In R, a multivariate series is normally represented with one series per column, not per row. Using the zoo package you can read the data in like this (to keep the example self-contained it is read from a character string, but you would want to replace text=Lines with something like file="myfile.dat"):

Lines <- "name units_sold order_date
1 obj1         10 2013-09-21
2 obj1         10 2013-09-22
3 obj1         10 2013-09-23
4 obj2        100 2013-09-21
5 obj2        200 2013-09-22
6 obj2        300 2013-09-23
7 obj3         70 2013-09-21
8 obj3        200 2013-09-22
9 obj3         50 2013-09-23
"

library(zoo)
z <- read.zoo(text = Lines, header = TRUE, index = 3, split = 1)

which gives:

> z
           obj1 obj2 obj3
2013-09-21   10  100   70
2013-09-22   10  200  200
2013-09-23   10  300   50

From this point on you can plot it (plot(z)), convert it to a ts series (as.ts(z), although daily time series are not normally used with ts) and do many other operations. See the five zoo vignettes (PDFs) and the zoo help pages.

(Note that in this case header=TRUE is not actually necessary: the first line is recognized as a header because the remaining lines have one more field, i.e. they have row names whereas the first line does not.)
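Since the question starts from a data.frame rather than a file, here is a minimal sketch of the same call applied to a data.frame directly. The data.frame name dat and the way it is constructed are assumptions made for illustration; the explicit format argument is added only to make the date parsing unambiguous.

```r
library(zoo)

# Hypothetical data.frame mirroring the data shown in the question.
dat <- data.frame(
  name       = rep(c("obj1", "obj2", "obj3"), each = 3),
  units_sold = c(10, 10, 10, 100, 200, 300, 70, 200, 50),
  order_date = rep(c("2013-09-21", "2013-09-22", "2013-09-23"), times = 3),
  stringsAsFactors = FALSE
)

# read.zoo also accepts a data.frame: column 3 (order_date) becomes the
# time index and column 1 (name) splits units_sold into one series per name.
z <- read.zoo(dat, index.column = 3, split = 1, format = "%Y-%m-%d")

print(z)
```

The result should have the same shape as the z shown above: one row per date and one column per object.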

I don't think the expected output is a ts object. I read your question as a reshaping problem, from long format to wide format. Here are two methods:

Using dcast from the reshape2 package:

library(reshape2)
dcast(dat, name ~ order_date, value.var = "units_sold")

  name 2013-09-21 2013-09-22 2013-09-23
1 obj1         10         10         10
2 obj2        100        200        300
3 obj3         70        200         50

Using reshape from base R (the stats package):

reshape(dat, direction = "wide", idvar = "name", timevar = "order_date")

  name units_sold.2013-09-21 units_sold.2013-09-22 units_sold.2013-09-23
1 obj1                    10                    10                    10
4 obj2                   100                   200                   300
7 obj3                    70                   200                    50
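Both snippets assume the question's data is already in a data.frame called dat. As a rough sketch, the following builds such a data.frame (the construction is an assumption, not part of the original post), runs the base reshape call, and then optionally strips the units_sold. prefix so the column names match the layout asked for in the question.

```r
# Hypothetical reconstruction of the question's data as `dat`.
dat <- data.frame(
  name       = rep(c("obj1", "obj2", "obj3"), each = 3),
  units_sold = c(10, 10, 10, 100, 200, 300, 70, 200, 50),
  order_date = rep(c("2013-09-21", "2013-09-22", "2013-09-23"), times = 3),
  stringsAsFactors = FALSE
)

# Long-to-wide reshape with base R: one row per name, one column per date.
wide <- reshape(dat, direction = "wide", idvar = "name", timevar = "order_date")

# Optional tidy-up: drop the "units_sold." prefix and reset the row names
# (reshape keeps the row name of the first row of each group).
names(wide) <- sub("^units_sold\\.", "", names(wide))
rownames(wide) <- NULL

print(wide)
```

The renaming step is purely cosmetic; skip it if the prefixed column names are acceptable.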
