Difficulty converting a data.frame to a time series object in R

I am new to R. I have a tab-separated text file containing daily sales data. Each row has the format product-id, day0, day1, day2, day3, and so on. Part of the input file is shown below:

productid   0   1   2   3   4   5   6
1           53  40  37  45  69  105 62
4           0   0   2   4   0   8   0
5           57  133 60  126 90  87  107
6           108 130 143 92  88  101 66
10          0   0   2   0   4   0   36
11          17  22  16  15  45  32  36

I used the code below to read the file:

pdInfo <- read.csv("products.txt",header = TRUE, sep="\t")

This reads the entire file, and the resulting variable pdInfo is a data frame. I would like to convert this data frame to a time series object for further processing. When I run a stationarity test, the augmented Dickey–Fuller (ADF) test, it raises an error. I tried the code below:

x <- ts(data.matrix(pdInfo),frequency = 1)
adf <- adf.test(x)

  error: Error in adf.test(x) : x is not a vector or univariate time series

Thanks in advance for any suggestions.

In R, time series are usually stored as one row per date, whereas your data has one column per date. You probably need to transpose the data before converting it to a ts object.

First transpose it:

y <- t(pdInfo)

Then turn the top row (the product ids) into the column names:

colnames(y) <- y[1, ]
y <- y[-1, ]  # drop the first row, which now duplicates the column names

The conversion should then work:

x <- ts(y, frequency = 1)
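
Note that x is now a multivariate series with one column per product, while the error message in the question says adf.test() expects a vector or a univariate series. A minimal sketch of running the test one column at a time is shown below. It assumes the tseries package is installed and that the columns were read in as X0 to X6 (the names read.csv() generates by default); the sample rows are copied from the question. With only seven observations per product the p-values are not meaningful; the sketch only illustrates the mechanics.

```r
library(tseries)  # provides adf.test()

# Sample rows from the question; X0..X6 are the assumed column names.
pdInfo <- data.frame(
  productid = c(1, 4, 5, 6, 10, 11),
  X0 = c(53, 0, 57, 108, 0, 17),
  X1 = c(40, 0, 133, 130, 0, 22),
  X2 = c(37, 2, 60, 143, 2, 16),
  X3 = c(45, 4, 126, 92, 0, 15),
  X4 = c(69, 0, 90, 88, 4, 45),
  X5 = c(105, 8, 87, 101, 0, 32),
  X6 = c(62, 0, 107, 66, 36, 36)
)

# Transpose so that each row is a day and each column is a product.
y <- t(pdInfo)
colnames(y) <- y[1, ]
y <- y[-1, ]
x <- ts(y, frequency = 1)

# Test each column separately, since adf.test() takes a single series.
# adf.test() warns when the p-value falls outside its lookup table;
# the warnings are suppressed here only to keep the output short.
p_values <- sapply(seq_len(ncol(x)), function(j) {
  suppressWarnings(adf.test(x[, j])$p.value)
})
names(p_values) <- colnames(x)
print(p_values)
```

The result is a named numeric vector with one p-value per product id.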
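
# load the required packages

library(purrr)    # map(), map_dbl()
library(dplyr)    # arrange(), group_by(), mutate(), select()
library(tidyr)    # gather(), nest()
library(tseries)  # adf.test()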

# create the data

df <- structure(list(productid = c(1L, 4L, 5L, 6L, 10L, 11L), 
                     X0 = c(53L, 0L, 57L, 108L, 0L, 17L), 
                     X1 = c(40L, 0L, 133L, 130L, 0L, 22L), 
                     X2 = c(37L, 2L, 60L, 143L, 2L, 16L), 
                     X3 = c(45L, 4L, 126L, 92L, 0L, 15L), 
                     X4 = c(69L, 0L, 90L, 88L, 4L, 45L), 
                     X5 = c(105L, 8L, 87L, 101L, 0L, 32L), 
                     X6 = c(62L, 0L, 107L, 66L, 36L, 36L)), 
                .Names = c("productid", "0", "1", "2", "3", "4", "5", "6"), 
                class = "data.frame", row.names = c(NA, -6L))

# apply adf.test to each productid and return p.value

adfTest <- df %>%
  gather(key = day, value = sales, -productid) %>%
  mutate(day = as.integer(day)) %>%  # sort days numerically, not as text
  arrange(productid, day) %>%
  group_by(productid) %>%
  nest() %>%
  mutate(adf = map(data, ~ adf.test(as.ts(.x$sales))),
         adf.p.value = map_dbl(adf, "p.value")) %>%
  ungroup() %>%
  select(productid, adf.p.value)
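
One caveat for files with ten or more day columns: day labels taken from column names are character strings, and sorting strings puts "10" before "2". That would scramble the order of observations within each product before the test runs. The small base-R illustration below uses a hypothetical set of twelve day labels (not taken from the question's file) to show the difference.

```r
# Hypothetical day labels for a file with twelve day columns.
days <- as.character(0:11)

# Sorted as text: "10" and "11" land before "2".
text_order <- sort(days)
print(text_order)

# Sorted as integers: the chronological order is preserved.
numeric_order <- sort(as.integer(days))
print(numeric_order)
```

Converting the day column to integer before arranging avoids the problem.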
