
Argument is not numeric or logical with function rollapply, followed by NAs introduced by coercion

I am trying to calculate the mean of every 3 observations in my data frame. The data are recorded at 10-minute intervals, and I want to average them down to half-hourly values. My data looks like this:

Date                Value
2017-09-20 09:19:59 96.510
2017-09-20 09:30:00 113.290
2017-09-20 09:40:00 128.370
2017-09-20 09:50:00 128.620
2017-09-20 10:00:00 94.080
2017-09-20 10:10:00 208.150
2017-09-20 10:20:00 178.820
2017-09-20 10:30:00 208.440
2017-09-20 10:40:00 285.490
2017-09-20 10:49:59 305.020

I first tried calculating the means with the rollapply function from the zoo package (library(zoo)) in the following way:

means <- rollapply(df, by=3, 3, FUN=mean)

However, I got 50 warnings saying:

In mean.default(data[posns], ...) : argument is not numeric or logical: returning NA

I checked the classes of my columns: Value is numeric and Date is a factor. I then tried to convert Date (a factor) to a date class with:

df$Date <- as.Date(df, format = "%Y-%m-%d %H:%m:%s")

and

df$Date <- strptime(time,"%Y-%m-%d %H:%M:%S",tz="GMT")

but neither worked.

I also tried to calculate the means with aggregate, but that did not work either:

library(chron)
aggregate(chron(times=Date) ~ Value, data=df, FUN=mean)

and I got:

Error in convert.times(times., fmt) : format h:m:s may be incorrect
In addition: Warning message:
In convert.times(times., fmt) : NAs introduced by coercion

I am desperate at this point, and I am sorry for asking here. Maybe there is something wrong with my data, since it was originally an xlsx file and I converted the weird Excel times into dates in R. I also wonder whether the problem is that some of the timestamps end in :59 seconds. I can post my entire data online if that's helpful. Many thanks!

The code in the question passes the whole data frame to rollapply, which coerces df to a matrix. Because the Date column is not numeric, the result is a character matrix, so the rolling mean is attempted on two columns that are both character. That is what produces the warnings.
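The coercion can be checked in base R without zoo. This is only a minimal sketch: the small data frame below is an assumption that mirrors the question (factor Date, numeric Value), using the first three rows of the sample data.

```r
# Assumed stand-in for the question's df: Date is a factor, Value is numeric
df <- data.frame(
  Date  = factor(c("2017-09-20 09:19:59",
                   "2017-09-20 09:30:00",
                   "2017-09-20 09:40:00")),
  Value = c(96.51, 113.29, 128.37)
)

# A data frame with any non-numeric column becomes a character matrix
m <- as.matrix(df)
is_char <- is.character(m)

# mean() on a character vector returns NA with the warning
# "argument is not numeric or logical: returning NA"
value_mean <- suppressWarnings(mean(m[, "Value"]))

print(is_char)
print(value_mean)
```

Even the Value column ends up as text in the matrix, which is why both columns fail.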

It's so much easier if you use a time series representation. Data frames are really not ideal for representing time series since you are constantly coordinating the time column and the data whereas if you represent it as a zoo object that will all automatically be handled.

First convert df to a zoo series, then run rollapplyr. Optionally convert the result back to a data frame, or just leave it as a zoo object.

library(zoo)

z <- read.zoo(df)
Value <- rollapplyr(z, 3, by = 3, mean)
# fortify.zoo(Value)  # optional: convert the result back to a data frame

If you want to express this using pipes then try this:

library(magrittr)
library(zoo)

Value <- df %>% read.zoo %>% rollapplyr(3, by = 3, mean)

Note

The input df that was used, in reproducible form, is:

df <-
structure(list(Date = structure(c(1505913599, 1505914200, 1505914800, 
1505915400, 1505916000, 1505916600, 1505917200, 1505917800, 1505918400, 
1505918999), class = c("POSIXct", "POSIXt"), tzone = ""), Value = c(96.51, 
113.29, 128.37, 128.62, 94.08, 208.15, 178.82, 208.44, 285.49, 
305.02)), class = "data.frame", row.names = c(NA, -10L))
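Note that Date in the df above is already POSIXct. If your Date column is still a factor, as described in the question, it probably needs to be converted first. A minimal sketch in base R, assuming the timestamps look like the sample data and using GMT as a placeholder time zone (the two-row data frame is made up for illustration):

```r
# Assumed stand-in for a df whose Date column is still a factor
df <- data.frame(
  Date  = factor(c("2017-09-20 09:19:59", "2017-09-20 09:30:00")),
  Value = c(96.51, 113.29)
)

# Convert the column (not the whole data frame) via character.
# %M is minutes and %S is seconds; %m would mean month.
df$Date <- as.POSIXct(as.character(df$Date),
                      format = "%Y-%m-%d %H:%M:%S",
                      tz = "GMT")

str(df)
```

The :59 seconds at the end of some timestamps are not a problem for this format.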

The main issue is that you are applying rollapply to the whole data frame instead of a single column or a vector. If I understand your goal correctly, the following should do the job:

library(dplyr)
library(zoo)

df %>%
  # compute rolling means with a window width of 3
  mutate(means = rollmeanr(Value, k = 3, fill = NA)) %>%
  # decrease the frequency in accordance with the window width
  filter(seq_len(nrow(df)) %% 3 == 0) # or alternatively, slice(seq(3, nrow(df), 3))

# # A tibble: 3 x 3
#   Date                Value means
#   <dttm>              <dbl> <dbl>
# 1 2017-09-20 09:40:00  128.  113.
# 2 2017-09-20 10:10:00  208.  144.
# 3 2017-09-20 10:40:00  285.  224.
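To sanity-check those numbers, the same non-overlapping means can be computed in base R. This is just a sketch of the arithmetic, not a replacement for the code above; the names k, n_full and block are made up here, and the incomplete last block (the 10th value) is dropped.

```r
Value <- c(96.51, 113.29, 128.37, 128.62, 94.08,
           208.15, 178.82, 208.44, 285.49, 305.02)

k <- 3
# Number of observations that fit into complete blocks of k
n_full <- (length(Value) %/% k) * k
# Block label for each kept observation: 0,0,0,1,1,1,2,2,2
block <- (seq_len(n_full) - 1) %/% k

# Mean of each block of k consecutive values
block_means <- as.vector(tapply(Value[seq_len(n_full)], block, mean))
round(block_means, 2)
# 112.72 143.62 224.25
```

These agree with the means column above, which the tibble printout shows rounded to three significant digits.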

Data:

df <- structure(list(Date = structure(c(1505917199, 1505917800, 1505918400, 
1505919000, 1505919600, 1505920200, 1505920800, 1505921400, 1505922000, 
1505922599), class = c("POSIXct", "POSIXt"), tzone = ""), Value = c(96.51, 
113.29, 128.37, 128.62, 94.08, 208.15, 178.82, 208.44, 285.49, 
305.02)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", 
"data.frame"))
