
How to use mutate() to create multiple variables from function with vector output?

Consider a tibble A with 2 columns, the first of which contains time stamps (POSIXct class), and an Interval object b containing 9 individual time intervals, which I created using lubridate::int_diff.

Using dplyr, I would like to add 9 new columns to the tibble A, each indicating whether the time stamp of a row falls within the corresponding interval. Put differently, I would like to apply the function %within% to each time stamp and distribute its vector output of length 9 across the 9 new columns.

What is the most effective way to do this using the dplyr package?

Example:

library(lubridate)
library(dplyr)

A <- tibble(Ts = ymd_hms(c("2018-01-01 15:12:04",
                       "2018-01-02 00:14:06","2018-01-05 12:00:00")),
        P = c(1:3))

ts.start <- ymd_hms("2018-01-01 15:00:00")
ts.end <- ymd_hms("2018-01-02 15:30:00")
ts <- c(ts.start,sort(ts.end - 
                    minutes(cumsum(c(15,15,30,30,60,60,60,60)))),ts.end)

b <- int_diff(ts)

# Applying %within% to the first element works
(A[[1,1]] %within% b) + 0

# The line with error.
mutate(A,New = Ts %within% b )

The last line produces an error, as expected. I would like to know how I can define new variables by applying a function with vector output to a column.

How about iterating through each element of Ts, checking which intervals it falls within, and appending the result to A?

# iterate through each element and output a list of matches for each element which
# corresponds to a row
out <- sapply(A$Ts, FUN = function(x, y) x %within% y, y = b, simplify = FALSE)

# append result to original data
cbind(A, do.call(rbind, out))

                   Ts P     1     2     3     4     5     6     7     8     9
1 2018-01-01 15:12:04 1  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
2 2018-01-02 00:14:06 2  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
3 2018-01-05 12:00:00 3 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
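
Since the question asks for a mutate()-based solution, the same idea can probably be expressed in dplyr by testing the whole Ts column against one interval at a time, which gives one logical vector of length 3 per interval. This is only a minimal sketch under a few assumptions: the column names in_1 to in_9 are made up for illustration, and the named list is spliced into mutate() with the !!! operator from tidy evaluation.

```r
library(lubridate)
library(dplyr)

# Same example data as in the question
A <- tibble(Ts = ymd_hms(c("2018-01-01 15:12:04",
                           "2018-01-02 00:14:06",
                           "2018-01-05 12:00:00")),
            P = 1:3)

ts.start <- ymd_hms("2018-01-01 15:00:00")
ts.end <- ymd_hms("2018-01-02 15:30:00")
ts <- c(ts.start,
        sort(ts.end - minutes(cumsum(c(15, 15, 30, 30, 60, 60, 60, 60)))),
        ts.end)
b <- int_diff(ts)

# For each interval, test every time stamp against that single interval.
# Each list element is a logical vector with one entry per row of A.
flags <- lapply(seq_along(b), function(i) A$Ts %within% b[i])

# Hypothetical column names, chosen for this sketch only
names(flags) <- paste0("in_", seq_along(b))

# Splice the named list into mutate(): one new column per interval
res <- mutate(A, !!!flags)
res
```

Looping over the 9 intervals rather than over the rows keeps each comparison vectorised over Ts, which should matter more as A grows.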

You could use a zucchini plot (I just made that name up) to visualize which interval each point falls into.

library(ggplot2)

xy <- data.frame(id = seq_along(b), start = int_start(b), end = int_end(b))
head(xy)

ggplot(xy) +
  theme_bw() +
  scale_fill_gradient(low = "#324706", high = "#aeb776") +
  geom_rect(aes(xmin = start, xmax = end, ymin = 0, ymax = nrow(A) + 0.5, fill = id),
            color = "white") +
  geom_hline(yintercept = A$P + 0.5, color = "grey") +
  geom_point(data = A, aes(x = Ts, y = P), color = "white", size = 2) +
  geom_point(data = A, aes(x = Ts, y = P), color = "black", size = 2, shape = 1)
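
If a single label per point is more convenient than nine logical columns, for example to annotate the plot, the matches can be collapsed to the index of the first matching interval. The following is a sketch only: m and idx are hypothetical names, NA marks a time stamp that lies outside every interval, and because %within% includes both end points, a time stamp sitting exactly on a shared boundary is assigned to the earlier interval.

```r
library(lubridate)

# Same example data as in the question
Ts <- ymd_hms(c("2018-01-01 15:12:04",
                "2018-01-02 00:14:06",
                "2018-01-05 12:00:00"))

ts.start <- ymd_hms("2018-01-01 15:00:00")
ts.end <- ymd_hms("2018-01-02 15:30:00")
ts <- c(ts.start,
        sort(ts.end - minutes(cumsum(c(15, 15, 30, 30, 60, 60, 60, 60)))),
        ts.end)
b <- int_diff(ts)

# Logical matrix: one row per time stamp, one column per interval
m <- sapply(seq_along(b), function(i) Ts %within% b[i])

# Index of the first interval containing each time stamp (NA if none)
idx <- apply(m, 1, function(r) which(r)[1])
idx
```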

