
Approx function with group_by and across in R

I am currently interpolating a time series and need to use the approx function on a data frame with 4 columns and 172660 rows, split into 4 groups (so 43165 rows per group). There are currently two answers about this: one using summarise, but interpolating just one column, and one using data.table. The first approach does work, but not for my purpose. I also noted that mutate_at, for example, is superseded by mutate(across()). So I was trying to use a more up-to-date approach, but it's not working.

library(tidyverse)
tabela_1 <- tibble(x1 = rnorm(4800, mean = 88.5, sd = 4),
                   x2 = rnorm(4800, mean = -38.526, sd = 2.758),
                   x3 = rnorm(4800, mean = -22.6852, sd = 1.8652),
                   x4 = rnorm(4800, mean = -38.526, sd = 2.758),
                   tmpts = rep(x = seq(from = 0, to = 863.28, by = 0.72), 
                               times = 4),
                   category = rep(x = 1:4, each = 1200))
tabela <- tibble(tmpts = rep(x = seq(from = 0, to = 863.28, by = 0.02), 
                             times = 4),
                 category = rep(x = 1:4, each = 43165))
        
tabela_joined <- tabela %>% 
            left_join(tabela_1, by = c("tmpts", "category")) %>% 
            arrange(category, tmpts) %>% 
            janitor::clean_names()
        
tabela_interpolation <- tabela_joined %>% 
            group_by(category) %>%
            summarize(across(.cols = x1:x4, approx(., n = 43165)))

When running tabela_interpolation, I receive:

Error: Problem with `summarise()` input `..1`.
i `..1 = across(.cols = x1:x15, approx(., n = 43165))`.
x Can't convert an integer vector to function
i The error occurred in group 1: run = 1.
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
In regularize.values(x, y, ties, missing(ties), na.rm = na.rm) :
  collapsing to unique 'x' values

How should I use summarise together with across to get the interpolated time series from the approx function for each column of the data frame?

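across() expects a function (or a lambda) as its second argument, whereas approx(., n = 43165) is a call that is evaluated on the spot. Pass the function itself and supply n as an extra argument:

library(tidyverse)

tabela_joined %>% 
  group_by(category) %>%
  # pass approx itself; n is forwarded to it for every column
  summarize(across(x1:x4, approx, n = 43165)) %>%
  ungroup()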

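Or, equivalently, with a lambda (the form recommended in recent dplyr versions, where passing extra arguments through across() is deprecated):

tabela_joined %>% 
  group_by(category) %>%
  # the formula builds a function; . stands for the current column
  summarize(across(x1:x4, ~approx(., n = 43165))) %>%
  ungroup()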
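
To see what each column holds after this step, it may help to look at what approx returns for a single vector. The sketch below uses a made-up toy vector (v is hypothetical, not part of the question's data). When only one vector is supplied, approx uses the positions 1..length(v) as x and the values as y, drops the NA pairs, and returns a list with two components, x and y.

```r
# Hypothetical toy vector with gaps (not from the question's data)
v <- c(10, NA, 20, NA, 30)

# Only one vector is given, so positions 1..5 act as x and v acts as y;
# NA pairs are dropped before interpolating
res <- approx(v, n = 5)

names(res)  # the two components of the returned list
res$x       # the n evenly spaced points where interpolation happened
res$y       # the linearly interpolated values at those points
```

Because the result is a two-element list, each summarised column becomes a list column holding an x element and a y element per group.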

Either version can be followed by unnest() to expand the list columns into a regular data frame.

tabela_joined %>% 
  group_by(category) %>%
  summarize(across(x1:x4, approx, n = 43165)) %>%
  ungroup() %>%
  # expand the list columns returned by approx into ordinary rows
  unnest(x1:x4)

#   category    x1    x2    x3    x4
#      <int> <dbl> <dbl> <dbl> <dbl>
# 1        1     1     1     1     1
# 2        1     2     2     2     2
# 3        1     3     3     3     3
# 4        1     4     4     4     4
# 5        1     5     5     5     5
# 6        1     6     6     6     6
# 7        1     7     7     7     7
# 8        1     8     8     8     8
# 9        1     9     9     9     9
#10        1    10    10    10    10
# … with 345,310 more rows
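
Note that the unnested result stacks both components that approx returns (the x positions and the interpolated y values), which is why the first rows above show 1, 2, 3, and so on. If you only want the interpolated values, aligned with the existing tmpts column, one alternative is to use mutate and keep just the y component. This is a minimal sketch on a hypothetical toy tibble (toy and toy_interp are made-up names), assuming each group has at least two non-missing values per column:

```r
library(dplyr)

# Hypothetical toy data: two groups, values known only at every other time point
toy <- tibble(
  category = rep(1:2, each = 5),
  tmpts    = rep(c(0, 1, 2, 3, 4), times = 2),
  x1       = c(10, NA, 20, NA, 30, 1, NA, 3, NA, 5),
  x2       = c(0, NA, 4, NA, 8, 100, NA, 80, NA, 60)
)

toy_interp <- toy %>%
  group_by(category) %>%
  # interpolate each column against this group's tmpts and keep only $y,
  # so the result has one value per existing row
  mutate(across(x1:x2, ~ approx(x = tmpts, y = .x, xout = tmpts)$y)) %>%
  ungroup()

toy_interp
```

With the default rule = 1, approx returns NA for any xout outside the range of the known points, so leading or trailing gaps within a group stay missing.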
