
How to subtract odd-numbered rows from even-numbered rows across columns using dplyr in R

My data frame looks like this:

df <-data.frame(col1=c(1,2,3,4), col2=c(5,6,7,8), time=rep(c("0h","72h"),2))
  col1 col2 time
1    1    5   0h
2    2    6  72h
3    3    7   0h
4    4    8  72h

I would like to use mutate with across, or any other dplyr function (preferably), to subtract in each column the 0h value in the previous row from the 72h value.

I would like my data to look like this:

  col1 col2 time
     1    1   72h
     1    1   72h

Base R:

df <-data.frame(col1=c(1,2,3,4), col2=c(5,6,7,8), time=rep(c(0,72),2))

df[c(FALSE,TRUE), ] - df[c(TRUE, FALSE), ]
#>   col1 col2 time
#> 2    1    1   72
#> 4    1    1   72

Created on 2021-07-06 by the reprex package (v2.0.0)
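The base R answer above converts time to a number so that the whole data frame can be subtracted. If time should stay as text ("0h"/"72h"), one possible variant is to subtract only the numeric columns. This is a minimal sketch, not part of the original answer; the names num_cols, odd, even and result are made up for illustration, and it assumes the rows strictly alternate 0h, 72h.

```r
df <- data.frame(col1 = c(1, 2, 3, 4), col2 = c(5, 6, 7, 8),
                 time = rep(c("0h", "72h"), 2))

# Logical vector marking which columns are numeric (hypothetical helper name)
num_cols <- vapply(df, is.numeric, logical(1))

# Row positions of the 0h (odd) and 72h (even) rows; assumes strict alternation
odd  <- seq(1, nrow(df), by = 2)
even <- seq(2, nrow(df), by = 2)

# Start from the 72h rows, then overwrite only the numeric columns
result <- df[even, ]
result[num_cols] <- df[even, num_cols] - df[odd, num_cols]
rownames(result) <- NULL
result
```

The time column is carried over unchanged from the even rows, so it keeps its original "72h" labels.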

Tidyverse, using @Emir Dakin's approach:

library(tidyverse)
df <-data.frame(col1=c(1,2,3,4), col2=c(5,6,7,8), time=rep(c("0h", "72h"),2))

df %>%
  mutate(across(where(is.numeric), ~.x - lag(.x, default = first(.x)))) %>%
  filter(time == "72h")
#>   col1 col2 time
#> 1    1    1  72h
#> 2    1    1  72h

Created on 2021-07-06 by the reprex package (v2.0.0)

You can use the lag function if the data is neatly ordered as you've shown. This is a very straightforward application, but it should work; I don't think you need anything other than mutate:

library(dplyr)

df %>%
  mutate(col1 = col1 - lag(col1, default = first(col1)),
         col2 = col2 - lag(col2, default = first(col2))) %>%
  filter(time == "72h")
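To make the reliance on row order explicit, here is a small sketch of what lag does to one column, with a guard that stops if the rows do not alternate 0h, 72h. The guard and the names shifted and out are illustrative assumptions, not part of the original answer.

```r
library(dplyr)

df <- data.frame(col1 = c(1, 2, 3, 4), col2 = c(5, 6, 7, 8),
                 time = rep(c("0h", "72h"), 2))

# Guard (an added assumption): the rows must strictly alternate 0h, 72h
stopifnot(all(df$time == rep(c("0h", "72h"), length.out = nrow(df))))

# lag() shifts the vector down by one; the first slot is filled with `default`
shifted <- lag(df$col1, default = first(df$col1))
shifted
#> [1] 1 1 2 3

# Each row minus the row before it, keeping only the 72h rows
out <- df %>%
  mutate(col1 = col1 - lag(col1, default = first(col1)),
         col2 = col2 - lag(col2, default = first(col2))) %>%
  filter(time == "72h")
out
```

Because every 0h row is also differenced against the 72h row before it, the final filter is what discards those unwanted differences.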

Building on Emir Dakin's answer, I added a check on the order in which the time values occur:

library(dplyr)
df %>%
  group_by(time) %>% mutate(sl = row_number()) %>%  # n-th occurrence of each time value
  group_by(sl) %>%  # pair the n-th 0h row with the n-th 72h row
  mutate(col1 = col1 - lag(col1, default = first(col1), order_by = time),
         col2 = col2 - lag(col2, default = first(col2), order_by = time)) %>%
  ungroup() %>% filter(time == "72h") %>% select(col1, col2, time)

# A tibble: 2 x 3
   col1  col2 time 
  <dbl> <dbl> <chr>
1     1     1 72h  
2     1     1 72h  

Or:

library(tidyverse)

df <-data.frame(col1=c(1,2,3,4), col2=c(5,6,7,8), time=rep(c("0h","72h"),2))

df %>%
  mutate(id = rep(seq(nrow(df) / 2), each = 2), # create an id of what belongs together
         tmp = rep(c("start", "end"), nrow(df) / 2),
         time = as.numeric(str_remove(time, "h"))) %>%
  mutate_at(vars("col1":"time"), ~if_else(tmp == "start", .x * -1, .x)) %>%
  group_by(id) %>%
  summarise_at(vars("col1":"time"), sum) 

# # A tibble: 2 x 4
# id  col1  col2  time
# <int> <dbl> <dbl> <dbl>
# 1     1     1     1    72
# 2     2     1     1    72
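The same pairing idea can also be sketched with across, which supersedes mutate_at and summarise_at in current dplyr. This is an alternative formulation, not the answerer's code; the id construction and the name res are assumptions for illustration, and it still requires each pair to be ordered 0h then 72h.

```r
library(dplyr)

df <- data.frame(col1 = c(1, 2, 3, 4), col2 = c(5, 6, 7, 8),
                 time = rep(c("0h", "72h"), 2))

res <- df %>%
  # Rows 1-2 get id 1, rows 3-4 get id 2, and so on (hypothetical pairing id)
  mutate(id = (row_number() + 1) %/% 2) %>%
  group_by(id) %>%
  # Within each pair: 72h value (last) minus 0h value (first);
  # the grouping column id is skipped by across() automatically
  summarise(across(where(is.numeric), ~ last(.x) - first(.x)),
            time = last(time))
res
```

Unlike the version above, this keeps time as the original "72h" label instead of converting it to a number.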
