How to subtract odd-numbered rows from even-numbered rows across columns using dplyr in R
My data frame looks like this:
df <-data.frame(col1=c(1,2,3,4), col2=c(5,6,7,8), time=rep(c("0h","72h"),2))
  col1 col2 time
1    1    5   0h
2    2    6  72h
3    3    7   0h
4    4    8  72h
I want to use mutate with across, or any other dplyr function (preferably), to subtract the 0h value in the previous row from each 72h value, in every column.
I would like my data to look like this:
col1 col2 time
   1    1  72h
   1    1  72h
Base R (with time stored as a number so that the whole row can be subtracted):
df <-data.frame(col1=c(1,2,3,4), col2=c(5,6,7,8), time=rep(c(0,72),2))
df[c(FALSE,TRUE), ] - df[c(TRUE, FALSE), ]
#>   col1 col2 time
#> 2    1    1   72
#> 4    1    1   72
Created on 2021-07-06 by the reprex package (v2.0.0)
tidyverse, using the approach from @Emir Dakin:
library(tidyverse)
df <-data.frame(col1=c(1,2,3,4), col2=c(5,6,7,8), time=rep(c("0h", "72h"),2))
df %>%
  mutate(across(where(is.numeric), ~ .x - lag(.x, default = first(.x)))) %>%
  filter(time == "72h")
#>   col1 col2 time
#> 1    1    1  72h
#> 2    1    1  72h
Created on 2021-07-06 by the reprex package (v2.0.0)
You can use the lag function if the data is neatly ordered as you've shown. This is a very straightforward application, but it should work; I don't think you need anything other than mutate:
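library(dplyr)
df %>%
  # subtract the previous row's value; the first row is compared with itself
  mutate(col1 = col1 - lag(col1, default = first(col1)),
         col2 = col2 - lag(col2, default = first(col2))) %>%
  # keep only the 72h rows, whose previous row is the matching 0h row
  filter(time == "72h")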
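To see why the lag approach works, it may help to look at what happens on a single column. This is only a minimal sketch, assuming dplyr is installed; the vector x is col1 from the question, and the names shifted and diffs are made up for illustration.

```r
library(dplyr)

x <- c(1, 2, 3, 4)

# lag() shifts the vector down by one position;
# the empty first slot is filled with `default`
shifted <- lag(x, default = first(x))

# each element minus the element before it;
# the first difference is 0 because the first row is compared with itself
diffs <- x - shifted

# only the even positions (the 72h rows) are differences within a 0h/72h pair
diffs[c(FALSE, TRUE)]
```

The difference at position 3 compares a 0h row with the 72h row of the previous pair, which is why the filter on time == "72h" is needed to discard it.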
Building on the answer by Emir Dakin, I have added a check that pairs rows by the order in which each time value occurs:
library(dplyr)
df %>%
  group_by(time) %>%
  mutate(sl = seq_along(time)) %>%  # n-th occurrence of each time value
  group_by(sl) %>%                  # rows with the same occurrence number form a pair
  mutate(col1 = col1 - lag(col1, default = first(col1), order_by = time),
         col2 = col2 - lag(col2, default = first(col2), order_by = time)) %>%
  ungroup() %>% filter(time == "72h") %>% select(col1, col2, time)
# A tibble: 2 x 3
   col1  col2 time
  <dbl> <dbl> <chr>
1     1     1 72h
2     1     1 72h
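The point of the occurrence counter shows up when the rows are not stored strictly as 0h then 72h. The sketch below is an illustration under assumptions, not part of the original answer: df2 is a hypothetical reordering of the question's data in which the first pair is stored 72h first, and row_number() is used in place of seq(time) to number the occurrences.

```r
library(dplyr)

# Hypothetical reordering of the question's data: the first pair is stored 72h-first
df2 <- data.frame(col1 = c(2, 1, 3, 4),
                  col2 = c(6, 5, 7, 8),
                  time = c("72h", "0h", "0h", "72h"))

res <- df2 %>%
  group_by(time) %>%
  mutate(sl = row_number()) %>%   # n-th occurrence of each time value
  group_by(sl) %>%                # rows sharing an occurrence number form a pair
  mutate(col1 = col1 - lag(col1, default = first(col1), order_by = time),
         col2 = col2 - lag(col2, default = first(col2), order_by = time)) %>%
  ungroup() %>%
  filter(time == "72h") %>%
  select(col1, col2, time)

res
```

Because lag is ordered by time within each pair, the 72h row is compared with its own 0h row even though it comes first in df2, so both remaining rows should again show a difference of 1 in col1 and col2. A plain lag over the whole column would instead compare the first 72h row with itself.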
Or:
library(tidyverse)
df <-data.frame(col1=c(1,2,3,4), col2=c(5,6,7,8), time=rep(c("0h","72h"),2))
df %>%
  mutate(id = rep(seq(nrow(df) / 2), each = 2), # create an id of what belongs together
         tmp = rep(c("start", "end"), nrow(df) / 2),
         time = as.numeric(str_remove(time, "h"))) %>%
  mutate_at(vars("col1":"time"), ~if_else(tmp == "start", .x * -1, .x)) %>%
  group_by(id) %>%
  summarise_at(vars("col1":"time"), sum)
# # A tibble: 2 x 4
#      id  col1  col2  time
#   <int> <dbl> <dbl> <dbl>
# 1     1     1     1    72
# 2     2     1     1    72