Subtracting one column by another by the first row in every 5 rows in dataframe
Say I have a data frame, df1, such that:
set.seed(123)
# Six visits for each of two subjects
df1 <- data.frame(visit = rep(c("scr", "1mo", "3mo", "6mo0", "12mo", "2yr"), 2))
df1$id <- rep(c("101", "102"), each = 6)
df1$x <- sample(0:30, 12, replace = TRUE)
# Desired result: change in x from each id's screening ("scr") visit
df1$want <- c(0, -16, -12, -17, -28, -21, 0, 4, -7, -13, 2, -4)
What I would like to do: for every row, subtract the x value of the screening row ("scr") for the same ID, to create a change-from-screening variable (the result can be negative). So it's essentially looping through a set of visits to calculate the change from the screening visit, then repeating that for each ID/set of visits (this dummy set has two IDs).
I've tried looking on here for similar answers, and the closest I could get was using mutate() from dplyr. All the answers I found either show how to subtract lagging or leading rows, or how to mutate when certain conditions match.
I could maybe do this in Excel, but I will reuse this frequently in future analyses.
Edit: added a variable (want) that contains exactly the desired values.
This will work: we just need to group by id, then take advantage of the first() function to take the difference versus the first value of x for each group.
library(tidyverse)
df1 %>% group_by(id) %>% mutate(new = x - first(x))
# A tibble: 12 x 5
# Groups:   id [2]
   visit id        x  want   new
   <fct> <chr> <int> <dbl> <int>
 1 scr   101      30     0     0
 2 1mo   101      14   -16   -16
 3 3mo   101      18   -12   -12
 4 6mo0  101      13   -17   -17
 5 12mo  101       2   -28   -28
 6 2yr   101       9   -21   -21
 7 scr   102      17     0     0
 8 1mo   102      21     4     4
 9 3mo   102      10    -7    -7
10 6mo0  102       4   -13   -13
11 12mo  102      19     2     2
12 2yr   102      13    -4    -4
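
Note that first(x) relies on the screening row being the first row within each id. If the rows might not be sorted that way, one possible variant is to look the baseline up by the visit label instead. The following is a minimal base R sketch of that idea, not part of the original answer. It assumes exactly one "scr" row per id, hard-codes the x values from the printed output so it does not depend on the random number generator, and uses the illustrative names scr, baseline and change.

```r
# Hard-coded copy of the example data (x values copied from the printed output),
# so the result does not depend on how sample() behaves in a given R version.
df1 <- data.frame(
  visit = rep(c("scr", "1mo", "3mo", "6mo0", "12mo", "2yr"), 2),
  id    = rep(c("101", "102"), each = 6),
  x     = c(30, 14, 18, 13, 2, 9, 17, 21, 10, 4, 19, 13),
  want  = c(0, -16, -12, -17, -28, -21, 0, 4, -7, -13, 2, -4),
  stringsAsFactors = FALSE
)

# One screening row per id (assumption): keep just id and its screening x
scr <- df1[df1$visit == "scr", c("id", "x")]

# For every row, find the screening x of the matching id
baseline <- scr$x[match(df1$id, scr$id)]

# Change from screening; this does not depend on row order within an id
df1$change <- df1$x - baseline

# Sanity check against the values the question asked for
stopifnot(all(df1$change == df1$want))
```

Within the dplyr pipeline, the same idea would be mutate(new = x - x[visit == "scr"]) after group_by(id), again assuming a single screening row per id.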