How to use the lag function correctly in R dplyr?
I get the incorrect output below for the last cell in column reSeq when running the R/dplyr code immediately beneath it. The code produces a value of 8 in that last cell of column reSeq, when via the lag() function it should instead produce a 7. What is wrong with my use of the lag() function? Also see the image at the bottom, which better explains what I am trying to do.
   Element Group eleCnt reSeq
   <chr>   <dbl>  <int> <int>
 1 R           0      1     1
 2 R           0      2     2
 3 X           0      1     1
 4 X           1      2     2
 5 X           1      3     2
 6 X           0      4     4
 7 X           0      5     5
 8 X           0      6     6
 9 B           0      1     1
10 R           0      3     3
11 R           2      4     4
12 R           2      5     4
13 X           3      7     7
14 X           3      8     7
15 X           3      9     8
library(dplyr)

myDF <- data.frame(
  Element = c("R","R","X","X","X","X","X","X","B","R","R","R","X","X","X"),
  Group = c(0,0,0,1,1,0,0,0,0,0,2,2,3,3,3)
)

myDF %>%
  group_by(Element) %>%
  mutate(eleCnt = row_number()) %>%
  ungroup() %>%
  mutate(reSeq = eleCnt) %>%
  mutate(reSeq = ifelse(
    Element == lag(Element) & Group == lag(Group) & Group > 0,
    lag(reSeq),
    eleCnt)
  )
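A likely explanation, offered as a sketch rather than a definitive diagnosis: lag() is vectorised, so lag(reSeq) is computed once from the reSeq column as it stood before the ifelse() ran (at that point a plain copy of eleCnt). It does not see values as they are filled in row by row, the way an Excel formula referring to the cell above does. The snippet below reruns the same pipeline but stores the lagged value in a helper column, prevSeq (a hypothetical name introduced only for this illustration), so that what lag() returned for each row is visible.

```r
library(dplyr)

myDF <- data.frame(
  Element = c("R","R","X","X","X","X","X","X","B","R","R","R","X","X","X"),
  Group = c(0,0,0,1,1,0,0,0,0,0,2,2,3,3,3)
)

out <- myDF %>%
  group_by(Element) %>%
  mutate(eleCnt = row_number()) %>%
  ungroup() %>%
  mutate(
    reSeq   = eleCnt,
    # prevSeq is a hypothetical helper column: it exposes what lag(reSeq) returns,
    # which is the previous row's eleCnt, not the previous row's corrected reSeq
    prevSeq = lag(reSeq),
    reSeq   = ifelse(
      Element == lag(Element) & Group == lag(Group) & Group > 0,
      prevSeq,
      eleCnt
    )
  )

# Inspect the last three rows: in row 15, prevSeq holds row 14's eleCnt (8)
tail(out, 3)
```

In row 14 the lagged value happens to be the wanted 7, but in row 15 it is row 14's original count of 8, which is why a run of three or more rows in the same group goes wrong.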
The above is an attempted translation from Excel, as shown in the image below. I am new to R, migrating over from Excel. I am trying to replicate column D, "Target", highlighted in yellow with the formula to the right. The image shows the correct output, including the desired 7 in cell D17, which I can't replicate with the above R code.
Breaking the derivation of "Target" down into 2 columns, Step1 and Step2, highlighted in yellow and blue in the image below (Step2 is the same as Target in the image above). Working in 2 steps is how I got the R code working, as shown in one of the solutions:
The code below works. I broke the Excel "Target" calculation into 2 steps in the second image in the OP in order to reflect the step-wise R solution.
library(dplyr)
library(tidyr)

myDF <- data.frame(
  Element = c("R","R","X","X","X","X","X","X","B","R","R","R","X","X","X"),
  Group = c(0,0,0,1,1,0,0,0,0,0,2,2,3,3,3)
)

myDF %>%
  group_by(Element) %>%
  mutate(eleCnt = row_number()) %>%
  ungroup() %>%
  mutate(reSeq = ifelse(Group == 0 | Group != lag(Group), eleCnt, 0)) %>%
  mutate(reSeq = na_if(reSeq, 0)) %>%
  group_by(Element) %>%
  fill(reSeq) %>%
  ungroup()
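For comparison, here is a minimal base R sketch that mirrors the Excel logic more literally with an explicit loop, so each row can read the already updated value of the row above it. It assumes the rule implied by the ifelse() condition in the question (same Element and same non-zero Group as the previous row), since the actual Excel formula is only available in the image. It is an illustration, not a replacement for the dplyr/tidyr approach.

```r
myDF <- data.frame(
  Element = c("R","R","X","X","X","X","X","X","B","R","R","R","X","X","X"),
  Group = c(0,0,0,1,1,0,0,0,0,0,2,2,3,3,3)
)

# Running count of each Element (base R counterpart of group_by + row_number)
eleCnt <- ave(seq_along(myDF$Element), myDF$Element, FUN = seq_along)

# Start from eleCnt, then walk down the rows one at a time
reSeq <- eleCnt
for (i in seq_along(reSeq)[-1]) {
  same_run <- myDF$Element[i] == myDF$Element[i - 1] &&
    myDF$Group[i] == myDF$Group[i - 1] &&
    myDF$Group[i] > 0
  # reSeq[i - 1] has already been updated, unlike a vectorised lag(reSeq)
  if (same_run) reSeq[i] <- reSeq[i - 1]
}

reSeq
```

On this data the loop and the fill() pipeline agree. They are not guaranteed to agree in general, because the loop compares each row with the row directly above it, while fill() carries values forward within each Element group.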