How to use the lag function correctly in R dplyr?

I get the incorrect output shown below, in the last cell of column reSeq, when running the R/dplyr code immediately beneath it. The code produces a value of 8 in that last cell, when via the lag() function it should instead produce a 7. What is wrong with my use of the lag() function? See also the image at the bottom, which better explains what I am trying to do.

   Element Group eleCnt reSeq
   <chr>   <dbl>  <int> <int>
 1 R           0      1     1
 2 R           0      2     2
 3 X           0      1     1
 4 X           1      2     2
 5 X           1      3     2
 6 X           0      4     4
 7 X           0      5     5
 8 X           0      6     6
 9 B           0      1     1
10 R           0      3     3
11 R           2      4     4
12 R           2      5     4
13 X           3      7     7
14 X           3      8     7
15 X           3      9     8

library(dplyr)

myDF <- data.frame(
  Element = c("R","R","X","X","X","X","X","X","B","R","R","R","X","X","X"),
  Group = c(0,0,0,1,1,0,0,0,0,0,2,2,3,3,3)
)

myDF %>% 
  group_by(Element) %>%
    mutate(eleCnt = row_number()) %>%
  ungroup()%>%
  mutate(reSeq = eleCnt) %>%
  mutate(reSeq = ifelse(
    Element == lag(Element)& Group == lag(Group) & Group > 0, 
    lag(reSeq),
    eleCnt)
  )

The above is an attempted translation from Excel, as shown in the image below. I am new to R, migrating over from Excel. I am trying to replicate column D, "Target", highlighted in yellow, with its formula shown to the right. The image shows the correct output, including the desired 7 in cell D17, which I can't replicate with the above R code.

[enter image description here]

The derivation of "Target" can be broken down into two columns, Step1 and Step2, highlighted in yellow and blue in the image below (Step2 is the same as Target in the image above). These two steps are how I got the R code working, as shown in one of the solutions:

[enter image description here]

The code below works. I broke the Excel "Target" calculation into two steps in the second image in the OP in order to reflect the step-wise R solution.

library(dplyr)
library(tidyr)

myDF <- data.frame(
  Element = c("R","R","X","X","X","X","X","X","B","R","R","R","X","X","X"),
  Group = c(0,0,0,1,1,0,0,0,0,0,2,2,3,3,3)
)

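myDF %>%
  group_by(Element) %>%
  mutate(eleCnt = row_number()) %>%   # running count within each Element
  ungroup() %>%
  # Step 1: keep eleCnt where Group is 0 or a new Group starts; mark the rest with 0
  mutate(reSeq = ifelse(Group == 0 | Group != lag(Group), eleCnt, 0)) %>%
  # Step 2: turn the 0 markers into NA, then carry the last value down within each Element
  mutate(reSeq = na_if(reSeq, 0)) %>%
  group_by(Element) %>%
  fill(reSeq) %>%
  ungroup()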
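
As for why the original attempt returns 8: lag(reSeq) is evaluated once, on the whole column as it stood before that mutate() step (at that point a copy of eleCnt), so it never sees values assigned earlier in the same step. An Excel formula that points at the cell above, by contrast, reads the already-computed result. The sketch below shows this on the last three rows and then, as one possible alternative that is not part of the answer above, mirrors the row-by-row logic with a plain loop. The names seqDF and sameRun are made up for illustration, and the loop condition is taken from the R code in the question, since the Excel formula itself is only visible in the image.

```r
library(dplyr)

# lag() shifts a vector once; it does not feed earlier results back in.
# These are the eleCnt values of the last three rows (Element "X", Group 3).
dplyr::lag(c(7L, 8L, 9L))
# NA 7 8  -> the third row is handed 8 (the old value), not the 7 wanted

myDF <- data.frame(
  Element = c("R","R","X","X","X","X","X","X","B","R","R","R","X","X","X"),
  Group = c(0,0,0,1,1,0,0,0,0,0,2,2,3,3,3)
)

# seqDF is a hypothetical name used only in this sketch
seqDF <- myDF %>%
  group_by(Element) %>%
  mutate(eleCnt = row_number()) %>%
  ungroup()

# Row-by-row pass: each row may reuse the value already written to the row above,
# which is what the Excel formula does when it refers to the previous cell
reSeq <- seqDF$eleCnt
for (i in seq_along(reSeq)[-1]) {
  sameRun <- seqDF$Element[i] == seqDF$Element[i - 1] &&
    seqDF$Group[i] == seqDF$Group[i - 1] &&
    seqDF$Group[i] > 0
  if (sameRun) reSeq[i] <- reSeq[i - 1]
}
seqDF$reSeq <- reSeq
seqDF
```

On this data the loop should give 7 in the last cell of reSeq, matching the fill()-based solution; for large data frames the vectorized version above is likely the better choice.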
