Create a column that assigns value to a row in a dataframe based on an event in another row

I have a dataframe structured as follows:

example <- data.frame(id = c(1,1,1,1,1,1,1,2,2,2,2,2),
                      event = c("email","email","email","draw","email","email","draw","email","email","email","email","draw"),
                      date = c("2020-03-01","2020-06-01","2020-07-15","2020-07-28","2020-08-07","2020-09-01","2020-09-15","2020-05-22","2020-06-15","2020-07-13","2020-07-15","2020-07-31"),
                      amount = c(NA,NA,NA,10000,NA,NA,1500,NA,NA,NA,NA,2200))

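This is a simplified version of the dataframe. I am trying to create a column that assigns a 1 to the last email before each draw event, and a second column that puts the amount drawn on the same row as that email. The desired dataframe looks like this: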

desiredResult <- data.frame(id = c(1,1,1,1,1,1,1,2,2,2,2,2),
                      event = c("email","email","email","draw","email","email","draw","email","email","email","email","draw"),
                      date = c("2020-03-01","2020-06-01","2020-07-15","2020-07-28","2020-08-07","2020-09-01","2020-09-15","2020-05-22","2020-06-15","2020-07-13","2020-07-15","2020-07-31"),
                      amount = c(NA,NA,NA,10000,NA,NA,1500,NA,NA,NA,NA,2200),
                      EmailBeforeDrawFlag = c(NA,NA,1,NA,NA,1,NA,NA,NA,NA,1,NA),
                      EmailBeforeDrawAmount = c(NA,NA,10000,NA,NA,1500,NA,NA,NA,NA,2200,NA))

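Here is a dplyr solution. When creating the new columns, use if_else() in the definition of EmailBeforeDrawFlag to test a condition, with lead() looking one row ahead at event. EmailBeforeDrawAmount is simply lead(amount).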

library(dplyr)

example %>%
  # lead() returns the value from the next row (NA on the last row)
  mutate(EmailBeforeDrawFlag = if_else(lead(event) == "draw", 1, NA_real_),
         EmailBeforeDrawAmount = lead(amount))
   id event       date amount EmailBeforeDrawFlag EmailBeforeDrawAmount
1   1 email 2020-03-01     NA                  NA                    NA
2   1 email 2020-06-01     NA                  NA                    NA
3   1 email 2020-07-15     NA                   1                 10000
4   1  draw 2020-07-28  10000                  NA                    NA
5   1 email 2020-08-07     NA                  NA                    NA
6   1 email 2020-09-01     NA                   1                  1500
7   1  draw 2020-09-15   1500                  NA                    NA
8   2 email 2020-05-22     NA                  NA                    NA
9   2 email 2020-06-15     NA                  NA                    NA
10  2 email 2020-07-13     NA                  NA                    NA
11  2 email 2020-07-15     NA                   1                  2200
12  2  draw 2020-07-31   2200                  NA                    NA
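
The pipeline above looks at the next row regardless of id, which is fine for the sample data because every id ends with a draw. The following is a minimal sketch of a grouped variant, not part of the original answer. It assumes the rows are already sorted by date within each id and that only email rows should ever be flagged; the name `grouped` is made up for this illustration.

```r
library(dplyr)

example <- data.frame(
  id = c(1,1,1,1,1,1,1,2,2,2,2,2),
  event = c("email","email","email","draw","email","email","draw",
            "email","email","email","email","draw"),
  date = c("2020-03-01","2020-06-01","2020-07-15","2020-07-28","2020-08-07",
           "2020-09-01","2020-09-15","2020-05-22","2020-06-15","2020-07-13",
           "2020-07-15","2020-07-31"),
  amount = c(NA,NA,NA,10000,NA,NA,1500,NA,NA,NA,NA,2200)
)

grouped <- example %>%
  group_by(id) %>%  # lead() now stops at the last row of each id
  mutate(
    # flag only email rows whose next row (within the same id) is a draw
    EmailBeforeDrawFlag = if_else(event == "email" & lead(event) == "draw",
                                  1, NA_real_),
    # copy the draw amount onto flagged rows only (assumption: other rows stay NA)
    EmailBeforeDrawAmount = if_else(!is.na(EmailBeforeDrawFlag),
                                    lead(amount), NA_real_)
  ) %>%
  ungroup()
```

On the sample data this should give the same two columns as the ungrouped version; the grouping only matters when an id ends with an email and the next id starts with a draw.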

We can also create the flag column with NA^ applied to lead():

library(dplyr)
example %>%
      mutate(EmailBeforeDrawFlag = NA^(lead(event != 'draw')), 
             EmailBeforeDrawAmount = lead(amount))

-output

#    id event       date amount EmailBeforeDrawFlag EmailBeforeDrawAmount
#1   1 email 2020-03-01     NA                  NA                    NA
#2   1 email 2020-06-01     NA                  NA                    NA
#3   1 email 2020-07-15     NA                   1                 10000
#4   1  draw 2020-07-28  10000                  NA                    NA
#5   1 email 2020-08-07     NA                  NA                    NA
#6   1 email 2020-09-01     NA                   1                  1500
#7   1  draw 2020-09-15   1500                  NA                    NA
#8   2 email 2020-05-22     NA                  NA                    NA
#9   2 email 2020-06-15     NA                  NA                    NA
#10  2 email 2020-07-13     NA                  NA                    NA
#11  2 email 2020-07-15     NA                   1                  2200
#12  2  draw 2020-07-31   2200                  NA                    NA
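
The `NA^(lead(event != 'draw'))` expression relies on R's arithmetic rule that `NA^0` is `1` while `NA^1` stays `NA`; a logical exponent is coerced to 0 or 1 first. A small base-R illustration of that rule follows (the vector `next_is_not_draw` is a made-up name for this sketch, standing in for `lead(event != 'draw')`):

```r
# FALSE is treated as 0 and TRUE as 1 when used as an exponent
next_is_not_draw <- c(TRUE, FALSE, NA)

# NA^1 is NA, NA^0 is 1, NA^NA is NA
flag <- NA^next_is_not_draw
print(flag)
```

So a row whose next event is a draw (exponent FALSE) gets 1, and every other row, including the last one where lead() returns NA, gets NA.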
