Create new variable that is 0 until the first non-NA value of another variable, then 1 thereafter (within a group)
I have the following df:
df <- tibble(country = c("US", "US", "US", "US", "US", "US", "US", "US", "US", "Mex", "Mex"),
             year = c(1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2000, 2001),
             score = c(NA, NA, NA, NA, 426, NA, NA, 430, NA, 450, NA))
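What I want to do is create a new variable, before_after, that is 0 until the first year in which a country has a non-NA value for score, and 1 thereafter.
In other words, hard-coding it, I would like to get the following df: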
df <- tibble(country = c("US", "US", "US", "US", "US", "US", "US", "US", "US", "Mex", "Mex"),
             year = c(1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2000, 2001),
             score = c(NA, NA, NA, NA, 426, NA, NA, 430, NA, 450, NA),
             before_after = c(0,0,0,0,1,1,1,1,1,1,1))
I tried the following code, but to no avail:
df %>%
  arrange(year) %>%
  group_by(country) %>%
  mutate(before_after = ifelse(which.max(!is.na(score)),1,0)) %>%
  arrange(country, year)
A tidyverse solution would be much appreciated, but really any help is welcome.
Thanks in advance!
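You can use cumsum: the running count of non-NA scores within each country is 0 before the first observed score and positive from that row on.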
df %>%
  arrange(country, year) %>%
  group_by(country) %>%
  mutate(before_after = ifelse(cumsum(!is.na(score)) > 0, 1, 0))
   country  year score before_after
   <chr>   <dbl> <dbl>        <dbl>
 1 Mex      2000   450            1
 2 Mex      2001    NA            1
 3 US       1999    NA            0
 4 US       2000    NA            0
 5 US       2001    NA            0
 6 US       2002    NA            0
 7 US       2003   426            1
 8 US       2004    NA            1
 9 US       2005    NA            1
10 US       2006   430            1
11 US       2007    NA            1
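To see why the cumsum approach works and why the which.max attempt in the question does not, it can help to step through a single group by hand. The following is a minimal base-R sketch using only the US scores from the question; the variable names seen, running, and attempt are illustrative and not part of the original answer.

```r
# US scores from the question, already ordered by year
score <- c(NA, NA, NA, NA, 426, NA, NA, 430, NA)

# TRUE wherever a score was actually observed
seen <- !is.na(score)

# Running count of observed scores: stays 0 until the first non-NA value
running <- cumsum(seen)

# 0 before the first observed score, 1 from that row onward
before_after <- as.integer(running > 0)

# The attempt from the question: which.max() returns one index (a single
# number), so ifelse() yields a single value instead of one value per row
attempt <- ifelse(which.max(seen), 1, 0)

print(running)
print(before_after)
print(attempt)
```

Because attempt has length one, mutate recycles it across every row of the group, which is why that version cannot produce a mix of 0s and 1s.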
Use group_by in combination with fill:
library(tidyverse)
# create dataframe
df <- tibble(country = c("US", "US", "US", "US", "US", "US", "US", "US", "US", "Mex", "Mex"),
             year = c(1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2000, 2001),
             score = c(NA, NA, NA, NA, 426, NA, NA, 430, NA, 450, NA))
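# sort by country and year (fill depends on row order), then flag rows
# that have a non-NA score with case_when; all other rows stay NA
(df <- df %>%
  arrange(country, year) %>%
  mutate(before_after = case_when(!is.na(score) ~ 1)))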
# A tibble: 11 x 4
  country  year score before_after
  <chr>   <dbl> <dbl>        <dbl>
1 Mex      2000   450            1
2 Mex      2001    NA           NA
3 US       1999    NA           NA
4 US       2000    NA           NA
5 US       2001    NA           NA
# run fill
df %>%
  group_by(country) %>%
  fill(before_after)
# A tibble: 11 x 4
# Groups:   country [2]
  country  year score before_after
  <chr>   <dbl> <dbl>        <dbl>
1 Mex      2000   450            1
2 Mex      2001    NA            1
3 US       1999    NA           NA
4 US       2000    NA           NA
5 US       2001    NA           NA
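Note that fill only carries the 1 forward, so rows before a country's first score remain NA rather than the 0 the question asks for. One possible way to finish the job, sketched here as an assumption rather than as part of the original answer, is to replace the remaining NAs with 0 using tidyr's replace_na; the name result is illustrative.

```r
library(dplyr)
library(tidyr)

df <- tibble(country = c("US", "US", "US", "US", "US", "US", "US", "US", "US", "Mex", "Mex"),
             year = c(1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2000, 2001),
             score = c(NA, NA, NA, NA, 426, NA, NA, 430, NA, 450, NA))

result <- df %>%
  arrange(country, year) %>%                 # fill depends on row order
  group_by(country) %>%
  # 1 where a score exists, NA everywhere else
  mutate(before_after = if_else(is.na(score), NA_real_, 1)) %>%
  # carry the 1 downward within each country
  fill(before_after, .direction = "down") %>%
  ungroup() %>%
  # rows before the first score are still NA: turn them into 0
  mutate(before_after = replace_na(before_after, 0))

print(result)
```

With the NAs replaced, this variant returns the same before_after column as the cumsum approach on this data.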