
R: replace the last n values with NA by group

I want to replace value(s) with NA by group.

have <- data.frame(id = c(1,1,1,1,2,2,2),
                   value = c(1,2,3,4,5,6,7))

want1 <- data.frame(id = c(1,1,1,1,2,2,2),
                    value = c(1,2,3,NA,5,6,NA))

want2 <- data.frame(id = c(1,1,1,1,2,2,2),
                    value = c(1,2,NA,NA,5,NA,NA))

want1 corresponds to replacing the last observation of value in each group with NA, and want2 corresponds to replacing the last two observations of value in each group with NA. I'm currently trying to do this with the dplyr package but can't seem to get any traction. Any help would be much appreciated. Thanks!

We can use row_number() to test the current row against n(), the total number of rows in the group.

library(dplyr)

have |>
  group_by(id) |>
  mutate(
    last1 = ifelse(row_number() == n(), NA, value),
    last2 = ifelse(row_number() >= n() - 1, NA, value)
  )
# # A tibble: 7 × 4
# # Groups:   id [2]
#      id value last1 last2
#   <dbl> <dbl> <dbl> <dbl>
# 1     1     1     1     1
# 2     1     2     2     2
# 3     1     3     3    NA
# 4     1     4    NA    NA
# 5     2     5     5     5
# 6     2     6     6    NA
# 7     2     7    NA    NA
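As a quick check, the same row_number() / n() comparison can be tested against want1 and want2 from the question. This is only a sketch: it swaps ifelse() for base R's replace(), which keeps the column's type, and the name res is made up for illustration.

```r
library(dplyr)

have  <- data.frame(id = c(1,1,1,1,2,2,2), value = c(1,2,3,4,5,6,7))
want1 <- data.frame(id = c(1,1,1,1,2,2,2), value = c(1,2,3,NA,5,6,NA))
want2 <- data.frame(id = c(1,1,1,1,2,2,2), value = c(1,2,NA,NA,5,NA,NA))

# `res` is an illustrative name, not part of the original answer
res <- have |>
  group_by(id) |>
  mutate(
    # rows whose position is past n() - k are the last k rows of the group
    last1 = replace(value, row_number() > n() - 1, NA),
    last2 = replace(value, row_number() > n() - 2, NA)
  ) |>
  ungroup()

# compare with the desired outputs from the question
identical(res$last1, want1$value)
identical(res$last2, want2$value)
```

The ungroup() at the end simply drops the grouping so later steps are not accidentally applied per group.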

And here is a general way to produce the variants as separate data frames, one per value of k.

library(dplyr)

lapply(
  1:2,
  function(k) {
    have %>% 
      group_by(id) %>% 
      mutate(value=ifelse(row_number() <= (n() - k), value, NA))
  }
)
[[1]]
# A tibble: 7 × 2
# Groups:   id [2]
     id value
  <dbl> <dbl>
1     1     1
2     1     2
3     1     3
4     1    NA
5     2     5
6     2     6
7     2    NA

[[2]]
# A tibble: 7 × 2
# Groups:   id [2]
     id value
  <dbl> <dbl>
1     1     1
2     1     2
3     1    NA
4     1    NA
5     2     5
6     2    NA
7     2    NA

Here is a base R way.

have <- data.frame(id = c(1,1,1,1,2,2,2),
                   value = c(1,2,3,4,5,6,7))

want1 <- data.frame(id = c(1,1,1,1,2,2,2),
                    value = c(1,2,3,NA,5,6,NA))

want2 <- data.frame(id = c(1,1,1,1,2,2,2),
                    value = c(1,2,NA,NA,5,NA,NA))

with(have, ave(value, id, FUN = \(x){
  x[length(x)] <- NA
  x
}))
#> [1]  1  2  3 NA  5  6 NA
with(have, ave(value, id, FUN = \(x){
  x[length(x)] <- NA
  if(length(x) > 1)
    x[length(x) - 1L] <- NA
  x
}))
#> [1]  1  2 NA NA  5 NA NA

Created on 2022-06-09 by the reprex package (v2.0.1)

Then reassign these results to the column value.
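The base R approach could be generalised to the last k values and written back in one step, roughly as sketched below. The helper na_last() and the copy out are hypothetical names introduced for illustration; they are not part of the original answer.

```r
# Hypothetical helper: set the last k elements of x to NA.
# If k >= length(x), every element becomes NA; if k is 0, x is unchanged.
na_last <- function(x, k = 1L) {
  n <- length(x)
  x[seq_len(n) > n - k] <- NA
  x
}

have <- data.frame(id = c(1,1,1,1,2,2,2),
                   value = c(1,2,3,4,5,6,7))

# work on a copy so the original data frame stays intact
out <- have

# ave() applies FUN within each id group and returns the results
# in the original row order, so they can be assigned straight back
out$value <- ave(have$value, have$id, FUN = function(x) na_last(x, k = 2L))
out
```

With k = 1L this would give the want1 variant; k = 2L, as shown, corresponds to want2.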
