
Fill a column with one of four date columns based on another column in R

I have a data frame with six columns (A, B, and four date columns) like so:

A  B  Date1 Date2 Date3 Date4
1       x     NA    NA    NA
2      NA     y     NA    NA
3      NA    NA     z     NA  
4      NA    NA    NA     f

I want to use the dplyr package and the case_when() function to state something like this:

df <- df %>%
    mutate(B = case_when(
     A == 1 ~ B == Date1,
     A == 2 ~ B == Date2,
     A == 3 ~ B == Date3,
     A == 4 ~ B == Date4))

Essentially, based on the value of A, I would like to fill B with one of the four date columns.

A is of class character; B and the Date columns are all of class Date.

The problem is that when I apply this to the data frame it simply doesn't work: it returns NAs and changes the class of B to logical. I am using R version 4.1.2. Any help is appreciated.

You can use coalesce() to find the first non-missing element.

library(dplyr)

df %>%
  mutate(B = coalesce(!!!df[-1]))

#   A Date1 Date2 Date3 Date4 B
# 1 1     x  <NA>  <NA>  <NA> x
# 2 2  <NA>     y  <NA>  <NA> y
# 3 3  <NA>  <NA>     z  <NA> z
# 4 4  <NA>  <NA>  <NA>     f f

The above code is just a shortcut for

df %>%
  mutate(B = coalesce(Date1, Date2, Date3, Date4))
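
To check that this keeps the Date class, here is a minimal sketch on made-up data (the dates are hypothetical and only mirror the layout in the question). Note that coalesce() never looks at A: it takes the first non-missing date in each row, so it matches the A-based rule only when exactly one date column is filled per row.

```r
library(dplyr)

# Hypothetical sample data mirroring the question; the dates are invented
df <- data.frame(
  A     = c("1", "2", "3", "4"),
  Date1 = as.Date(c("2021-01-01", NA, NA, NA)),
  Date2 = as.Date(c(NA, "2021-02-01", NA, NA)),
  Date3 = as.Date(c(NA, NA, "2021-03-01", NA)),
  Date4 = as.Date(c(NA, NA, NA, "2021-04-01"))
)

# First non-missing date per row, regardless of the value of A
res <- df %>%
  mutate(B = coalesce(Date1, Date2, Date3, Date4))

class(res$B)
res$B
```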

If B needs to be filled based on the value of A, here is an idea with c_across(). It indexes by position, so A must be numeric; if A is character, as in the question, wrap it in as.integer(A):

df %>%
  rowwise() %>%
  mutate(B = c_across(starts_with("Date"))[A]) %>%
  ungroup()

# # A tibble: 4 × 6
#       A Date1 Date2 Date3 Date4 B    
#   <int> <chr> <chr> <chr> <chr> <chr>
# 1     1 x     NA    NA    NA    x    
# 2     2 NA    y     NA    NA    y    
# 3     3 NA    NA    z     NA    z    
# 4     4 NA    NA    NA    f     f 
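
Since the question states that A is character and the date columns are class Date, the same idea can be sketched on made-up data with an explicit conversion. The sample dates are hypothetical, and the as.integer() call is an addition that assumes A always holds "1" to "4".

```r
library(dplyr)

# Hypothetical sample data: A is character, the date columns are class Date
df <- data.frame(
  A     = c("1", "2", "3", "4"),
  Date1 = as.Date(c("2021-01-01", NA, NA, NA)),
  Date2 = as.Date(c(NA, "2021-02-01", NA, NA)),
  Date3 = as.Date(c(NA, NA, "2021-03-01", NA)),
  Date4 = as.Date(c(NA, NA, NA, "2021-04-01"))
)

res <- df %>%
  rowwise() %>%
  # Convert A to an integer position before indexing the row's dates
  mutate(B = c_across(starts_with("Date"))[as.integer(A)]) %>%
  ungroup()

class(res$B)
res$B
```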

It seems you want the diagonal values from the Date columns, in which case you can use diag():

df$B <- diag(as.matrix(df[grepl("Date", colnames(df))]))
#[1] "x" "y" "z" "f"
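
Two caveats apply with real Date columns: as.matrix() turns them into character strings, and diag() only works when row i should take column i. A hedged base R sketch that handles both, using matrix indexing driven by A, might look like this (the sample data and the names date_cols, m and idx are made up for illustration):

```r
# Hypothetical sample data where the rows are NOT in diagonal order
df <- data.frame(
  A     = c("2", "1", "4", "3"),
  Date1 = as.Date(c(NA, "2021-01-01", NA, NA)),
  Date2 = as.Date(c("2021-02-01", NA, NA, NA)),
  Date3 = as.Date(c(NA, NA, NA, "2021-03-01")),
  Date4 = as.Date(c(NA, NA, "2021-04-01", NA))
)

date_cols <- grep("^Date", names(df), value = TRUE)

# as.matrix() on Date columns yields a character matrix
m <- as.matrix(df[date_cols])

# One (row, column) pair per row; the column comes from A
idx <- cbind(seq_len(nrow(df)), as.integer(df$A))

# Convert the selected strings back to class Date
df$B <- as.Date(m[idx])

class(df$B)
df$B
```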

Other answers (if you want to coalesce):

  • With max:
df$B <- apply(df[2:5], 1, \(x) max(x, na.rm = TRUE))
  • With c_across:
df %>% 
  rowwise() %>% 
  mutate(B = max(c_across(Date1:Date4), na.rm = TRUE))

Output:

  A Date1 Date2 Date3 Date4 B
1 1     x  <NA>  <NA>  <NA> x
2 2  <NA>     y  <NA>  <NA> y
3 3  <NA>  <NA>     z  <NA> z
4 4  <NA>  <NA>  <NA>     f f

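The other answers are superior, but if you must keep your current approach for the actual application, here is the corrected version. The right-hand side of each formula should be the value to assign (Date1), not a comparison (B == Date1), which evaluates to a logical: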

df %>%
  mutate(B = case_when(
    A == 1 ~ Date1,
    A == 2 ~ Date2,
    A == 3 ~ Date3,
    A == 4 ~ Date4))

Output:

# A B Date1 Date2 Date3 Date4
# 1 x     x  <NA>  <NA>  <NA>
# 2 y  <NA>     y  <NA>  <NA>
# 3 z  <NA>  <NA>     z  <NA>
# 4 f  <NA>  <NA>  <NA>     f
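
For a reproducible check with the classes described in the question (A as character, B and the date columns as Date), a small sketch on made-up data follows. The dates are hypothetical. It also shows why the original attempt produced a logical column.

```r
library(dplyr)

# Hypothetical sample data: A is character, B and Date1..Date4 are class Date
df <- data.frame(
  A     = c("1", "2", "3", "4"),
  B     = as.Date(NA),
  Date1 = as.Date(c("2021-01-01", NA, NA, NA)),
  Date2 = as.Date(c(NA, "2021-02-01", NA, NA)),
  Date3 = as.Date(c(NA, NA, "2021-03-01", NA)),
  Date4 = as.Date(c(NA, NA, NA, "2021-04-01"))
)

# The original right-hand side is a comparison, so it is logical (and NA here)
class(df$B == df$Date1)  # "logical"

# The right-hand side of each formula is the value to assign
fixed <- df %>%
  mutate(B = case_when(
    A == "1" ~ Date1,
    A == "2" ~ Date2,
    A == "3" ~ Date3,
    A == "4" ~ Date4
  ))

class(fixed$B)  # "Date"
fixed$B
```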
