Building a new column in R tidyverse based on values of other columns

I am using R tidyverse and I have a tibble like the one built in the code below. I am trying to create output_column based on the values of the other columns. Its value is the last non-empty value in the row, with "_NA" appended if any column before output_column is NA.

library(tidyverse)
test_df <- 
tibble(kingdom = rep("bacteria",6),
       phylum = c(NA, "sterp", rep("entro", 4)),
       class = c(rep(NA, 2), rep("abc",4)),
       order= c(rep(NA,3), rep("cde", 3)),
       family= c(rep(NA,4), rep("xyz", 2)),
       genus= c(rep(NA,5), "sam"),
       output_column = c("bacteria_NA", "sterp_NA", "abc_NA", "cde_NA", "xyz_NA", "sam" ))


You can use rowwise() and c_across(), as follows:

test_df %>%
  rowwise() %>%
  # kingdom:genus keeps an existing output_column out of c_across()
  mutate(k = if_else(!is.na(genus), genus,
                     paste0(last(c_across(kingdom:genus)[!is.na(c_across(kingdom:genus))]), "_NA")))

Output:

  kingdom  phylum class order family genus k          
  <chr>    <chr>  <chr> <chr> <chr>  <chr> <chr>      
1 bacteria NA     NA    NA    NA     NA    bacteria_NA
2 bacteria sterp  NA    NA    NA     NA    sterp_NA   
3 bacteria entro  abc   NA    NA     NA    abc_NA     
4 bacteria entro  abc   cde   NA     NA    cde_NA     
5 bacteria entro  abc   cde   xyz    NA    xyz_NA     
6 bacteria entro  abc   cde   xyz    sam   sam    

Another approach is to use the last() function with apply():

test_df$output_column <- apply(
  # Only the rank columns, so an existing output_column is ignored
  select(test_df, kingdom:genus), 1, \(x) {
    if_else(is.na(last(x)), paste0(last(x[!is.na(x)]), "_NA"), last(x))
  }
)
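
The row-level rule behind the apply() call can also be written with base R only, which makes it easy to check against the expected column from the question. This is a minimal sketch, not the answerer's code: the names tax, rank_cols, label_row and expected are made up for illustration, and the data is rebuilt as a plain data frame so the block runs without any packages.

```r
# Example data from the question, rebuilt as a base R data frame.
tax <- data.frame(
  kingdom = rep("bacteria", 6),
  phylum  = c(NA, "sterp", rep("entro", 4)),
  class   = c(rep(NA, 2), rep("abc", 4)),
  order   = c(rep(NA, 3), rep("cde", 3)),
  family  = c(rep(NA, 4), rep("xyz", 2)),
  genus   = c(rep(NA, 5), "sam")
)
expected <- c("bacteria_NA", "sterp_NA", "abc_NA", "cde_NA", "xyz_NA", "sam")

# Hypothetical helper: label one row of rank values.
label_row <- function(x) {
  filled  <- x[!is.na(x)]                     # drop missing ranks
  deepest <- unname(filled[length(filled)])   # last non-missing value
  # Append "_NA" only when the final rank is missing.
  if (is.na(x[length(x)])) paste0(deepest, "_NA") else deepest
}

# Apply the helper row by row to the rank columns only.
rank_cols <- c("kingdom", "phylum", "class", "order", "family", "genus")
result <- unname(apply(tax[rank_cols], 1, label_row))

print(identical(result, expected))
```

Because apply() converts its input to a character matrix, this works cleanly here only because every rank column is already character.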

We may use coalesce() here:

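library(dplyr)
library(purrr)
library(stringr)
test_df %>%
  mutate(
    # First non-NA value, scanning from the last column back to the first.
    # invoke() is deprecated in recent purrr versions; do.call(coalesce, ...) is a base R alternative.
    output_column = invoke(coalesce, across(last_col():1)),
    # Append "_NA" to rows that contain an NA
    output_column = case_when(
      if_any((last_col() - 1):1, is.na) ~ str_c(output_column, "_NA"),
      TRUE ~ output_column
    )
  )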

-output

# A tibble: 6 × 7
  kingdom  phylum class order family genus output_column
  <chr>    <chr>  <chr> <chr> <chr>  <chr> <chr>        
1 bacteria <NA>   <NA>  <NA>  <NA>   <NA>  bacteria_NA  
2 bacteria sterp  <NA>  <NA>  <NA>   <NA>  sterp_NA     
3 bacteria entro  abc   <NA>  <NA>   <NA>  abc_NA       
4 bacteria entro  abc   cde   <NA>   <NA>  cde_NA       
5 bacteria entro  abc   cde   xyz    <NA>  xyz_NA       
6 bacteria entro  abc   cde   xyz    sam   sam        
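
To see what coalescing over the reversed columns does, here is a small base R sketch. first_non_na() is a hypothetical stand-in for dplyr::coalesce(), not its actual implementation, and the three short vectors are made up for illustration. The sketch also simplifies the condition to "genus is missing", which gives the same result as the if_any() check on the example data but is an assumption in general.

```r
# Hypothetical base R stand-in for coalesce(): the first non-missing
# value across the supplied vectors, taken element by element.
first_non_na <- function(...) {
  Reduce(function(a, b) ifelse(is.na(a), b, a), list(...))
}

# Illustrative columns, listed from the deepest rank to the shallowest.
genus  <- c(NA, NA, "sam")
family <- c(NA, "xyz", "xyz")
order  <- c("cde", "cde", "cde")

# Passing the deepest rank first means the first non-missing value
# is also the deepest one available in each row.
deepest <- first_non_na(genus, family, order)

# Append "_NA" only where the final rank (genus) is missing.
labelled <- ifelse(is.na(genus), paste0(deepest, "_NA"), deepest)
print(labelled)
```

Listing the columns in reverse order is what turns "first non-missing" into "last non-missing", which is the role of last_col():1 in the answer.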

data

test_df <- structure(list(kingdom = c("bacteria", "bacteria", "bacteria", 
"bacteria", "bacteria", "bacteria"), phylum = c(NA, "sterp", 
"entro", "entro", "entro", "entro"), class = c(NA, NA, "abc", 
"abc", "abc", "abc"), order = c(NA, NA, NA, "cde", "cde", "cde"
), family = c(NA, NA, NA, NA, "xyz", "xyz"), genus = c(NA, NA, 
NA, NA, NA, "sam")), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))
