Building a new column in R tidyverse based on values of other columns
I am using R tidyverse and I have a tibble like the one built in the following code. I am trying to create output_column based on the values of the other columns. Each value should be the last non-NA value in the row, with "_NA" appended if any column before output_column is NA.
library(tidyverse)
test_df <-
tibble(kingdom = rep("bacteria",6),
phylum = c(NA, "sterp", rep("entro", 4)),
class = c(rep(NA, 2), rep("abc",4)),
order= c(rep(NA,3), rep("cde", 3)),
family= c(rep(NA,4), rep("xyz", 2)),
genus= c(rep(NA,5), "sam"),
output_column = c("bacteria_NA", "sterp_NA", "abc_NA", "cde_NA", "xyz_NA", "sam" ))
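To make the rule concrete, here is a minimal base R sketch applied to a single row. The helper name `label_last_known` is hypothetical and used only for illustration, and the sketch assumes every row has at least one non-NA value.

```r
# Hypothetical helper: take the last non-NA value in a row and
# append "_NA" if any value in that row is missing.
label_last_known <- function(x) {
  last_known <- tail(x[!is.na(x)], 1)  # last non-NA entry
  if (anyNA(x)) paste0(last_known, "_NA") else last_known
}

# Row 3 of test_df: class is the last known rank
label_last_known(c("bacteria", "entro", "abc", NA, NA, NA))
# [1] "abc_NA"

# Row 6 of test_df: every rank is known, so no suffix is added
label_last_known(c("bacteria", "entro", "abc", "cde", "xyz", "sam"))
# [1] "sam"
```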
You can use rowwise() and c_across(), as follows:
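test_df %>%
  rowwise() %>%
  # c_across(kingdom:genus) limits the search to the taxonomy columns,
  # so a pre-existing output_column is never picked up as the last value
  mutate(k = if_else(!is.na(genus), genus,
                     paste0(last(c_across(kingdom:genus)[!is.na(c_across(kingdom:genus))]), "_NA")))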
Output:
kingdom phylum class order family genus k
<chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 bacteria NA NA NA NA NA bacteria_NA
2 bacteria sterp NA NA NA NA sterp_NA
3 bacteria entro abc NA NA NA abc_NA
4 bacteria entro abc cde NA NA cde_NA
5 bacteria entro abc cde xyz NA xyz_NA
6 bacteria entro abc cde xyz sam sam
Another approach is to use the last() function with apply():
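# Pass only the taxonomy columns to apply(), so that last(x) is genus
# even when test_df already contains an output_column
test_df$output_column = apply(
  select(test_df, kingdom:genus), 1, \(x) {
    if_else(is.na(last(x)), paste0(last(x[!is.na(x)]), "_NA"), last(x))
  }
)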
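The same row-wise logic can also be sketched without a per-row function, using base R's max.col() to find the position of the last non-NA value in each row. This is a different technique from the apply() call above, not a restatement of it. The names `ranks`, `m`, `last_idx`, `last_val`, `result` and `expected` are introduced here for illustration, and the sketch assumes every row has at least one non-NA value.

```r
# Same data as in the question, rebuilt as a plain data.frame
# so the sketch needs no packages
ranks <- data.frame(
  kingdom = rep("bacteria", 6),
  phylum  = c(NA, "sterp", rep("entro", 4)),
  class   = c(rep(NA, 2), rep("abc", 4)),
  order   = c(rep(NA, 3), rep("cde", 3)),
  family  = c(rep(NA, 4), rep("xyz", 2)),
  genus   = c(rep(NA, 5), "sam")
)

m <- as.matrix(ranks)  # character matrix, one row per record

# Column index of the last non-NA value in each row
last_idx <- max.col(!is.na(m), ties.method = "last")
last_val <- m[cbind(seq_len(nrow(m)), last_idx)]

# Add the suffix only for rows that contain at least one NA
result <- ifelse(rowSums(is.na(m)) > 0, paste0(last_val, "_NA"), last_val)

# Compare against the output_column values given in the question
expected <- c("bacteria_NA", "sterp_NA", "abc_NA", "cde_NA", "xyz_NA", "sam")
stopifnot(all(result == expected))
```

Because max.col() and matrix indexing work on the whole matrix at once, this avoids calling a function once per row, which may matter on larger tables.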
We may use coalesce here:
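library(dplyr)
library(stringr)
test_df %>%
  mutate(
    # coalesce() over the columns in reverse order returns the last non-NA value per row
    output_column = do.call(coalesce, across(genus:kingdom)),
    # append "_NA" when any taxonomy column in the row is NA
    output_column = case_when(
      if_any(kingdom:genus, is.na) ~ str_c(output_column, "_NA"),
      TRUE ~ output_column
    )
  )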
Output:
# A tibble: 6 × 7
kingdom phylum class order family genus output_column
<chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 bacteria <NA> <NA> <NA> <NA> <NA> bacteria_NA
2 bacteria sterp <NA> <NA> <NA> <NA> sterp_NA
3 bacteria entro abc <NA> <NA> <NA> abc_NA
4 bacteria entro abc cde <NA> <NA> cde_NA
5 bacteria entro abc cde xyz <NA> xyz_NA
6 bacteria entro abc cde xyz sam sam
test_df <- structure(list(kingdom = c("bacteria", "bacteria", "bacteria",
"bacteria", "bacteria", "bacteria"), phylum = c(NA, "sterp",
"entro", "entro", "entro", "entro"), class = c(NA, NA, "abc",
"abc", "abc", "abc"), order = c(NA, NA, NA, "cde", "cde", "cde"
), family = c(NA, NA, NA, NA, "xyz", "xyz"), genus = c(NA, NA,
NA, NA, NA, "sam")), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))