How do I create a new column based on the character values of other columns, excluding NA values?
How can I create a new column called 'title' based on the values of other columns?
I have shown an example below, where 'title' needs to be created from the columns Post, Tel, Surname, and Emp. 'title' just indicates which of those columns are not NA in each row. I have this:
ID1 ID2 Post Tel Surname Emp
<chr> <chr> <chr> <chr> <chr> <chr>
1 S04 S03 NA 369 990247 NA NA
2 S14 S08 NA 069 990351 NA NA
3 S18 S03 N165HT NA Jones NA
4 S19 S13 NA 3069 90685 NA NA
5 S20 S16 NA 3069 90954 NA NA
6 S20 S17 CO19RF NA NA Ocean
And I want to create this:
ID1 ID2 Post Tel Surname Emp title
<chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 S04 S03 NA 369 990247 NA NA Tel
2 S14 S08 NA 069 990351 NA NA Tel
3 S18 S03 N165HT NA Jones NA Post,Surname
4 S19 S13 NA 3069 90685 NA NA Tel
5 S20 S16 NA 3069 90954 NA NA Tel
6 S20 S17 CO19RF NA NA Ocean Post,Emp
One option is to create a unique row identifier ('rn'), gather the data into 'long' format (removing the NA elements with na.rm = TRUE), group by 'rn', paste the 'key' elements together in summarise, and bind the result to the original dataset.
library(tidyverse)
df1 %>%
rownames_to_column('rn') %>%
gather(key, val, Post:Emp, na.rm = TRUE) %>%
group_by(rn) %>%
summarise(title = toString(key)) %>%
ungroup %>%
select(-rn) %>%
bind_cols(df1, .)
# ID1 ID2 Post Tel Surname Emp title
#1 S04 S03 <NA> 369 990247 <NA> <NA> Tel
#2 S14 S08 <NA> 069 990351 <NA> <NA> Tel
#3 S18 S03 N165HT <NA> Jones <NA> Post, Surname
#4 S19 S13 <NA> 3069 90685 <NA> <NA> Tel
#5 S20 S16 <NA> 3069 90954 <NA> <NA> Tel
#6 S20 S17 CO19RF <NA> <NA> Ocean Post, Emp
Data:
df1 <- structure(list(ID1 = c("S04", "S14", "S18", "S19", "S20", "S20"
), ID2 = c("S03", "S08", "S03", "S13", "S16", "S17"), Post = c(NA,
NA, "N165HT", NA, NA, "CO19RF"), Tel = c("369 990247", "069 990351",
NA, "3069 90685", "3069 90954", NA), Surname = c(NA, NA, "Jones",
NA, NA, NA), Emp = c(NA, NA, NA, NA, NA, "Ocean")), row.names = c("1",
"2", "3", "4", "5", "6"), class = "data.frame")
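The same reshape-and-summarise idea can also be written with pivot_longer, which supersedes gather in tidyr 1.0.0 and later. The following is only a sketch under a few assumptions: the names `titles` and `result` are made up for illustration, the row identifier is kept as an integer, and the summary is joined back on that identifier rather than bound column-wise, so rows in which every column is NA should still line up and simply get an NA title.

```r
library(dplyr)
library(tidyr)

# Same data as df1 in the answer, rebuilt here so the sketch is self-contained
df1 <- data.frame(
  ID1 = c("S04", "S14", "S18", "S19", "S20", "S20"),
  ID2 = c("S03", "S08", "S03", "S13", "S16", "S17"),
  Post = c(NA, NA, "N165HT", NA, NA, "CO19RF"),
  Tel = c("369 990247", "069 990351", NA, "3069 90685", "3069 90954", NA),
  Surname = c(NA, NA, "Jones", NA, NA, NA),
  Emp = c(NA, NA, NA, NA, NA, "Ocean"),
  stringsAsFactors = FALSE
)

# Long format: one row per non-NA cell, tagged with its original row number
titles <- df1 %>%
  mutate(rn = row_number()) %>%
  pivot_longer(Post:Emp, names_to = "key", values_to = "val",
               values_drop_na = TRUE) %>%
  group_by(rn) %>%
  summarise(title = paste(key, collapse = ","), .groups = "drop")

# Join on the row number so every original row is kept
result <- df1 %>%
  mutate(rn = row_number()) %>%
  left_join(titles, by = "rn") %>%
  select(-rn)

print(result)
```

Using paste(key, collapse = ",") here gives "Post,Surname" without a space, as in the requested output; toString would insert ", " instead.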
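In base R:
cols <- c("Post", "Tel", "Surname", "Emp")
# d is the data frame from the question (called df1 in the answer above)
d$title <- apply(d[, cols], 1, function(x) {
  # keep the names of the columns whose value in this row is not NA
  paste(cols[which(!is.na(x))], collapse = ",")
})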
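One edge case to be aware of with the base R approach: if every one of the four columns is NA in a row, paste() on an empty selection returns an empty string "" rather than NA. A minimal sketch of a variant that returns NA in that case is shown below. The data frame is a shortened copy of the question's data, and its last row (IDs S21/S22) is a hypothetical all-NA row added purely for illustration.

```r
# Shortened example data; the last row is hypothetical (all four columns NA)
d <- data.frame(
  ID1 = c("S04", "S18", "S20", "S21"),
  ID2 = c("S03", "S03", "S17", "S22"),
  Post = c(NA, "N165HT", "CO19RF", NA),
  Tel = c("369 990247", NA, NA, NA),
  Surname = c(NA, "Jones", NA, NA),
  Emp = c(NA, NA, "Ocean", NA),
  stringsAsFactors = FALSE
)

cols <- c("Post", "Tel", "Surname", "Emp")

# Logical matrix: TRUE where a value is present (not NA)
present <- !is.na(d[cols])

# For each row, join the names of the present columns; NA if none are present
d$title <- unname(apply(present, 1, function(p) {
  if (any(p)) paste(cols[p], collapse = ",") else NA_character_
}))

print(d)
```

Whether "" or NA is the better marker for an empty row depends on how 'title' is used afterwards; NA has the advantage of being picked up by is.na() in later filtering.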