R: How to 'aggregate' (or combine) character columns?
I have a df with three columns. Each column holds either a character string or NA, and each row contains exactly one string. For example:
df <- data.frame(a=c("NA","NA","NA","NA","fruits","fruits","fruits","fruits","fruits","fruits"),
b=c("NA","NA","veggies","veggies","NA","NA","NA","NA","NA","NA"),
c=c("nuts","nuts","NA","NA","NA","NA","NA","NA","NA","NA") )
I would like to combine all three columns to get this:
1 nuts
2 nuts
3 veggies
4 veggies
5 fruits
6 fruits
7 fruits
8 fruits
9 fruits
10 fruits
With numeric values I would use aggregate with na.rm=TRUE. However, I don't know how to handle character values. Any ideas? Thanks.
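You can use max.col after converting the string "NA" to a real NA. We use max.col to get the row/column index of the non-missing entry in each row, extract those values, and convert the result to a data.frame.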
is.na(df) <- df=='NA'
data.frame(var=df[cbind(1:nrow(df),max.col(!is.na(df)))])
# var
#1 nuts
#2 nuts
#3 veggies
#4 veggies
#5 fruits
#6 fruits
#7 fruits
#8 fruits
#9 fruits
#10 fruits
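The steps behind the max.col one-liner can be spelled out as follows. This is only a sketch: the intermediate names (present, col_idx, idx, result) are made up for illustration, the example data is rebuilt with real NA values instead of the string "NA", and ties.method = "first" is added so the result stays deterministic even if a row ever held more than one value.

```r
# Example data with real NA values; stringsAsFactors = FALSE keeps the
# columns as character on older R versions as well.
df <- data.frame(
  a = c(NA, NA, NA, NA, rep("fruits", 6)),
  b = c(NA, NA, "veggies", "veggies", rep(NA, 6)),
  c = c("nuts", "nuts", rep(NA, 8)),
  stringsAsFactors = FALSE
)

# Logical matrix: TRUE where a cell holds a value.
present <- !is.na(df)

# For each row, the column position of the first TRUE.
col_idx <- max.col(present, ties.method = "first")

# Two-column (row, column) matrix used to pick one cell per row.
idx <- cbind(seq_len(nrow(df)), col_idx)

# Matrix indexing returns a plain character vector.
result <- as.matrix(df)[idx]
print(data.frame(var = result))
```

Indexing with a two-column matrix is what lets a single call pull out one cell per row without an explicit loop.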
Alternatively, another option is
data.frame(var= df[cbind(1:nrow(df),(+!is.na(df)) %*% seq_along(df))])
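To see why the matrix product yields a column index, here is a small sketch on a three-row version of the data (the reduced data frame and the names indicator, col_idx and combined are illustrative assumptions). Multiplying the 0/1 indicator matrix by the vector 1, 2, 3 sums the positions of the non-missing cells in each row, so it equals the column index only when every row has exactly one non-missing value.

```r
# Reduced example with real NA values.
df <- data.frame(
  a = c(NA, NA, "fruits"),
  b = c(NA, "veggies", NA),
  c = c("nuts", NA, NA),
  stringsAsFactors = FALSE
)

# Unary plus turns the logical matrix into 0/1 integers.
indicator <- +!is.na(df)

# Each row has a single 1, so the product with 1:3 is that column's position.
col_idx <- drop(indicator %*% seq_along(df))

# Pick one cell per row using (row, column) pairs.
combined <- as.matrix(df)[cbind(seq_len(nrow(df)), col_idx)]
print(combined)
```

If a row held values in, say, columns 1 and 2, the product would be 3 and point at the wrong column, so this variant relies on the one-value-per-row guarantee from the question.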
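To flesh out the idea suggested in the comments, you could do the following (this works on the original df, where missing entries are the string "NA"):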
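# The pattern is anchored so that only cells equal to "NA" are blanked, not strings that merely contain "NA"
data.frame(var = apply(df, 1, function(x) paste(gsub("^NA$", "", x), collapse = "")) )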
var
1 nuts
2 nuts
3 veggies
4 veggies
5 fruits
6 fruits
7 fruits
8 fruits
9 fruits
10 fruits
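Whether this is better or worse than a row-wise approach may depend on the actual data. This is one way to get the specified printed output: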
> as.matrix( df[df!="NA"] )
Perhaps better:
> cat( paste( "\n", df[ df!="NA" ] ) )
fruits
fruits
fruits
fruits
fruits
fruits
veggies
veggies
nuts
nuts
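Note that the output above comes out column by column (all of a, then b, then c), which is the reverse of the row order shown in the question. A minimal sketch of how the row order could be recovered, assuming a reduced three-row version of the data and the illustrative names m, by_column and by_row: transposing first makes the column-major scan walk the original rows.

```r
# Reduced example using the string "NA", as in the question.
df <- data.frame(
  a = c("NA", "NA", "fruits"),
  b = c("NA", "veggies", "NA"),
  c = c("nuts", "NA", "NA"),
  stringsAsFactors = FALSE
)
m <- as.matrix(df)

# Logical subsetting scans column by column.
by_column <- m[m != "NA"]

# Scanning the transpose column by column visits the original rows in order.
by_row <- t(m)[t(m) != "NA"]

print(by_column)
print(by_row)
```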