R: How to 'aggregate' (or combine) character columns?
I have a df with three columns. Each column holds either a string or NA, and every row contains exactly one string. For example:
df <- data.frame(a=c("NA","NA","NA","NA","fruits","fruits","fruits","fruits","fruits","fruits"),
b=c("NA","NA","veggies","veggies","NA","NA","NA","NA","NA","NA"),
c=c("nuts","nuts","NA","NA","NA","NA","NA","NA","NA","NA") )
I would like to combine all three columns to get this:
1 nuts
2 nuts
3 veggies
4 veggies
5 fruits
6 fruits
7 fruits
8 fruits
9 fruits
10 fruits
With numeric values I would use aggregate with na.rm=TRUE. However, I don't know how to handle characters. Any ideas? Thanks.
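After converting the string "NA" to a real NA, you can use max.col. We use max.col to find the column of the non-NA value in each row, pair it with the row number to form a row/column index, extract the values, and then convert the result to a data.frame.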
is.na(df) <- df=='NA'
data.frame(var=df[cbind(1:nrow(df),max.col(!is.na(df)))])
# var
#1 nuts
#2 nuts
#3 veggies
#4 veggies
#5 fruits
#6 fruits
#7 fruits
#8 fruits
#9 fruits
#10 fruits
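The steps above can be broken out one at a time. This is only a sketch of the same idea: the intermediate names (present, col_idx, idx, combined) are made up for illustration, and ties.method = "first" is an added assumption to keep the result deterministic (max.col breaks ties at random by default, which does not matter here because each row has exactly one non-NA value).

```r
# Rebuild the example data as character columns
df <- data.frame(
  a = c(rep("NA", 4), rep("fruits", 6)),
  b = c("NA", "NA", "veggies", "veggies", rep("NA", 6)),
  c = c("nuts", "nuts", rep("NA", 8)),
  stringsAsFactors = FALSE
)

# Turn the literal string "NA" into a real missing value
is.na(df) <- df == "NA"

# Logical matrix: TRUE where a value is present
present <- !is.na(df)

# Column position of the TRUE entry in each row
col_idx <- max.col(present, ties.method = "first")

# A two-column (row, column) matrix selects exactly one cell per row
idx <- cbind(seq_len(nrow(df)), col_idx)
combined <- as.matrix(df)[idx]

data.frame(var = combined)
```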
Another option is
data.frame(var= df[cbind(1:nrow(df),(+!is.na(df)) %*% seq_along(df))])
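To see why the matrix product yields a column index, here is a small sketch on a hypothetical 3 x 3 logical matrix (not the question's data). Multiplying the 0/1 indicator matrix by 1:ncol sums the column numbers of the TRUE entries in each row, so it equals the column index only when a row has exactly one TRUE.

```r
# Hypothetical indicator matrix: one TRUE per row
present <- rbind(c(FALSE, FALSE, TRUE),
                 c(FALSE, TRUE,  FALSE),
                 c(TRUE,  FALSE, FALSE))

# Unary + converts TRUE/FALSE to 1/0; %*% then sums column numbers per row
col_idx <- drop((+present) %*% seq_len(ncol(present)))
col_idx  # 3 2 1

# With two TRUEs in a row the sum no longer names a valid choice:
# columns 1 and 2 are TRUE, but the product returns 1 + 2 = 3
ambiguous <- rbind(c(TRUE, TRUE, FALSE))
bad_idx <- drop((+ambiguous) %*% seq_len(ncol(ambiguous)))
bad_idx  # 3
```

If rows could contain more than one non-NA value, the max.col variant is the safer of the two.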
To refine the idea suggested in the comments, you could do the following:
data.frame(var = apply(df, 1, function(x) paste(gsub("NA", "", x), collapse = "")) )
var
1 nuts
2 nuts
3 veggies
4 veggies
5 fruits
6 fruits
7 fruits
8 fruits
9 fruits
10 fruits
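One caveat with the unanchored pattern: gsub("NA", "", x) also strips "NA" from inside longer strings. That is harmless for the example data, but a possible variation, assuming real values might contain those letters, is to anchor the pattern so only cells that are exactly "NA" are blanked. The vector x below is hypothetical.

```r
# Hypothetical row where the real value happens to contain "NA"
x <- c("NA", "BANANA", "NA")

# Unanchored: removes every "NA" substring, damaging the real value
unanchored <- paste(gsub("NA", "", x), collapse = "")

# Anchored: only cells that are exactly "NA" are replaced
anchored <- paste(gsub("^NA$", "", x), collapse = "")

unanchored  # "BA"
anchored    # "BANANA"
```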
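Whether this is better or worse than a row-by-row approach may depend on the actual data. Here is one way to print the values. Note that indexing with a logical matrix extracts values column by column, so the result is not in the original row order: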
> as.matrix( df[df!="NA"] )
Perhaps better:
> cat( paste( "\n", df[ df!="NA" ] ) )
fruits
fruits
fruits
fruits
fruits
fruits
veggies
veggies
nuts
nuts
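The output above comes out column by column (all of a, then b, then c). If the original row order matters, one possible tweak, offered here as a sketch rather than as part of the answer, is to transpose first so that the column-wise extraction walks the original rows. The names m and by_row are illustrative.

```r
# Rebuild the example data as character columns
df <- data.frame(
  a = c(rep("NA", 4), rep("fruits", 6)),
  b = c("NA", "NA", "veggies", "veggies", rep("NA", 6)),
  c = c("nuts", "nuts", rep("NA", 8)),
  stringsAsFactors = FALSE
)

# t() turns the data frame into a character matrix with one column per original row
m <- t(df)

# Column-major extraction on the transpose follows the original row order
by_row <- m[m != "NA"]
by_row
```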