I have a data frame with three columns. Each column holds either a character string or NA, and every row has exactly one non-NA value, as in this example:
df <- data.frame(a=c("NA","NA","NA","NA","fruits","fruits","fruits","fruits","fruits","fruits"),
b=c("NA","NA","veggies","veggies","NA","NA","NA","NA","NA","NA"),
c=c("nuts","nuts","NA","NA","NA","NA","NA","NA","NA","NA") )
I want to combine all three columns, to get this:
1 nuts
2 nuts
3 veggies
4 veggies
5 fruits
6 fruits
7 fruits
8 fruits
9 fruits
10 fruits
With numeric values I would use aggregate with na.rm=TRUE. However, I have no idea how to do this with characters. Any ideas? Thanks.
We can use max.col after converting the string "NA" to a real NA. We get the row/column index with max.col, extract the values, and then convert to a data.frame.
is.na(df) <- df=='NA'
data.frame(var=df[cbind(1:nrow(df),max.col(!is.na(df)))])
# var
#1 nuts
#2 nuts
#3 veggies
#4 veggies
#5 fruits
#6 fruits
#7 fruits
#8 fruits
#9 fruits
#10 fruits
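For readers who want to see what each piece of that one-liner does, here is the same idea split into steps. This is only a sketch: the intermediate names (present, col_idx, idx, result) are made up for illustration, and ties.method = "first" is my own addition. max.col breaks ties at random by default, which does not matter when each row has exactly one non-NA value, but "first" keeps the result deterministic if that assumption ever fails.

```r
# Rebuild the example data from the question (strings "NA", not real NA).
df <- data.frame(
  a = c(rep("NA", 4), rep("fruits", 6)),
  b = c("NA", "NA", "veggies", "veggies", rep("NA", 6)),
  c = c("nuts", "nuts", rep("NA", 8)),
  stringsAsFactors = FALSE
)

is.na(df) <- df == "NA"      # turn the string "NA" into a real NA
present <- !is.na(df)        # logical matrix, TRUE where a value exists

# Column position of the TRUE entry in each row.
col_idx <- max.col(present, ties.method = "first")

# Two-column matrix of (row, column) pairs; indexing a data frame with
# such a matrix returns one element per pair.
idx <- cbind(seq_len(nrow(df)), col_idx)
result <- data.frame(var = df[idx])
```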
Or another option would be
data.frame(var= df[cbind(1:nrow(df),(+!is.na(df)) %*% seq_along(df))])
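The matrix product works because each row of the 0/1 indicator matrix has a single 1, so multiplying by the column numbers 1, 2, 3 picks out the position of that 1. A tiny illustration on a hand-made matrix (present and col_idx are hypothetical names, not part of the answer above):

```r
# Indicator of where the non-NA value sits in each of three rows.
present <- rbind(c(FALSE, FALSE, TRUE),
                 c(FALSE, TRUE,  FALSE),
                 c(TRUE,  FALSE, FALSE))

# Unary + turns TRUE/FALSE into 1/0; each row sum of (indicator * column number)
# is then the column index of the single TRUE in that row.
col_idx <- (+present) %*% seq_len(ncol(present))
```

Note that this relies on there being exactly one non-NA value per row. With two values in a row the product would be the sum of their column numbers, which is not a meaningful index.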
To polish the ideas provided in comments, you can do this:
data.frame(var = apply(df, 1, function(x) paste(gsub("NA", "", x), collapse = "")) )
var
1 nuts
2 nuts
3 veggies
4 veggies
5 fruits
6 fruits
7 fruits
8 fruits
9 fruits
10 fruits
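One thing to watch with the gsub approach: it removes the letters "NA" anywhere in a string, not only when the whole cell is "NA". That is harmless for the example data, but a value such as "BANANA" would be mangled. A possible variant, assuming the placeholder is always exactly the string "NA", is to drop whole cells by comparison instead (df2, stripped and exact are made-up names for this sketch):

```r
# Small data frame where one real value happens to contain "NA".
df2 <- data.frame(a = c("NA", "BANANA"),
                  b = c("nuts", "NA"),
                  stringsAsFactors = FALSE)

# gsub strips every "NA" substring, including those inside real values.
stripped <- apply(df2, 1, function(x) paste(gsub("NA", "", x), collapse = ""))

# Comparing whole cells only drops entries that are exactly "NA".
exact <- apply(df2, 1, function(x) paste(x[x != "NA"], collapse = ""))
```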
The actual data situation may determine whether this is better or worse than the line-by-line method. Here's one way to get a print-out like the one you specify:
> as.matrix( df[df!="NA"] )
Or probably better:
> cat( paste( "\n", df[ df!="NA" ] ) )
fruits
fruits
fruits
fruits
fruits
fruits
veggies
veggies
nuts
nuts