How do I replace 1s in a data frame with the column name?

I am working with a large data frame and need to replace all the 1s with the corresponding column names, but I can't figure out how to make it work. Does anyone have an idea how to do this?

Here is my data:

Names 35 40 45 50 55 60
a      1  0  1  0  0  0
b      0  0  0  1  0  0
c      0  1  0  1  1  0
d      1  0  0  0  0  1

Here is the code I have:

df[,-1] <- sapply(df[,-1], function(x) {ind <- which(x!=0); x[ind] = df[ind,1]; return(x)})

or

mat <- as.matrix(df[, -1])
pos <- which(mat != 0)
mat[pos] <- rep(df[[1]], times = ncol(mat))[pos]
new_dat <- "colnames<-"(cbind.data.frame(df[1], mat), colnames)

Both of these give me the 1st row instead of the column headers.

Thank you for any help.

We can build an index with col() and then replace the matching cells based on it:

m1 <- col(df1[-1]) * df1[-1]
i1 <- m1 != 0
df1[-1][i1] <- rep(colnames(m1), each = nrow(m1))[i1]
df1
#   Names 35 40 45 50 55 60
#1     a 35  0 45  0  0  0
#2     b  0  0  0 50  0  0
#3     c  0 40  0 50 55  0
#4     d 35  0  0  0  0 60

NOTE: This should also work when the column names are not numeric. In general, it is better to avoid column names that start with a digit.
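
As a rough illustration of the note above, here is a minimal sketch of the same index-and-replace idea applied to non-numeric column names. The data frame `df2` and its column names (`low`, `mid`, `high`) are made up for this example and are not part of the original question.

```r
# Hypothetical data: same layout as the question, but with non-numeric column names
df2 <- data.frame(
  Names = c("a", "b", "c", "d"),
  low   = c(1L, 0L, 0L, 1L),
  mid   = c(0L, 0L, 1L, 0L),
  high  = c(1L, 1L, 0L, 0L),
  stringsAsFactors = FALSE
)

# Logical matrix marking the cells that should be overwritten
i2 <- df2[-1] == 1

# Column names repeated once per row, in column-major order to line up with i2
labels <- rep(names(df2)[-1], each = nrow(df2))

# Overwrite only the marked cells; the affected columns become character
df2[-1][i2] <- labels[i2]
df2
```

Because the replacement values are strings, the affected columns end up as character vectors, with the remaining zeros stored as "0".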


Or, if the column names are numeric, we can simply multiply after replicating them:

df1[-1] <- df1[-1] * as.numeric(names(df1)[-1])[col(df1[-1])]

Or using a for loop

for(i in 2:ncol(df1)) df1[[i]][df1[[i]]==1] <- as.numeric(names(df1)[i])

data

df1 <- structure(list(Names = c("a", "b", "c", "d"), `35` = c(1L, 0L, 
0L, 1L), `40` = c(0L, 0L, 1L, 0L), `45` = c(1L, 0L, 0L, 0L), 
`50` = c(0L, 1L, 1L, 0L), `55` = c(0L, 0L, 1L, 0L), `60` = c(0L, 
0L, 0L, 1L)), class = "data.frame", row.names = c(NA, -4L
))

Assuming the data frame contains only 1s and 0s, you can multiply each column by its column name (converted to numeric). Try:

cbind(df[1], mapply(`*`, df[-1], as.numeric(colnames(df[-1]))))
# or cbind(df[1], sweep(df[-1], 2, as.numeric(colnames(df[-1])), `*`))
# (a bare df[-1] * vector recycles the vector down each column, not across columns)
# output
  Names 35 40 45 50 55 60
1     a 35  0 45  0  0  0
2     b  0  0  0 50  0  0
3     c  0 40  0 50 55  0
4     d 35  0  0  0  0 60
# data
df <- structure(list(Names = structure(1:4, .Label = c("a", "b", "c", 
"d"), class = "factor"), `35` = c(1L, 0L, 0L, 1L), `40` = c(0L, 
0L, 1L, 0L), `45` = c(1L, 0L, 0L, 0L), `50` = c(0L, 1L, 1L, 0L
), `55` = c(0L, 0L, 1L, 0L), `60` = c(0L, 0L, 0L, 1L)), .Names = c("Names", 
"35", "40", "45", "50", "55", "60"), class = "data.frame", row.names = c(NA, 
-4L))
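
The product approach only works if each column is paired with its own name. The following sketch, on a made-up 2 x 2 data frame `d` (not from the original post), shows how multiplying a data frame by a bare vector recycles the vector down each column, and how `sweep()` with `MARGIN = 2` pairs the values with columns instead.

```r
# Hypothetical 2 x 2 example so the recycling is easy to follow
d <- data.frame(`35` = c(1L, 1L), `40` = c(1L, 1L), check.names = FALSE)
vals <- as.numeric(names(d))   # c(35, 40)

# A bare vector is recycled down each column, so vals is matched to rows
wrong <- d * vals

# sweep() with MARGIN = 2 multiplies column j by vals[j]
right <- sweep(d, 2, vals, `*`)

wrong
right
```

In `wrong`, both columns contain 35 and 40, while in `right` the first column is all 35 and the second is all 40. `mapply()` over the columns, as used in the answer, gives the same pairing as `sweep()`.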

This solution loops through and applies a simple ifelse() on each column:

df[-1] <- lapply(seq_along(df)[-1], function(x) ifelse(df[[x]] == 1, names(df)[x], df[[x]]))
df  

  Names 35 40 45 50 55 60
1     a 35  0 45  0  0  0
2     b  0  0  0 50  0  0
3     c  0 40  0 50 55  0
4     d 35  0  0  0  0 60
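
Note that `names(df)[x]` is a character value, so `ifelse()` returns character vectors for the columns it changes, even though the printed output looks numeric. If numeric columns are needed afterwards, one possible follow-up is sketched below; the small data frame `res` is a hypothetical stand-in for the result above, not output copied from the post.

```r
# Hypothetical stand-in for the result of the ifelse() step: character columns
res <- data.frame(
  Names = c("a", "b"),
  `35`  = c("35", "0"),
  `40`  = c("0", "40"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Convert every column except the first back to numeric
res[-1] <- lapply(res[-1], as.numeric)
str(res)
```

This conversion only makes sense when the column names are numeric; with non-numeric names `as.numeric()` would produce NA values with a warning.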

