In R, how to apply the value of one column to multiple columns in the same data frame

I have a data frame (df) like this:

n   g      count  s_a  s_b  s_c  .....
T1  gb     10000  0    1    0
T1  ga,gb  15000  1    1    0

I looked at the column sums of s_a ... s_n to find the n columns with the highest totals:

top_n <- names(sort(colSums(df[4:ncol(df)]), decreasing = TRUE))[1:n]

top_n holds the names of the columns whose column sums are in the top N.

I want to use top_n to update each column whose name is in top_n with the value in the count column:

n   g      count  s_a    s_b    s_c  .....
T1  gb     10000  0      10000  0
T1  ga,gb  15000  15000  15000  0

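We can use lapply to loop over the columns of interest (4:ncol(df)), multiply each one by 'count', and assign the output back to the original columns. Because the columns hold 0/1 indicators, multiplying by 'count' leaves the zeros untouched and replaces each 1 with the row's count.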

df[4:ncol(df)] <- lapply(df[4:ncol(df)], `*`, df$count)
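To check this against the expected output, here is a minimal reproducible sketch. The data frame is rebuilt from the two rows shown in the question; the elided columns ("....." in the post) are left out, and the helper name cols is only for illustration.

```r
# Sample data reconstructed from the two rows shown in the question
# (assumption: only s_a, s_b, s_c exist; the elided columns are omitted)
df <- data.frame(
  n     = c("T1", "T1"),
  g     = c("gb", "ga,gb"),
  count = c(10000, 15000),
  s_a   = c(0, 1),
  s_b   = c(1, 1),
  s_c   = c(0, 0)
)

# Positions of the 0/1 indicator columns (everything from column 4 onward)
cols <- 4:ncol(df)

# Multiply each indicator column element-wise by 'count'
df[cols] <- lapply(df[cols], `*`, df$count)

print(df)
```

With this sample, s_a becomes 0 and 15000, s_b becomes 10000 and 15000, and s_c stays at 0, which matches the desired table in the question.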

Or with Map, we can do the same by multiplying the corresponding elements; wrapping df$count in list() makes the whole 'count' vector get recycled against every column.

df[4:ncol(df)] <- Map(`*`, df[4:ncol(df)], list(df$count))
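The snippets above update every indicator column, while the question asks about only the columns named in top_n. If that restriction matters, one possible variation is to index by the names in top_n instead of by 4:ncol(df). This is a sketch under assumptions: the sample data is rebuilt from the question, and n_top and ind_cols are hypothetical names introduced here (n_top stands in for the question's n, to avoid clashing with the column called n).

```r
# Sample data reconstructed from the question (elided columns omitted)
df <- data.frame(
  n     = c("T1", "T1"),
  g     = c("gb", "ga,gb"),
  count = c(10000, 15000),
  s_a   = c(0, 1),
  s_b   = c(1, 1),
  s_c   = c(0, 0)
)

n_top    <- 1                     # hypothetical: how many top columns to keep
ind_cols <- names(df)[4:ncol(df)] # names of the indicator columns

# Names of the n_top indicator columns with the largest column sums
top_n <- names(sort(colSums(df[ind_cols]), decreasing = TRUE))[seq_len(n_top)]

# Multiply only the selected columns by 'count'; the others keep their 0/1 values
df[top_n] <- lapply(df[top_n], `*`, df$count)

print(df)
```

Here s_b has the largest column sum, so only s_b is scaled by count and s_a and s_c keep their original indicators.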

Using data.table v1.9.7, we can use an lapply-based method (similar to the first base R method). Convert the 'data.frame' to a 'data.table' (setDT(df)), specify the columns of interest in .SDcols, loop through the columns, multiply each by 'count', and assign (:=) the output back to the original columns.

library(data.table)
setDT(df)[, (4:ncol(df)) := lapply(.SD, `*`, count), .SDcols = 4:ncol(df)]
