简体   繁体   中英

Multiply values across each column by weight in another data.frame in R

I have two data.frames : df and weights (code below).

df looks like this:

    id             a           b           d           EE           f
 1  this           0.23421153  -0.02324956 0.5457353   0.73068586   0.5642554
 2  is             0.28378641  0.36346241  1.0190496   1.97715019   -1.190179
 3  an             -0.04372133  0.07412557  -0.4510299  1.8442713   -0.5301328
 4  example        -0.18139887  0.10404329  -1.0871962  1.46920108  0.4685703
 5  data.frame     0.24235498   -0.1501064  -1.1183967  0.22884407  0.4272259
 6  for             -0.72718239 0.16337997  1.2635683   0.44206945  0.7303647
 7  stackoverflow  0.25203942   -0.1772715  -0.3371532  -0.29167792 -0.7065494
 8  please         -0.11047364  -0.06631552 0.4342659   -1.49584522 0.2837016
 9  help           -0.1136639   0.22414253  0.4284864   1.59096047  0.2915938
 10 me             -0.3677288   0.05974474  -0.1136177  0.02322094  -0.6533994

How can I multiply the values in each of the columns by the corresponding weights in the weights data.frame ?

Expected result:

    id            a             b           d            EE   f
 1  this          0.749476896   -0.1743717  5.29363241   NA   4.17548996
 2  is            0.908116512   2.725968075 9.88478112   NA   -8.8073246
 3  an            -0.139908256  0.555941775 -4.37499003  NA   -3.92298272
 4  example       -0.580476384  0.780324675 -10.54580314 NA   3.46742022
 5  data.frame    0.775535936   -1.125798   -10.84844799 NA   3.16147166
 6  for           -2.326983648  1.225349775 12.25661251  NA   5.40469878
 7  stackoverflow 0.806526144   -1.32953625 -3.27038604  NA   -5.22846556
 8  please        -0.353515648  -0.4973664  4.21237923   NA   2.09939184
 9  help          -0.36372448   1.681068975 4.15631808   NA   2.15779412
 10 me            -1.17673216   0.44808555  -1.10209169  NA   -4.83515556

Code:

 set.seed(12345)
 df <- data.frame(id=c("this", "is", "an", "example", "data.frame", "for", 
      "stackoverflow", "please", "help", "me"), a=rnorm(10,0,0.4), b=rnorm(10,0,0.2), 
      d=rnorm(10,0,0.7), EE=rnorm(10,0,0.9), f=rnorm(10,0,0.5))
 weights <- data.frame(V1=as.numeric(c("3.2", "7.5", "2.2", "9.7", "5.4", "7.4", "2.1", 
      "5.0", "3.3", "7.6", "3.6", "7.7", "7.1", "3.3", "9.8", "9.2", "2.5", "6.2", "4.1", "8.7", 
      "3.3", "9.3", "8.3")))
 rownames(weights) <- paste(letters[1:23])

You can use sweep and match -

df[-1] <- sweep(df[-1],2, weights$V1[match(names(df[-1]),rownames(weights))],`*`)

df
#              id          a          b          d EE         f
#1           this  0.7494769 -0.1743717   5.293633 NA  4.175490
#2             is  0.9081165  2.7259681   9.884781 NA -8.807325
#3             an -0.1399082  0.5559418  -4.374990 NA -3.922983
#4        example -0.5804764  0.7803247 -10.545803 NA  3.467420
#5     data.frame  0.7755359 -1.1257980 -10.848448 NA  3.161471
#6            for -2.3269836  1.2253498  12.256612 NA  5.404699
#7  stackoverflow  0.8065261 -1.3295363  -3.270386 NA -5.228465
#8         please -0.3535157 -0.4973664   4.212379 NA  2.099392
#9           help -0.3637245  1.6810690   4.156319 NA  2.157794
#10            me -1.1767322  0.4480855  -1.102092 NA -4.835156

Here is an alternative solution:

  1. Create column of rownames called Names in weights
  2. match column names from df with column Names of weights dataframe
  3. multiply with the corresponding V1 based on the numeric index by replicating the weights$V1
  4. Finally cbind to get id in df , which was removed by df[-1] for calculation purposes.
library(tibble)
# rownames to column
weights <- weights %>% 
  rownames_to_column(var = "Names")

df_result <- df[-1]*weights$V1[match(names(df[-1]), weights$Names)][col(df[-1])]
df_result <- cbind(id=df$id, df_result)
df_result

Output:

id          a          b          d EE         f
1           this  0.7494769 -0.1743717   5.293633 NA  4.175490
2             is  0.9081165  2.7259681   9.884781 NA -8.807325
3             an -0.1399082  0.5559418  -4.374990 NA -3.922983
4        example -0.5804764  0.7803247 -10.545803 NA  3.467420
5     data.frame  0.7755359 -1.1257980 -10.848448 NA  3.161471
6            for -2.3269836  1.2253498  12.256612 NA  5.404699
7  stackoverflow  0.8065261 -1.3295363  -3.270386 NA -5.228465
8         please -0.3535157 -0.4973664   4.212379 NA  2.099392
9           help -0.3637245  1.6810690   4.156319 NA  2.157794
10            me -1.1767322  0.4480855  -1.102092 NA -4.835156

With tidyverse , we can loop across the columns, get the corresponding column name ( cur_column() ), use that to subset the 'weight's data by specifying the row/column , names and multiply

library(dplyr)
df %>%
    mutate(across(where(is.numeric), ~ . * weights[cur_column(), 'V1']))
              id          a          b          d EE         f
1           this  0.7494769 -0.1743717   5.293633 NA  4.175490
2             is  0.9081165  2.7259681   9.884781 NA -8.807325
3             an -0.1399082  0.5559418  -4.374990 NA -3.922983
4        example -0.5804764  0.7803247 -10.545803 NA  3.467420
5     data.frame  0.7755359 -1.1257980 -10.848448 NA  3.161471
6            for -2.3269836  1.2253498  12.256612 NA  5.404699
7  stackoverflow  0.8065261 -1.3295363  -3.270386 NA -5.228465
8         please -0.3535157 -0.4973664   4.212379 NA  2.099392
9           help -0.3637245  1.6810690   4.156319 NA  2.157794
10            me -1.1767322  0.4480855  -1.102092 NA -4.835156

Using lapply . Should be much faster than match ing and sweep ing.

df[-1] <- lapply(names(df)[-1], \(x) df[, x]*weights[x, ])
#               id          a          b          d EE         f
# 1           this  0.7494769 -0.1743717   5.293633 NA  4.175490
# 2             is  0.9081165  2.7259681   9.884781 NA -8.807325
# 3             an -0.1399082  0.5559418  -4.374990 NA -3.922983
# 4        example -0.5804764  0.7803247 -10.545803 NA  3.467420
# 5     data.frame  0.7755359 -1.1257980 -10.848448 NA  3.161471
# 6            for -2.3269836  1.2253498  12.256612 NA  5.404699
# 7  stackoverflow  0.8065261 -1.3295363  -3.270386 NA -5.228465
# 8         please -0.3535157 -0.4973664   4.212379 NA  2.099392
# 9           help -0.3637245  1.6810690   4.156319 NA  2.157794
# 10            me -1.1767322  0.4480855  -1.102092 NA -4.835156

You can try the base R code below

df[-1] <- df[-1] * weights[names(df)[-1], "V1"][col(df[-1])]

which gives

> df
              id          a          b          d EE         f
1           this  0.7494769 -0.1743717   5.293633 NA  4.175490
2             is  0.9081165  2.7259681   9.884781 NA -8.807325
3             an -0.1399082  0.5559418  -4.374990 NA -3.922983
4        example -0.5804764  0.7803247 -10.545803 NA  3.467420
5     data.frame  0.7755359 -1.1257980 -10.848448 NA  3.161471
6            for -2.3269836  1.2253498  12.256612 NA  5.404699
7  stackoverflow  0.8065261 -1.3295363  -3.270386 NA -5.228465
8         please -0.3535157 -0.4973664   4.212379 NA  2.099392
9           help -0.3637245  1.6810690   4.156319 NA  2.157794
10            me -1.1767322  0.4480855  -1.102092 NA -4.835156

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM