R: Divide rows by row totals using dplyr

I have seen a variety of posts that explain how to do something similar, but I have yet to find one that divides all rows by a reference row while leaving the reference row itself unchanged.

Here is an example set of data with required packages:

library(tidyverse)
library(janitor)

d <- tibble(
  level = as.factor(c(1:10)),
  var_1 = sample(c(1:20), 10),
  var_2 = sample(c(1:30), 10),
  var_3 = sample(c(1:40), 10),
  var_4 = sample(c(1:50), 10),
)

In the following code, I am dividing each row by the Total row generated by adorn_totals() :

d %>%
  adorn_totals("row") %>%
  mutate_at(vars(-level), ~ round(. / .[11] * 100, 2))  # formula syntax; funs() is deprecated since dplyr 0.8.0

Here is the output:

level  var_1  var_2 var_3  var_4
    1   3.66  13.89   6.0   6.50
    2  10.98  11.11   0.5   8.94
    3   4.88   7.64  14.0  15.45
    4   6.10  18.06  16.0   7.72
    5  18.29  13.19  10.0   9.35
    6  14.63  10.42  11.5   3.25
    7   2.44   6.25  12.5  19.51
    8   8.54  11.81  13.5   4.07
    9  23.17   3.47   1.0  20.33
   10   7.32   4.17  15.0   4.88
Total 100.00 100.00 100.0 100.00

But I want to calculate these proportions of the total without changing the Total row itself. The desired output is below: the Total row keeps the original column totals, while the remaining rows have been converted to percentages by my function.

level  var_1  var_2 var_3  var_4
    1   3.66  13.89   6.0   6.50
    2  10.98  11.11   0.5   8.94
    3   4.88   7.64  14.0  15.45
    4   6.10  18.06  16.0   7.72
    5  18.29  13.19  10.0   9.35
    6  14.63  10.42  11.5   3.25
    7   2.44   6.25  12.5  19.51
    8   8.54  11.81  13.5   4.07
    9  23.17   3.47   1.0  20.33
   10   7.32   4.17  15.0   4.88
Total     82    144   200    246

Thanks for your help!

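We can use replace() here. n() gives the index of the last row, so .[n()] is the total and .[-n()] drops the last row from the calculation. The index argument of replace() (named list) can be either a logical or a numeric vector; here the logical vector row_number() < n() selects every row except the last, so only those rows are overwritten.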

library(dplyr)
library(janitor)
d %>%
   adorn_totals("row") %>%
   mutate_at(vars(-level), list(~replace(., row_number() < n(),
                   round(.[-n()]/.[n()]*100, 2))))
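
For newer dplyr versions (1.0.0 and later), where mutate_at() is superseded, the same idea can probably be written with across(). The following is only a sketch under that assumption; d_demo and its values are made up for illustration so that the result is reproducible, and they are not the data from the question.

```r
library(dplyr)
library(janitor)

# Hypothetical fixed data (not the random data from the question)
d_demo <- tibble(
  level = as.factor(1:3),
  var_1 = c(10, 30, 60),
  var_2 = c(5, 5, 10)
)

res <- d_demo %>%
  adorn_totals("row") %>%
  # Overwrite every row except the last with its percentage of the last row
  mutate(across(-level, ~ replace(.x, row_number() < n(),
                                  round(.x[-n()] / .x[n()] * 100, 2))))

res
```

The last row still holds the column sums, while the rows above it hold percentages of those sums.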

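It seems like adorn_percentages() does the same thing as your custom function. Compute the percentages first, then bind the Total row (row 11 of the adorn_totals() result) underneath so that it stays untouched.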

d %>% 
  adorn_percentages("col") %>% 
  mutate_at(vars(-level), ~round(.*100,2)) %>% 
  bind_rows(
    d %>% adorn_totals("row") %>% slice(11)
  )
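
If hard-coding row 11 or depending on janitor is undesirable, a similar result can be built with dplyr alone by computing the percentages and the totals separately and then stacking them. This is a sketch of an alternative, not the answerer's method; d_demo, pct and tot are hypothetical names with made-up values.

```r
library(dplyr)

# Hypothetical fixed data; level is character so that "Total" can be bound below it
d_demo <- tibble(
  level = as.character(1:3),
  var_1 = c(10, 30, 60),
  var_2 = c(5, 5, 10)
)

# Each value as a percentage of its column sum
pct <- d_demo %>%
  mutate(across(-level, ~ round(.x / sum(.x) * 100, 2)))

# One-row summary holding the raw column sums
tot <- d_demo %>%
  summarise(across(-level, sum)) %>%
  mutate(level = "Total")

# bind_rows() matches columns by name, so the column order of tot does not matter
res2 <- bind_rows(pct, tot)

res2
```

Because the totals are computed from the original data rather than sliced by position, this works for any number of rows.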
