简体   繁体   中英

How to Calculate the percent difference in a three dimensional array using R apply()

i would like to investigate a naive assumption by calculating the percent difference between cells in a three dimensional array with apply() in R. However, not getting the correct solution.

Located below is the percent difference that i would like to apply to each cell in the following HairEyeColor dataset in the "datasets" package.

the general expression for the the percent difference is such.

100%*(1-Female/Male)

library

library(vcd)
library(DescTools)

data("HairEyeColor",package="datasets") # load data

str(HairEyeColor) #show structure


a <- HairEyeColor[,,1] # male table
b <- HairEyeColor[,,2] # female table

bind the arrays together 1 for males and 2 for females.

HEC1=Abind(Male=a, Female=b, along=3)

output

, , Male

        Brown Blue Hazel Green
Black    32   11    10     3
Brown    53   50    25    15
Red      10   10     7     7
Blond     3   30     5     8

, , Female

        Brown Blue Hazel Green
Black    36    9     5     2
Brown    66   34    29    14
Red      16    7     7     7
Blond     4   64     5     8

A Contributor's note: The prop.table() command yields the percentages, calculated using the total cases as 100%. For conditional relative frequencies use the margin argument. Using apply() then allows us to perform any calculation along any dimension or combination of dimensions in an array (or matrix).

apply(prop.table(HEC1), c(1,2), diff)

here is the output, however it is not the precent difference.

         Brown         Blue        Hazel        Green
Black 0.006756757 -0.003378378 -0.008445946 -0.001689189
Brown 0.021959459 -0.027027027  0.006756757 -0.001689189
Red   0.010135135 -0.005067568  0.000000000  0.000000000
Blond 0.001689189  0.057432432  0.000000000  0.000000000

i also tried this, with no success.

apply(prop.table(HEC1), c(1,1,2), diff)

here's is what one would expect for the percent difference between male vs female black and brown colors. It would be

(HEC1[1,1,1] - HEC1[1,1,2])*100/HEC1[1,1,1]

output

-12.5

desired table

       Brown    Blue       Hazel    Green
Black   -12.5    18.1        50      33.3
Brown   -24.5    32        -16     6.6
Red   -60     3           0       0
Blond   -33.3   -113.3      0         0

Any suggestions are helpful. Thanks

i researched this question ,however it is refering to time series in long format How to calculate percent differences in a table in R

You should always provide reproducible data. Printouts can leave out important details regarding the structure. Use dput(HEC1) and copy the results into your question like this:

HEC1 <- structure(c(32L, 53L, 10L, 3L, 11L, 50L, 10L, 30L, 10L, 25L, 
7L, 5L, 3L, 15L, 7L, 8L, 36L, 66L, 16L, 4L, 9L, 34L, 7L, 64L, 
5L, 29L, 7L, 5L, 2L, 14L, 7L, 8L), .Dim = c(4L, 4L, 2L), .Dimnames = list(
    c("Black", "Brown", "Red", "Blond"), c("Brown", "Blue", "Hazel", 
    "Green"), c("Male", "Female")))

Now we need to be clear on what you are computing. Your desired percentage was based on the total number of brown-eyed, black haired individuals, but this is misleading since the sample sizes of males and females is not the same:

margin.table(HEC1, 3)
  Male Female 
   279    313 

If instead we compute the tables separately and subtract them:

Male.pct <- prop.table(HEC1[, , 1]) * 100
round(Male.pct, 2)
#       Brown  Blue Hazel Green
# Black 11.47  3.94  3.58  1.08
# Brown 19.00 17.92  8.96  5.38
# Red    3.58  3.58  2.51  2.51
# Blond  1.08 10.75  1.79  2.87

Female.pct <- prop.table(HEC1[, , 2]) * 100
round(Female.pct, 2)
#       Brown  Blue Hazel Green
# Black 11.50  2.88  1.60  0.64
# Brown 21.09 10.86  9.27  4.47
# Red    5.11  2.24  2.24  2.24
# Blond  1.28 20.45  1.60  2.56

round(Male.pct - Female.pct, 2)
#       Brown  Blue Hazel Green
# Black -0.03  1.07  1.99  0.44
# Brown -2.09  7.06 -0.30  0.90
# Red   -1.53  1.35  0.27  0.27
# Blond -0.20 -9.69  0.19  0.31

Your original apply function computed the proportions a third way, over the entire array combining both tables. You should have used:

apply(prop.table(HEC1, 3), 1:2, diff) * 100
#            Brown      Blue      Hazel      Green
# Black 0.03206339 -1.067253 -1.9867853 -0.4362912
# Brown 2.08984621 -7.058527  0.3046022 -0.9035006
# Red   1.52759170 -1.347808 -0.2725388 -0.2725388
# Blond 0.20268645  9.694596 -0.1946706 -0.3114730 

The diff function subtracts the second value from the first (Female - Male) so the negative signs are reversed compared to Male - Female above.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM