i would like to investigate a naive assumption by calculating the percent difference between cells in a three dimensional array with apply() in R. However, not getting the correct solution.
Located below is the percent difference that i would like to apply to each cell in the following HairEyeColor dataset in the "datasets" package.
the general expression for the the percent difference is such.
100%*(1-Female/Male)
library
library(vcd)
library(DescTools)
data("HairEyeColor",package="datasets") # load data
str(HairEyeColor) #show structure
a <- HairEyeColor[,,1] # male table
b <- HairEyeColor[,,2] # female table
bind the arrays together 1 for males and 2 for females.
HEC1=Abind(Male=a, Female=b, along=3)
output
, , Male
Brown Blue Hazel Green
Black 32 11 10 3
Brown 53 50 25 15
Red 10 10 7 7
Blond 3 30 5 8
, , Female
Brown Blue Hazel Green
Black 36 9 5 2
Brown 66 34 29 14
Red 16 7 7 7
Blond 4 64 5 8
A Contributor's note: The prop.table() command yields the percentages, calculated using the total cases as 100%. For conditional relative frequencies use the margin argument. Using apply() then allows us to perform any calculation along any dimension or combination of dimensions in an array (or matrix).
apply(prop.table(HEC1), c(1,2), diff)
here is the output, however it is not the precent difference.
Brown Blue Hazel Green
Black 0.006756757 -0.003378378 -0.008445946 -0.001689189
Brown 0.021959459 -0.027027027 0.006756757 -0.001689189
Red 0.010135135 -0.005067568 0.000000000 0.000000000
Blond 0.001689189 0.057432432 0.000000000 0.000000000
i also tried this, with no success.
apply(prop.table(HEC1), c(1,1,2), diff)
here's is what one would expect for the percent difference between male vs female black and brown colors. It would be
(HEC1[1,1,1] - HEC1[1,1,2])*100/HEC1[1,1,1]
output
-12.5
desired table
Brown Blue Hazel Green
Black -12.5 18.1 50 33.3
Brown -24.5 32 -16 6.6
Red -60 3 0 0
Blond -33.3 -113.3 0 0
Any suggestions are helpful. Thanks
i researched this question ,however it is refering to time series in long format How to calculate percent differences in a table in R
You should always provide reproducible data. Printouts can leave out important details regarding the structure. Use dput(HEC1)
and copy the results into your question like this:
HEC1 <- structure(c(32L, 53L, 10L, 3L, 11L, 50L, 10L, 30L, 10L, 25L,
7L, 5L, 3L, 15L, 7L, 8L, 36L, 66L, 16L, 4L, 9L, 34L, 7L, 64L,
5L, 29L, 7L, 5L, 2L, 14L, 7L, 8L), .Dim = c(4L, 4L, 2L), .Dimnames = list(
c("Black", "Brown", "Red", "Blond"), c("Brown", "Blue", "Hazel",
"Green"), c("Male", "Female")))
Now we need to be clear on what you are computing. Your desired percentage was based on the total number of brown-eyed, black haired individuals, but this is misleading since the sample sizes of males and females is not the same:
margin.table(HEC1, 3)
Male Female
279 313
If instead we compute the tables separately and subtract them:
Male.pct <- prop.table(HEC1[, , 1]) * 100
round(Male.pct, 2)
# Brown Blue Hazel Green
# Black 11.47 3.94 3.58 1.08
# Brown 19.00 17.92 8.96 5.38
# Red 3.58 3.58 2.51 2.51
# Blond 1.08 10.75 1.79 2.87
Female.pct <- prop.table(HEC1[, , 2]) * 100
round(Female.pct, 2)
# Brown Blue Hazel Green
# Black 11.50 2.88 1.60 0.64
# Brown 21.09 10.86 9.27 4.47
# Red 5.11 2.24 2.24 2.24
# Blond 1.28 20.45 1.60 2.56
round(Male.pct - Female.pct, 2)
# Brown Blue Hazel Green
# Black -0.03 1.07 1.99 0.44
# Brown -2.09 7.06 -0.30 0.90
# Red -1.53 1.35 0.27 0.27
# Blond -0.20 -9.69 0.19 0.31
Your original apply
function computed the proportions a third way, over the entire array combining both tables. You should have used:
apply(prop.table(HEC1, 3), 1:2, diff) * 100
# Brown Blue Hazel Green
# Black 0.03206339 -1.067253 -1.9867853 -0.4362912
# Brown 2.08984621 -7.058527 0.3046022 -0.9035006
# Red 1.52759170 -1.347808 -0.2725388 -0.2725388
# Blond 0.20268645 9.694596 -0.1946706 -0.3114730
The diff
function subtracts the second value from the first (Female - Male) so the negative signs are reversed compared to Male - Female above.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.