
Calculating the upper and lower limits of the boxplot statistics (i.e. the ends of the whiskers)

I'm trying to verify the upper and lower limits of the boxplot statistics (i.e. the ends of the whiskers) by comparing them to the formulas Q3+(1.5*IQR) and Q1-(1.5*IQR).

Each time I run the following code, it returns a small difference between the boxplot statistics and the values from the formulas.

Shouldn't these numbers be identical? Why the deviation?

# random normal distribution
df <- rnorm(500)
# convert to dataframe
df <- as.data.frame(df)
# boxplot statistics
s <- boxplot.stats(df$df)
s$stats
# Upper limit of whisker: Q3+(1.5*IQR)
s$stats[4]+(1.5*(IQR(df$df)))
# Lower limit of whisker: Q1-(1.5*IQR)
s$stats[2]-(1.5*(IQR(df$df)))

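The whiskers extend out to the most extreme data point that is at or inside Q3+(1.5*IQR) (and, for the lower whisker, at or inside Q1-(1.5*IQR)). In other words, go out to Q3+(1.5*IQR), then pull back until you hit a data point. The formula only gives the limit a whisker may reach; the whisker itself ends on an actual observation, so the two numbers match only if an observation happens to lie exactly on that limit.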

We can find those values with:

set.seed(42)
vec <- rnorm(500)
st <- boxplot.stats(vec)
st$stats
# [1] -2.46133548 -0.66263842 -0.03797064  0.63573211  2.45959355


###       ,--- data
###       |   ,--- that is at or inside
###       |  |      ,--- this number
###      ,-, v ,----^---------------------,
max(vec[ vec < st$stats[4]+(1.5*(IQR(vec))) ])
# [1] 2.459594

min(vec[ vec > st$stats[2]-(1.5*(IQR(vec))) ])
# [1] -2.461335
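As a cross-check, here is a minimal sketch that rebuilds the whisker ends following the documented behaviour of boxplot.stats(), which takes its box edges from Tukey's hinges (fivenum()) rather than from quantile(). The names coef, five, spread, upper_fence, lower_fence and whiskers are illustrative choices, not part of the original answer, and the hinge spread is assumed to be close to, but not necessarily identical to, IQR(vec).

```r
# Sketch: reproduce the whisker ends reported by boxplot.stats().
set.seed(42)
vec <- rnorm(500)

coef   <- 1.5                   # default whisker multiplier of boxplot.stats()
five   <- fivenum(vec)          # min, lower hinge, median, upper hinge, max
spread <- five[4] - five[2]     # hinge spread, used in place of IQR(vec)

# Limits from the formulas: how far a whisker is allowed to reach
upper_fence <- five[4] + coef * spread
lower_fence <- five[2] - coef * spread

# Whisker ends: the most extreme observations at or inside those limits
whiskers <- range(vec[vec >= lower_fence & vec <= upper_fence])

st <- boxplot.stats(vec)
all.equal(whiskers, st$stats[c(1, 5)])   # expected to be TRUE

# The gap between each limit and the whisker that stops short of it
c(lower = whiskers[1] - lower_fence, upper = upper_fence - whiskers[2])
```

The last line prints the deviation the question asks about: it is the distance from each formula limit back to the nearest observation inside it.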


 