density/frequency and probability in hist()

I have used the code

hist(x, probability=TRUE)

which gives me a y-axis from 0 to 2 labelled "Density". I don't understand what this means. Does it integrate to 1, sum to 1, or what does the y-value represent? The documentation says "freq = NULL, probability = !freq", but that does not make sense to me. If I don't use probability=TRUE I get "Frequency" on the y-axis, but the shape of the plot is the same.

You can save your histogram to a variable and take a look at it.

x <- rnorm(1000)  # 1000 draws from a standard normal
h <- hist(x)      # draws the plot and returns the histogram object

h

$breaks
 [1] -3.5 -3.0 -2.5 -2.0 -1.5 -1.0 -0.5  0.0  0.5  1.0  1.5  2.0  2.5  3.0  3.5  4.0

$counts
 [1]   2   8  24  42  87 169 188 189 146  78  38  23   5   0   1

$density
 [1] 0.004 0.016 0.048 0.084 0.174 0.338 0.376 0.378 0.292 0.156 0.076 0.046 0.010 0.000 0.002

$mids
 [1] -3.25 -2.75 -2.25 -1.75 -1.25 -0.75 -0.25  0.25  0.75  1.25  1.75  2.25  2.75  3.25  3.75

$xname
[1] "x"

$equidist
[1] TRUE

attr(,"class")
[1] "histogram"

By default, hist() plots frequencies (available as h$counts), that is, the number of points that fall within each interval. The total number of points equals the length of the vector, which you can check with

sum(h$counts)
[1] 1000
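
As a small aside, dividing the counts by that total gives the share of points in each bin, and those shares do sum to 1. A minimal sketch, using the counts printed above (the name `prop` is only for illustration and is not part of the histogram object):

```r
# Counts copied from the printed h$counts above
counts <- c(2, 8, 24, 42, 87, 169, 188, 189, 146, 78, 38, 23, 5, 0, 1)

# Share of all points that fall in each bin
prop <- counts / sum(counts)

sum(prop)  # the shares add up to 1
```

These per-bin proportions are not the values hist() puts on the y-axis with probability=TRUE.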

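If you specify probability=TRUE, it plots the density (h$density) instead: the proportion of points in each interval divided by the interval width. The bar heights therefore do not sum to 1; the bar areas (height times width) do, which is why the y-axis can go above 1 when the bars are narrow. With equal-width bins the densities are just the counts rescaled by a constant, so the shape of the plot does not change. In our case the bar width is 0.5, so we get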

sum(h$density*0.5)
[1] 1
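
To make the relationship explicit, here is a minimal sketch that rebuilds the density values by hand from the counts and breaks printed above. The variable names (`widths`, `n`, `dens`) are my own, and the formula assumed is density = count / (n * bin width), which matches the printed h$density:

```r
# Counts and breaks copied from the printed histogram object above
counts <- c(2, 8, 24, 42, 87, 169, 188, 189, 146, 78, 38, 23, 5, 0, 1)
breaks <- seq(-3.5, 4.0, by = 0.5)

widths <- diff(breaks)   # width of each bin (all 0.5 here)
n      <- sum(counts)    # total number of points

# Proportion of points per bin, divided by the bin width
dens <- counts / (n * widths)

head(dens, 3)            # 0.004 0.016 0.048, same as h$density
sum(dens * widths)       # total area of the bars: 1
```

Using diff(breaks) rather than a hard-coded 0.5 keeps the area check valid if the bins are not all the same width.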
