
R: Overlaying Poisson Distribution over a Histogram of Data

I have a dataset whose observations span a wide range (10,000 to around 21,000,000). I am trying to overlay a Poisson distribution on a histogram of this data, but the distribution is drawn incorrectly. This is the code I have tried so far:

dat <- read.csv('data.csv', TRUE, ',')

hist(dat,
     main = 'Global Sales of Games in 2010',
     xlab = 'Amount of Copies Sold',
     ylab = 'Counts',
     col = 'palegreen1',
     breaks = 100
)

lam = mean(dat)
t = seq(min(dat), max(dat), length.out = 10000)
lines(t, dpois(t, lambda = lam), col='red', lwd=3)

I have also tried generating data from a Poisson distribution using rpois, but I still run into the same problem.

simulated = rpois(length(dat), lam)
simulated_lam = mean(simulated)
a = seq(min(simulated), max(simulated), length.out = 10000)
hist(simulated)
lines(a, dpois(a, lambda = simulated_lam), col='red', lwd=3)

I have consulted the related question "R: Overlay Poisson distribution over histogram of data", but cannot reproduce its results.

I have images of the resulting output, but cannot post them because this is a new account. If anyone knows an alternative way of posting images, I would gladly follow up.

Thanks in advance.

Your code throws some warnings, because you are calling dpois(t, lambda = lam) with a t that contains non-integer values (you can see those warnings by typing warnings() in your console). By changing length.out = 10000 to by = 1, you force t to consist only of integers, assuming your dat contains only integers.
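To see why the non-integer grid matters, here is a minimal sketch with made-up values (lambda = 5 and the two grids are assumptions chosen only for illustration). dpois() returns 0 for any non-integer x, so a curve drawn over a fractional grid sits almost entirely on the x-axis.

```r
# Illustrative values only: lambda = 5 is an assumption, not taken from the question.
x_frac <- seq(0, 10, length.out = 8)   # grid made mostly of non-integer values
x_int  <- seq(0, 10, by = 1)           # grid made only of integers

# dpois() warns about each non-integer x and returns 0 for it;
# the warnings are suppressed here so the sketch runs quietly.
p_frac <- suppressWarnings(dpois(x_frac, lambda = 5))
p_int  <- dpois(x_int, lambda = 5)

print(p_frac)   # zeros at the fractional grid points
print(p_int)    # strictly positive probabilities at every integer
```

Printing the two vectors side by side shows the difference: the integer grid gives a usable probability at every point, while the fractional grid does not.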

Below is a working example in which dat is randomly generated. Note that I multiplied the dpois() result by the dataset size to go from probabilities to expected counts, so that the curve is on the same scale as the histogram bars.

dataset_size <- 100
dat <- rpois(dataset_size, lambda = 10)

hist(dat,
     main = 'Global Sales of Games in 2010',
     xlab = 'Amount of Copies Sold',
     ylab = 'Counts',
     col = 'palegreen1',
     breaks = 100
)

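# Estimate lambda from the data (the sample mean)
lam <- mean(dat)
# Evaluate the Poisson probabilities at integer values only
t <- seq(min(dat), max(dat), by = 1)
# Scale probabilities by the sample size to get expected counts
lines(t, dpois(t, lambda = lam)*dataset_size, col='red', lwd=3)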
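The scaling from probabilities to counts lines up exactly with the bars only when each bar covers a single integer. As a hedged variation on the example above (the seed and the names brks, k and expected are my own, not part of the original answer), the bins can be centered on the integers and the expected counts computed alongside them:

```r
set.seed(1)                          # assumed seed, only for reproducibility
n   <- 100
dat <- rpois(n, lambda = 10)

# One bin of width 1 per integer, centered on that integer
brks <- seq(min(dat) - 0.5, max(dat) + 0.5, by = 1)
h    <- hist(dat, breaks = brks, plot = FALSE)

# Expected number of observations at each integer under a Poisson fit
k        <- seq(min(dat), max(dat), by = 1)
expected <- dpois(k, lambda = mean(dat)) * n
```

Calling plot(h) followed by lines(k, expected, col = 'red', lwd = 3) then draws the curve through the bar centers, since h$mids coincides with k.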

