Matlab plot in histogram

Assume y is a vector with random numbers following the distribution f(x)=sqrt(4-x^2)/(2*pi). At the moment I use the command hist(y,30). How can I plot the distribution function f(x)=sqrt(4-x^2)/(2*pi) into the same histogram?

Let's take an example of another distribution function, the standard normal. To do exactly what you say you want, you do this:

nRand = 10000;
y = randn(1,nRand);
[myHist, bins] = hist(y,30);
pdf = normpdf(bins);
figure, bar(bins, myHist,1); hold on; plot(bins,pdf,'rx-'); hold off;

This is probably NOT what you actually want, though. Why? You'll notice that your density function looks like a thin line at the bottom of your histogram plot. This is because a histogram holds counts of numbers in bins, while a density function is normalized to integrate to one. If you have hundreds of items in a bin, the density function cannot match that in scale, so you have a scaling or normalization problem. Either you have to normalize the histogram, or plot a scaled distribution function. I prefer to scale the distribution function so that the counts still make sense when I look at the histogram:

normalizedpdf = pdf/sum(pdf)*sum(myHist);
figure, bar(bins, myHist,1); hold on; plot(bins,normalizedpdf,'rx-'); hold off;
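
The other option mentioned above, normalizing the histogram instead of scaling the density, might look roughly like the following. This is only a sketch: the names counts, centers, binWidth, density and stdNormal are made up for illustration, and the standard normal density is written out by hand so that the snippet does not depend on normpdf.

```matlab
nRand = 10000;
y = randn(1, nRand);                           % standard normal samples
[counts, centers] = hist(y, 30);               % bin counts and bin centers
binWidth = centers(2) - centers(1);            % bins are equally spaced
density = counts / (sum(counts) * binWidth);   % bar heights now enclose an area of 1
stdNormal = @(x) exp(-x.^2/2) / sqrt(2*pi);    % hand-written standard normal pdf
figure, bar(centers, density, 1); hold on;
plot(centers, stdNormal(centers), 'r-'); hold off;
```

With this variant the vertical axis shows a density rather than counts, so the unscaled pdf can be drawn directly on top of the bars.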

Your case is the same, except you'll use the function f(x) you specified instead of the normpdf command.
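
As a rough sketch of that substitution, the same numerical scaling applied to the f(x) from the question could look like this. The sample y is generated here only so the snippet runs on its own (it uses the x-coordinate of points drawn uniformly from a disc of radius 2, which has this density); in practice you would use your own y. The names f, counts, centers, fc and scaled are illustrative.

```matlab
n = 10000;
% Assumption: stand-in data. The x-coordinate of a uniform point in a disc
% of radius 2 has density sqrt(4-x^2)/(2*pi).
y = 2*sqrt(rand(1,n)) .* cos(2*pi*rand(1,n));

f = @(x) sqrt(max(4 - x.^2, 0)) / (2*pi);   % max(...,0) guards against tiny negative round-off
[counts, centers] = hist(y, 30);
fc = f(centers);
scaled = fc / sum(fc) * sum(counts);        % same scaling as for normpdf above
figure, bar(centers, counts, 1); hold on;
plot(centers, scaled, 'rx-'); hold off;
```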

Instead of normalizing numerically, you could also do it by finding a theoretical scaling factor as follows.

nbins = 30;
nsamples = numel(y);
binsize = (max(y)-min(y)) / nbins;       % width of one bin
hist(y,nbins)
hold on
x1 = linspace(min(y),max(y),100);
scalefactor = nsamples * binsize;        % area under the histogram
y1 = scalefactor * sqrt(4-x1.^2)/(2*pi); % element-wise power on the plotting grid
plot(x1,y1)
hold off

Update: How it works.

The pdf (call it f(x)) integrates to one over its whole domain, and for any dataset that is large enough the samples cover essentially all of that domain, so the integral of f(x) over the range of the data will be approximately unity. The area under a histogram with equal-width bins, however, is exactly equal to the total number of samples times the bin width.

So a very simple scale factor to bring the pdf into line with the histogram is Ns*Wb, the total number of sample points times the width of the bins.
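
A quick numerical check of this claim, as a sketch: the variable names Ns, Wb and histArea are chosen here to mirror the text, and any sample would do in place of the randn data.

```matlab
y = randn(1, 5000);                  % assumption: any sample works for this check
nbins = 30;
[counts, centers] = hist(y, nbins);
Ns = numel(y);                       % total number of sample points
Wb = centers(2) - centers(1);        % bin width from adjacent bin centers
histArea = sum(counts) * Wb;         % area of all bars added together
fprintf('histogram area = %g, Ns*Wb = %g\n', histArea, Ns*Wb);
```

Because hist with a scalar bin count spreads equal-width bins between min(y) and max(y), Wb here should also agree with (max(y)-min(y))/nbins.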

Let me add another example to the mix:

%# some normally distributed random data
data = randn(1e3,1);

%# histogram
numbins = 30;
hist(data, numbins);
h(1) = get(gca,'Children');
set(h(1), 'FaceColor',[.8 .8 1])

%# figure out how to scale the pdf (with area = 1), to the area of the histogram
[bincounts,binpos] = hist(data, numbins);
binwidth = binpos(2) - binpos(1);
histarea = binwidth*sum(bincounts);

%# fit a gaussian
[muhat,sigmahat] = normfit(data);
x = linspace(binpos(1),binpos(end),100);
y = normpdf(x, muhat, sigmahat);
h(2) = line(x, y*histarea, 'Color','b', 'LineWidth',2);

%# kernel estimator
[f,x,u] = ksdensity( data );
h(3) = line(x, f*histarea, 'Color','r', 'LineWidth',2);

legend(h, {'freq hist','fitted Gaussian','kernel estimator'})
