MATLAB: combining and normalizing histograms with different sample sizes
I have four sets of data whose distributions I would like to show in a single MATLAB figure. My current code is:
[n1,x1]=hist([dataset1{:}]);
[n2,x2]=hist([dataset2{:}]);
[n3,x3]=hist([dataset3{:}]);
[n4,x4]=hist([dataset4{:}]);
bar(x1,n1,'hist');
hold on; h1=bar(x1,n1,'hist'); set(h1,'facecolor','g')
hold on; h2=bar(x2,n2,'hist'); set(h2,'facecolor','g')
hold on; h3=bar(x3,n3,'hist'); set(h3,'facecolor','g')
hold on; h4=bar(x4,n4,'hist'); set(h4,'facecolor','g')
hold off
My issue is that each group has a different sample size: dataset1 has n = 69, dataset2 has n = 23, and dataset3 and dataset4 each have n = 10. How do I normalize the distributions so that the four groups can be shown together?
Is there some way to, for example, divide the count in each bin by the sample size of that group?
You can normalize your histograms by dividing each one by its total number of elements:
[n1,x1] = histcounts(randn(69,1));
[n2,x2] = histcounts(randn(23,1));
[n3,x3] = histcounts(randn(10,1));
[n4,x4] = histcounts(randn(10,1));
hold on
bar(x4(1:end-1),n4./sum(n4),'histc');
bar(x3(1:end-1),n3./sum(n3),'histc');
bar(x2(1:end-1),n2./sum(n2),'histc');
bar(x1(1:end-1),n1./sum(n1),'histc');
hold off
ax = gca;
set(ax.Children,{'FaceColor'},mat2cell(lines(4),ones(4,1),3))
set(ax.Children,{'FaceAlpha'},repmat({0.7},4,1))
However, as you can see above, there are a few more things you can do to make your code simpler and shorter:
1. Call hold on only once.
2. Instead of collecting the bar handles, use the axes handle.
3. With the axes handle, set all properties in one command.
As a side note, it's better to use histcounts.
Here is the result: [figure: the four normalized histograms overlaid in one axes]
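As a possible shortcut for the histcounts note above: in releases that have histogram and histcounts (R2014b onward), both accept a 'Normalization' option, so the division can be left to MATLAB. This is a minimal sketch rather than part of the original answer; the randn data are stand-ins for the real datasets, and each histogram picks its own bins here.

```matlab
% Stand-in data with the same sample sizes as in the question
data = {randn(69,1), randn(23,1), randn(10,1), randn(10,1)};

figure
hold on
for k = 1:numel(data)
    % 'probability' makes each bar height equal to count / numel(data{k})
    histogram(data{k}, 'Normalization', 'probability', 'FaceAlpha', 0.7);
end
hold off

% The same normalization without plotting:
p1 = histcounts(data{1}, 'Normalization', 'probability');
```

With this option the bar heights of each group sum to 1, which is the same scaling as n./sum(n) in the code above.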
EDIT:
If you also want to plot the pdf line from histfit, you can save it first and then plot it normalized:
dataset = {randn(69,1),randn(23,1),randn(10,1),randn(10,1)};
fits = zeros(100,2,numel(dataset));
hold on
for k = numel(dataset):-1:1
total = numel(dataset{k}); % for normalizing
f = histfit(dataset{k}); % draw the histogram and fit
% collect the curve data and normalize it:
fits(:,:,k) = [f(2).XData; f(2).YData./total].';
x = f(1).XData; % collect the bar positions
n = f(1).YData; % collect the bar counts
f.delete % delete the histogram and the fit
bar(x,n./total,'hist'); % plot the normalized bars ('hist' centers them on x, which holds bin centers)
end
ax = gca; % get the axis handle
% set all color and transparency for the bars:
set(ax.Children,{'FaceColor'},mat2cell(lines(4),ones(4,1),3))
set(ax.Children,{'FaceAlpha'},repmat({0.7},4,1))
% plot all the curves:
plot(squeeze(fits(:,1,:)),squeeze(fits(:,2,:)),'LineWidth',3)
hold off
Again, this version introduces some further improvements to your code, such as keeping all the datasets in one cell array and processing them in a loop.
The new result is: [figure: the normalized histograms with their fitted curves]
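To see why dividing the fitted curve by the number of samples puts it on the same scale as the normalized bars, here is a small sketch that does not need histfit. It assumes, as far as I understand histfit, that the curve it draws is the fitted density multiplied by (number of samples × bin width). The normal density is written out by hand, and all variable names are illustrative.

```matlab
data  = randn(69,1);                  % stand-in data, as in the answer
total = numel(data);                  % sample size used for normalizing

[counts, edges] = histcounts(data);   % raw bin counts and bin edges
w = edges(2) - edges(1);              % bin width (uniform by default)
p = counts ./ total;                  % normalized bar heights, sum to 1

% Normal density with the sample mean and standard deviation (hand-written)
mu    = mean(data);
sigma = std(data);
xg    = linspace(edges(1), edges(end), 100);
pdfVals = exp(-(xg - mu).^2 ./ (2*sigma^2)) ./ (sigma*sqrt(2*pi));

curve = w .* pdfVals;                 % (total*w*pdf)/total, same scale as p

figure
bar(edges(1:end-1), p, 'histc');      % edges are left bin edges here
hold on
plot(xg, curve, 'LineWidth', 3);
hold off
```

Because every group is divided by its own sample size, the bars and curves of groups with 69, 23 and 10 samples end up on a comparable vertical scale.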