
R - hist plot colours by quantile

I am trying to draw a simple histogram with hist and colour the bins by quantile.

I was wondering why the colours get all messed up when the bin size changes. Maybe I am not doing it right from the beginning.

The quantiles are

quantile(x)
    0%    25%    50%    75%   100% 
    0.00  33.75  58.00  78.25 123.00 

Then I set the colours using the quantile values

k = ifelse(test = x <= 34, yes = "#8DD3C7", 
no = ifelse(test = (x > 34 & x <= 58), yes =  "#FFFFB3", 
no = ifelse(test = (x > 58 & x <= 79), yes = "#BEBADA", 
no = ifelse(test = (x > 79), yes = "#FB8072", 'grey'))) )                      

Then when I plot with larger bins, I get:

hist(x, breaks = 10, col = k)

[image: histogram with larger bins]

Which seems right, even though the last bin is wrong (?!).

But when I try with smaller bins, the colours are not right.

[image: histogram with smaller bins]

Could someone help me understand why it is wrong? Or whether my code is wrong?

The x in question

x = c(23, 23, 16, 16, 34, 34, 43, 43, 97, 97, 63, 63, 39, 39, 29, 
29, 63, 63, 48, 48, 7, 7, 80, 80, 69, 69, 110, 110, 103, 103, 
43, 43, 39, 39, 46, 46, 14, 14, 56, 56, 76, 76, 52, 52, 18, 18, 
32, 32, 66, 66, 70, 70, 26, 26, 40, 40, 105, 105, 62, 62, 51, 
51, 58, 58, 37, 37, 55, 55, 42, 42, 11, 11, 89, 89, 55, 55, 109, 
109, 49, 49, 27, 27, 96, 96, 27, 27, 65, 65, 74, 74, 17, 17, 
33, 33, 89, 89, 63, 63, 18, 18, 25, 25, 36, 36, 108, 108, 3, 
3, 52, 52, 83, 83, 74, 74, 56, 56, 99, 99, 6, 6, 25, 25, 51, 
51, 4, 4, 100, 100, 17, 17, 44, 44, 23, 23, 70, 70, 85, 85, 14, 
14, 22, 22, 89, 89, 45, 45, 2, 2, 29, 29, 14, 14, 69, 69, 96, 
96, 10, 10, 58, 58, 97, 97, 54, 54, 60, 60, 65, 65, 2, 2, 54, 
54, 4, 4, 28, 28, 107, 107, 74, 74, 72, 72, 71, 71, 42, 42, 92, 
92, 64, 64, 39, 39, 111, 111, 72, 72, 73, 73, 58, 58, 41, 41, 
56, 56, 73, 73, 18, 18, 73, 73, 36, 36, 60, 60, 49, 49, 47, 47, 
95, 95, 19, 19, 8, 8, 7, 7, 38, 38, 38, 38, 38, 38, 28, 28, 79, 
79, 53, 53, 30, 30, 19, 19, 14, 14, 53, 53, 68, 68, 39, 39, 42, 
42, 87, 87, 33, 33, 18, 18, 77, 77, 83, 83, 19, 19, 14, 14, 7, 
7, 32, 32, 94, 94, 30, 30, 55, 55, 89, 89, 30, 30, 45, 45, 84, 
84, 38, 38, 59, 59, 73, 73, 77, 77, 22, 22, 55, 55, 31, 31, 52, 
52, 20, 20, 26, 26, 62, 62, 55, 55, 46, 46, 26, 26, 49, 49, 22, 
22, 65, 65, 67, 67, 73, 73, 29, 29, 88, 88, 86, 86, 76, 76, 32, 
32, 12, 12, 19, 19, 14, 14, 8, 8, 63, 63, 63, 63, 65, 65, 84, 
84, 34, 34, 42, 42, 26, 26, 75, 75, 68, 68, 28, 28, 95, 95, 17, 
17, 76, 76, 33, 33, 91, 91, 93, 93, 80, 80, 89, 89, 64, 64, 81, 
81, 98, 98, 47, 47, 70, 70, 46, 46, 11, 11, 92, 92, 69, 69, 95, 
95, 51, 51, 87, 87, 61, 61, 50, 50, 47, 47, 35, 35, 31, 31, 39, 
39, 19, 19, 81, 81, 35, 35, 68, 68, 68, 68, 67, 67, 57, 57, 7, 
7, 9, 9, 23, 23, 50, 50, 89, 89, 41, 41, 54, 54, 53, 53, 57, 
57, 89, 89, 32, 32, 40, 40, 48, 48, 35, 35, 15, 15, 90, 90, 1, 
1, 17, 17, 53, 53, 73, 73, 76, 76, 59, 59, 45, 45, 68, 68, 21, 
21, 37, 37, 33, 33, 51, 51, 61, 61, 31, 31, 15, 15, 23, 23, 29, 
29, 45, 45, 96, 96, 87, 87, 37, 37, 104, 104, 50, 50, 58, 58, 
103, 103, 91, 91, 72, 72, 73, 73, 27, 27, 60, 60, 23, 23, 99, 
99, 28, 28, 78, 78, 27, 27, 82, 82, 63, 63, 34, 34, 84, 84, 62, 
62, 2, 2, 99, 99, 22, 22, 85, 85, 39, 39, 47, 47, 66, 66, 17, 
17, 74, 74, 45, 45, 70, 70, 87, 87, 28, 28, 97, 97, 89, 89, 33, 
33, 50, 50, 79, 79, 86, 86, 69, 69, 91, 91, 75, 75, 52, 52, 76, 
76, 13, 13, 71, 71, 42, 42, 20, 20, 28, 28, 56, 56, 69, 69, 16, 
16, 47, 47, 60, 60, 45, 45, 72, 72, 78, 78, 107, 107, 4, 4, 64, 
64, 88, 88, 9, 9, 3, 3, 10, 10, 92, 92, 41, 41, 5, 5, 35, 35, 
31, 31, 24, 24, 70, 70, 47, 47, 41, 41, 32, 32, 92, 92, 90, 90, 
75, 75, 3, 3, 78, 78, 30, 30, 93, 93, 60, 60, 17, 17, 25, 25, 
48, 48, 70, 70, 69, 69, 66, 66, 76, 76, 104, 104, 31, 31, 72, 
72, 56, 56, 64, 64, 92, 92, 68, 68, 102, 102, 100, 100, 27, 27, 
40, 40, 47, 47, 29, 29, 76, 76, 78, 78, 20, 20, 13, 13, 10, 10, 
113, 113, 17, 17, 61, 61, 69, 69, 65, 65, 16, 16, 100, 100, 5, 
5, 18, 18, 24, 24, 54, 54, 41, 41, 64, 64, 66, 66, 90, 90, 29, 
29, 97, 97, 37, 37, 42, 42, 84, 84, 37, 37, 74, 74, 65, 65, 12, 
12, 49, 49, 31, 31, 108, 108, 9, 9, 93, 93, 71, 71, 39, 39, 70, 
70, 79, 79, 92, 92, 60, 60, 104, 104, 79, 79, 103, 103, 38, 38, 
93, 93, 46, 46, 66, 66, 79, 79, 51, 51, 31, 31, 65, 65, 93, 93, 
25, 25, 22, 22, 91, 91, 123, 123, 51, 51, 34, 34, 64, 64, 31, 
31, 24, 24, 74, 74, 57, 57, 95, 95, 83, 83, 28, 28, 56, 56, 72, 
72, 43, 43, 18, 18, 66, 66, 32, 32, 17, 17, 67, 67, 10, 10, 44, 
44, 66, 66, 57, 57, 89, 89, 57, 57, 55, 55, 18, 18, 78, 78, 82, 
82, 103, 103, 110, 110, 92, 92, 54, 54, 35, 35, 8, 8, 53, 53, 
86, 86, 45, 45, 99, 99, 19, 19, 84, 84, 94, 94, 92, 92, 80, 80, 
69, 69, 45, 45, 22, 22, 59, 59, 9, 9, 41, 41, 72, 72, 24, 24, 
117, 117, 79, 79, 57, 57, 29, 29, 96, 96, 47, 47, 23, 23, 64, 
64, 33, 33, 48, 48, 80, 80, 30, 30, 42, 42, 10, 10, 42, 42, 68, 
68, 46, 46, 58, 58, 39, 39, 82, 82, 79, 79, 80, 80, 89, 89, 85, 
85, 24, 24, 106, 106, 40, 40, 90, 90, 69, 69, 92, 92, 84, 84, 
82, 82, 86, 86, 80, 80, 73, 73, 78, 78, 39, 39, 27, 27, 55, 55, 
100, 100, 63, 63, 21, 21, 46, 46, 94, 94, 6, 6, 45, 45, 66, 66, 
94, 94, 52, 52, 78, 78, 59, 59, 86, 86, 67, 67, 76, 76, 54, 54, 
47, 47, 37, 37, 76, 76, 32, 32, 49, 49, 87, 87, 122, 122, 27, 
27, 82, 82, 51, 51, 50, 50, 22, 22, 32, 32, 99, 99, 77, 77, 54, 
54, 29, 29, 82, 82, 80, 80, 85, 85, 30, 30, 57, 57, 41, 41, 50, 
50, 65, 65, 51, 51, 109, 109, 89, 89, 50, 50, 6, 6, 66, 66, 42, 
42, 48, 48, 88, 88, 67, 67, 89, 89, 109, 109, 80, 80, 64, 64, 
64, 64, 95, 95, 76, 76, 76, 76, 78, 78, 44, 44, 51, 51, 19, 19, 
29, 29, 31, 31, 75, 75, 11, 11, 10, 10, 64, 64, 80, 80, 29, 29, 
73, 73, 67, 67, 38, 38, 27, 27, 23, 23, 74, 74, 79, 79, 49, 49, 
78, 78, 29, 29, 59, 59, 70, 70, 8, 8, 24, 24, 39, 39, 80, 80, 
27, 27, 29, 29, 36, 36, 94, 94, 86, 86, 35, 35, 84, 84, 99, 99, 
83, 83, 92, 92, 81, 81, 58, 58, 2, 2, 64, 64, 75, 75, 29, 29, 
53, 53, 58, 58, 11, 11, 38, 38, 83, 83, 108, 108, 86, 86, 56, 
56, 12, 12, 84, 84, 76, 76, 38, 38, 54, 54, 37, 37, 27, 27, 61, 
61, 83, 83, 37, 37, 59, 59, 81, 81, 76, 76, 70, 70, 61, 61, 101, 
101, 77, 77, 68, 68, 74, 74, 83, 83, 70, 70, 93, 93, 53, 53, 
64, 64, 89, 89, 1, 1, 53, 53, 67, 67, 81, 81, 71, 71, 51, 51, 
85, 85, 35, 35, 67, 67, 53, 53, 37, 37, 31, 31, 65, 65, 82, 82, 
47, 47, 60, 60, 81, 81, 21, 21, 94, 94, 75, 75, 92, 92, 113, 
113, 93, 93, 84, 84, 77, 77, 82, 82, 84, 84, 58, 58, 83, 83, 
84, 84, 80, 80, 1, 1, 49, 49, 73, 73, 22, 22, 99, 99, 74, 74, 
28, 28, 33, 33, 74, 74, 91, 91, 83, 83, 70, 70, 99, 99, 69, 69, 
38, 38, 68, 68, 47, 47, 61, 61, 47, 47, 70, 70, 85, 85, 20, 20, 
100, 100, 3, 3, 49, 49, 100, 100, 85, 85, 54, 54, 8, 8, 3, 3, 
47, 47, 46, 46, 45, 45, 27, 27, 87, 87, 20, 20, 24, 24, 51, 51, 
50, 50, 105, 105, 73, 73, 13, 13, 18, 18, 51, 51, 75, 75, 55, 
55, 62, 62, 85, 85, 56, 56, 51, 51, 66, 66, 74, 74, 63, 63, 2, 
2, 81, 81, 85, 85, 19, 19, 16, 16, 83, 83, 36, 36, 79, 79, 63, 
63, 41, 41, 45, 45, 76, 76, 62, 62, 67, 67, 74, 74, 92, 92, 47, 
47, 41, 41, 80, 80, 57, 57, 100, 100, 66, 66, 58, 58, 65, 65, 
59, 59, 20, 20, 54, 54, 10, 10, 79, 79, 64, 64, 106, 106, 44, 
44, 28, 28, 41, 41, 49, 49, 80, 80, 61, 61, 20, 20, 75, 75, 59, 
59, 93, 93, 32, 32, 38, 38, 30, 30, 41, 41, 8, 8, 8, 8, 54, 54, 
56, 56, 83, 83, 81, 81, 77, 77, 42, 42, 59, 59, 11, 11, 21, 21, 
77, 77, 84, 84, 86, 86, 84, 84, 34, 34, 48, 48, 80, 80, 92, 92, 
18, 18, 66, 66, 40, 40, 45, 45, 60, 60, 80, 80, 2, 2, 5, 5, 84, 
84, 66, 66, 70, 70, 70, 70, 95, 95, 62, 62, 0, 0, 67, 67, 61, 
61, 71, 71, 73, 73, 82, 82, 45, 45, 54, 54, 43, 43)

It is because you misunderstand the col argument of hist.

The col argument is a vector where col[i] is the colour of the i-th bar of the histogram.

Your k vector has one element per element of x, which is many more than the number of bars in the histogram.

In the first case, only the first 13 elements of k are used to colour the bars (in that order), since there are only 13 bars. In the second case, the first n elements of k are used to colour the bars, where n is the number of bars (notice how the first 13 bars of the small-bin histogram have the same colours as the 13 bars of the first histogram).
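The length mismatch is easy to check. The following is a minimal sketch on made-up data (x_demo, k_demo and h_demo are illustrative names, not from the question) that compares the length of a per-data-point colour vector with the number of bars hist computes:

```r
# Hypothetical data standing in for the question's x
x_demo <- seq(0, 123, by = 3)

# One colour per data point, as in the question (only two colours, for brevity)
k_demo <- ifelse(x_demo <= 58, "#8DD3C7", "#FB8072")

# Compute the bins without plotting
h_demo <- hist(x_demo, breaks = 10, plot = FALSE)
n_bars <- length(h_demo$counts)

length(k_demo)        # one entry per data point
n_bars                # one entry per bar, far fewer
head(k_demo, n_bars)  # the only colours hist(x_demo, breaks = 10, col = k_demo) would use
```

The colours picked up by the bars depend on the order of the data in x_demo, not on where each bar sits on the axis.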

If you want to colour the bars by quantile, you will have to work out how many bars are in each quantile (not how many data points), and create your k accordingly.

To do this, you need to know the histogram breaks: the breakpoints of your bins. hist returns an object from which you can get the breakpoints and so on; see ?hist.

# compute the histogram counts to get the break points,
# but don't plot yet
h <- hist(x, breaks = 20, plot = FALSE)  # see h$breaks and h$mids

To work out which colour each bar should be, you can use either the starting coordinate of each bar (all but the last element of h$breaks), the ending coordinate of each bar (all but the first element of h$breaks) or the midpoint coordinate of each bar (h$mids). Set your colours like you did above.
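As a rough illustration of those three choices, here is a sketch on made-up data (x_demo, h_demo, starts, ends and mids are illustrative names) that extracts each coordinate from the object returned by hist:

```r
x_demo <- seq(0, 123, by = 3)   # hypothetical data
h_demo <- hist(x_demo, breaks = 20, plot = FALSE)

starts <- head(h_demo$breaks, -1)  # left edge of each bar
ends   <- tail(h_demo$breaks, -1)  # right edge of each bar
mids   <- h_demo$mids              # centre of each bar

# All three have one entry per bar, and each midpoint is halfway between the edges
length(starts) == length(h_demo$counts)
all.equal(mids, (starts + ends) / 2)
```

Whichever coordinate you pick decides the colour of a bar that straddles a quantile boundary.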

The findInterval(h$mids, quantile(x), ...) call works out which quantile each bar is in (determined by the bar's midpoint). It returns an integer saying which interval the midpoint falls in, or 0 if it lies below the first one. Every data point lies between the 0% and 100% quantiles, so your "grey" colour is normally never used; a midpoint can only fall outside that range when the outermost breaks extend past the data. rightmost.closed makes sure the 100% quantile value is included in the top-most colour bracket. The cols[findInterval(...) + 1] is just a cool/tricksy way to do your ifelse(h$mids <= ..., "#8DD3C7", ifelse(h$mids <= ..., .....)); you could do it the ifelse way if you prefer.

cols <- c("grey", "#8DD3C7", "#FFFFB3", "#BEBADA", "#FB8072")
k <- cols[findInterval(h$mids, quantile(x), rightmost.closed = TRUE, all.inside = FALSE) + 1]
# plot the histogram with the colours
plot(h, col = k)
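To see what findInterval returns, here is a small sketch that uses the quantile values quoted in the question together with a handful of made-up midpoints (mids_demo and idx are illustrative names):

```r
# Quantile values quoted in the question
q <- c(0, 33.75, 58, 78.25, 123)
cols <- c("grey", "#8DD3C7", "#FFFFB3", "#BEBADA", "#FB8072")

# Hypothetical bar midpoints, including one below the range and the maximum itself
mids_demo <- c(-5, 5, 40, 60, 100, 123)

idx <- findInterval(mids_demo, q, rightmost.closed = TRUE)
idx            # 0 1 2 3 4 4
cols[idx + 1]  # "grey" "#8DD3C7" "#FFFFB3" "#BEBADA" "#FB8072" "#FB8072"

# Without rightmost.closed, the maximum falls into a fifth interval
findInterval(123, q)  # 5
```

The + 1 shifts the 0-based "outside" result onto the first entry of cols, which is why "grey" comes first.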

Have a look at k: it is only as long as the number of bars in the histogram, rather than as long as the number of data points in x.
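For comparison, the nested ifelse version applied to the bar midpoints might look like the sketch below. It uses made-up data and illustrative names (x_demo, h_demo, m, k_ifelse, k_find), and it uses strict < comparisons so that it matches the left-closed intervals of findInterval, which differs slightly from the <= tests in the question:

```r
x_demo <- seq(0, 123, by = 3)   # hypothetical data
h_demo <- hist(x_demo, breaks = 20, plot = FALSE)
q <- quantile(x_demo)           # 0%, 25%, 50%, 75%, 100%
m <- h_demo$mids

# One colour per bar, chosen by the quartile the bar's midpoint falls in
k_ifelse <- ifelse(m < q[2], "#8DD3C7",
            ifelse(m < q[3], "#FFFFB3",
            ifelse(m < q[4], "#BEBADA", "#FB8072")))

# The findInterval version, for comparison
cols <- c("grey", "#8DD3C7", "#FFFFB3", "#BEBADA", "#FB8072")
k_find <- cols[findInterval(m, q, rightmost.closed = TRUE) + 1]

identical(unname(k_ifelse), k_find)
length(k_find) == length(h_demo$counts)

# plot(h_demo, col = k_find)  # uncomment to draw the coloured histogram
```

The two agree here because every midpoint of this demo histogram lies within the range of the data; a midpoint outside that range would be coloured differently by the two versions.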
