ggplot2: issue with y-axis scaling in bar chart
I want to create a bar chart using the following code:
ggplot(data_set, aes(x=reorder(regionname, +gdpcap), y=gdpcap)) +
geom_bar(stat="identity")
As seen in the code, the y-axis doesn't simply display the count but the mean of the variable 'gdpcap' for each category on the x-axis.
In the dataset, the values of the variable 'gdpcap' are continuous and range only from 1 to 10. But in the plot produced by my code, the values on the y-axis appear multiplied by ten: it displays '0', '20', '40', '60' instead of just '0', '2', '4', '6'. This can be seen in the attached picture (bar chart).
Why is ggplot scaling my y-axis differently? If I just calculate the mean, using the following code...
mean(data_set$gdpcap)
...then the output is 3.681204 and not 30.681204. So I assume something in the ggplot command is causing the issue. Any ideas?
Thanks for any help!
Thomas
"As seen in the code, the y-axis doesn't simply display the count but the mean of the variable 'gdpcap' for each category on the x-axis."
No, that's not the case. Nothing in your code tells ggplot to take a mean. stat = "identity" means the y-values are taken as-is. The default position = "stack" means little bars for each observation are stacked on top of each other, so what you're seeing is the sum.
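To make the sum-versus-mean difference concrete, here is a minimal base R sketch. The toy data frame is hypothetical and only stands in for data_set; the column names regionname and gdpcap are taken from the question, the values are made up.

```r
# Hypothetical toy data standing in for data_set (values are invented)
toy <- data.frame(
  regionname = c("A", "A", "A", "B", "B", "C"),
  gdpcap     = c(2, 3, 4, 5, 7, 1)
)

# With stat = "identity" and position = "stack", each row becomes a small bar
# and the bars of one category are stacked, so the total height is the sum
group_sums <- tapply(toy$gdpcap, toy$regionname, sum)

# What the question expected on the y-axis: the mean per category
group_means <- tapply(toy$gdpcap, toy$regionname, mean)

print(group_sums)
print(group_means)
```

The more rows a category has, the further its stacked height drifts from its mean, which is why the axis in the question runs far beyond the 1 to 10 range of the raw values.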
There's not a built-in way to plot means using geom_bar, but we can use stat_summary to have ggplot do the calculation. If it were me though, I'd use dplyr for the data transformation (that's what it's for) and ggplot for the plotting (that's what it's for).
Here it is both ways, demonstrated on the built-in mtcars data:
library(ggplot2)
# Option 1: let ggplot compute the per-group mean
ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  stat_summary(geom = "col", fun = mean)
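library(dplyr)
# Option 2: summarize with dplyr first, then plot the precomputed means
mtcars %>%
  group_by(cyl) %>%
  summarize(mean_mpg = mean(mpg)) %>%
  ggplot(aes(x = factor(cyl), y = mean_mpg)) +  # y must be the summarized column
  geom_col()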
(I've switched to geom_col, which is equivalent to geom_bar(stat = "identity").)
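Applied to the data in the question, the dplyr approach might look like the sketch below. The data_set defined here is a hypothetical stand-in with invented values; only the column names regionname and gdpcap come from the question, and mean_gdpcap and region_means are names introduced for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the asker's data_set (values are invented)
data_set <- data.frame(
  regionname = c("North", "North", "South", "South", "South", "East"),
  gdpcap     = c(2, 4, 6, 8, 10, 1)
)

# One row per region holding the mean of gdpcap
region_means <- data_set %>%
  group_by(regionname) %>%
  summarize(mean_gdpcap = mean(gdpcap))

# Plot the precomputed means, ordering the bars by their height
p <- ggplot(region_means,
            aes(x = reorder(regionname, mean_gdpcap), y = mean_gdpcap)) +
  geom_col()
print(p)
```

Because each region now contributes a single row, there is nothing left to stack, and the y-axis stays within the range of the original values.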