
ggplot2: issue with y-axis scaling in bar chart

I want to create a bar chart using the following code:

ggplot(data_set, aes(x=reorder(regionname, +gdpcap), y=gdpcap)) +
  geom_bar(stat="identity")

As seen in the code, the y-axis doesn't simply display the count but the mean of the variable 'gdpcap' for each category on the x-axis.

In the dataset, the values of the variable 'gdpcap' are continuous and range only from 1 to 10. But in the graphic output of my code, the values on the y-axis appear to be multiplied by ten: the axis displays 0, 20, 40, 60 instead of 0, 2, 4, 6. This can be seen in the attached picture (bar chart).

Why is ggplot scaling my y-axis differently? If I just calculate the mean, using the following code...

mean(data_set$gdpcap)

...then the output would be 3.681204 and not 30.681204. So I assume something in the ggplot command is causing the issue. Any ideas?

Thanks for any help!

Thomas

"As seen in the code, the y-axis doesn't simply display the count but the mean of the variable 'gdpcap' for each category on the x-axis."

No, that's not the case. Nothing in your code tells ggplot to take a mean. stat = "identity" means the y-values are taken as-is. The default position = "stack" means the little bars for each observation are stacked on top of each other, so what you're seeing is the sum of gdpcap within each region, not its mean.
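One way to check the stacking explanation is to read back the bar heights that ggplot actually drew and compare them with per-group sums and means. The sketch below uses the built-in mtcars data as a stand-in for data_set (an assumption; swap in regionname and gdpcap for your own data) and ggplot2's layer_data() to extract the drawn rectangles.

```r
library(ggplot2)

# Stand-in data: mtcars instead of the asker's data_set (assumption)
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_bar(stat = "identity")

# One row per observation: each is drawn as a small bar stacked within its group
bars <- layer_data(p)

# Top of each stack = largest ymax at each x position
stack_tops <- tapply(bars$ymax, as.integer(bars$x), max)

# Per-group sum and mean computed directly from the data
group_sums  <- tapply(mtcars$mpg, mtcars$cyl, sum)
group_means <- tapply(mtcars$mpg, mtcars$cyl, mean)

# Compare the stack tops with the sums (not the means)
all.equal(as.vector(stack_tops), as.vector(group_sums))
```

If the stacks really are sums, the final comparison returns TRUE, while group_means stays far below the bar heights.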

There's not a built-in way to plot means using geom_bar, but we can use stat_summary to have ggplot do the calculation. If it were me, though, I'd use dplyr for the data transformation (that's what it's for) and ggplot for the plotting (that's what it's for).

Here it is both ways, demonstrated on the built-in mtcars data:

library(ggplot2)
# Option 1: let ggplot compute the mean of mpg within each cyl group
ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  stat_summary(geom = "col", fun = mean)


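library(dplyr)
library(ggplot2)
# Option 2: compute the group means with dplyr, then plot the summary column
mtcars %>%
  group_by(cyl) %>%
  summarize(mean_mpg = mean(mpg)) %>%
  ggplot(aes(x = factor(cyl), y = mean_mpg)) +
  geom_col()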

(I've switched to geom_col, which is equivalent to geom_bar(stat = "identity").)
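Applied to the column names from the question, the dplyr approach might look like the sketch below. The small data_set here is invented purely for illustration, and region_means and mean_gdpcap are hypothetical names; with the real data only the pipeline and the plot call would be needed.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the asker's data_set; the values are made up
data_set <- data.frame(
  regionname = c("A", "A", "A", "B", "B", "C"),
  gdpcap     = c(2, 4, 6, 1, 3, 8)
)

# One row per region, holding the mean of gdpcap
region_means <- data_set %>%
  group_by(regionname) %>%
  summarize(mean_gdpcap = mean(gdpcap))

# Bars ordered by the mean, with heights on the original 1 to 10 scale
ggplot(region_means, aes(x = reorder(regionname, mean_gdpcap), y = mean_gdpcap)) +
  geom_col()
```

Because each region now contributes exactly one row, there is nothing left to stack, so the y-axis stays within the range of the original variable.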
