
Control x axis of a violin plot in ggplot2

I'm generating violin plots in ggplot2 for a time series, year_1 to year_32. The years in my df are stored as numerical values. From the examples I've seen, it seems that I must convert these numerical year values to factors to plot one violin per year; and in fact, if I run the code without as.factor, I get one big fat violin. I would like to understand why geom_violin can't have numeric values on the x axis; or, if I'm wrong about that, how to use them?

So:

my_data$year <- as.factor(my_data$year)

p <- ggplot(data = my_data, aes(x = year, y = continuous_var)+
 geom_violin(fill = "#FF0000", color = "#000000")+
 ylim(0,500)+
 labs(x = "x_label", y = "y_label")

p +my_theme()

works fine, but if I skip

my_data$year <- as.factor(my_data$year)

it doesn't work: I get one big fat violin for all years. Why?

TIA

You are missing a ) at the end of this line: p <- ggplot(data = my_data, aes(x = year, y = continuous_var)

I have constructed a reproducible example with the ToothGrowth dataset. This should work now:

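library(ggplot2)

# ToothGrowth ships with R; dose is numeric (0.5, 1, 2)
my_data <- ToothGrowth

# Convert the grouping variable to a factor so each dose gets its own violin
my_data$dose <- as.factor(my_data$dose)

p <- ggplot(data = my_data, aes(x = dose, y = len)) +
  geom_violin(fill = "#FF0000", color = "#000000") +
  ylim(0, 500) +  # limits carried over from the question
  labs(x = "x_label", y = "y_label") +
  theme_bw()
p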
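
If you would rather not overwrite the column, one possible variant (a sketch, not the only way) is to convert inside aes() only. The name p_inline is just illustrative; everything else comes from the example above.

```r
library(ggplot2)

# factor() is applied only while mapping the aesthetic;
# ToothGrowth$dose itself stays numeric
p_inline <- ggplot(ToothGrowth, aes(x = factor(dose), y = len)) +
  geom_violin(fill = "#FF0000", color = "#000000") +
  labs(x = "dose", y = "len") +
  theme_bw()
p_inline
```

The x axis is still discrete here, so the violins are evenly spaced regardless of the numeric distance between doses.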


PS: this discussion would be a better fit for Cross Validated, as it's more of a statistics question than a coding question.

I'm not 100% sure, but here's my explanation: a violin plot shows the density of a set of data, and you can divide your data into groups so that one violin is drawn for each group. But if the variable you use to define the groups (the x axis) is continuous, you would have infinitely many possible groupings (one group for the values at 0, one for 0.1, one for 0.01, etc.), so in the end the data can't actually be divided, and ggplot probably ignores the x variable for grouping and makes one violin for all your data.
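
To make the grouping point concrete, here is a minimal sketch. The data are a simulated stand-in (hypothetical values) for the asker's my_data, and p_one / p_many are illustrative names. As far as I understand ggplot2's default grouping, only discrete variables define groups, so a numeric x gives a single group unless you map the group aesthetic yourself.

```r
library(ggplot2)

# Simulated stand-in for the asker's data (hypothetical values):
# 32 numeric years, 50 observations each
set.seed(1)
my_data <- data.frame(
  year = rep(1:32, each = 50),
  continuous_var = runif(32 * 50, min = 0, max = 500)
)

# Numeric x and no explicit grouping: all rows fall into one group
p_one <- ggplot(my_data, aes(x = year, y = continuous_var)) +
  geom_violin()

# Numeric x plus group = year: one violin per year,
# and the x axis stays continuous
p_many <- ggplot(my_data, aes(x = year, y = continuous_var, group = year)) +
  geom_violin(fill = "#FF0000", color = "#000000") +
  scale_x_continuous(breaks = seq(1, 32, by = 4))

# layer_data() returns the data computed for a layer; count distinct groups
n_groups_one  <- length(unique(layer_data(p_one)$group))
n_groups_many <- length(unique(layer_data(p_many)$group))

p_many
```

So numeric values on the x axis can be used with geom_violin, as long as the grouping is stated explicitly. Keeping the axis continuous is mainly useful when the spacing between years matters, for example if some years are missing.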


 