What are the aes() values when making a boxplot using the ggplot2 package?
I'm trying to make a boxplot with the ggplot2 package in RStudio. I've been reading through past ggplot2 questions, but this is so basic that I can't find it covered in detail... I'm bad at using R.
This is the very basic code that I'm trying to use, but I don't know what my x and y values should be:
ggplot(data, aes(x,y)) + geom_boxplot()
So, my y values are Pearson coefficients, which lie between 0 and 1, but I'm struggling to put that in as a range. Then I'm confused because my x values are just 4 different conditions. Should I use a vector? e.g.
c(drug 6hr, control, drug 24hr, control)
I successfully made a basic boxplot using boxplot(), but I am using ggplot2 because I want to show every individual value on the plot using jitter, which I have also failed to use.
Sorry, I have only been using R for about 6 months! Trying to learn as much as I can.
My data:
drug 6hr  control   drug 24hr  control
0.876     0.707     0.709      0.521
0.084     0.275     0.468      0.795
0.911     0.985     0.565      0.150
0.503     0.584     0.693      0.766
0.363     0.102     0.775      0.640
0.219     0.888     0.724      0.516
0.041     0.277     0.877      0.216
0.206     0.974     0.771      0.434
0.787     0.725     0.671      0.916
0.896     0.873     0.443      0.693
0.396     0.641     0.525      0.471
0.250     0.184     0.467      0.537
0.094     0.453     0.641      0.910
0.750     0.748     0.634      0.007
0.026     0.263     0.069      0.725
0.109               0.227      0.535
0.780               0.811      0.241
0.710               0.568      0.029
0.676               0.114      0.237
0.610               0.260      0.241
0.170               0.728      0.405
0.025               0.815      0.914
0.022               0.329      0.766
0.039               0.714
0.034               0.096
0.402               0.988
0.649
0.564
0.190
0.844
0.920
0.744
0.871
0.565
You need to reshape your dataframe into a longer format; that makes it much easier to get your boxplot with ggplot2.
Here, I'm using the pivot_longer function from the tidyr package to transform your data into two columns: the first contains the name of the condition and the second contains the values:
library(tidyr)
library(dplyr)
DF %>% pivot_longer(everything(), names_to = "var", values_to = "values")
# A tibble: 136 x 2
var values
<chr> <dbl>
1 drug_6hr 0.876
2 Control_6 0.707
3 drug_24hr 0.709
4 Control_24 0.521
5 drug_6hr 0.084
6 Control_6 0.275
7 drug_24hr 0.468
8 Control_24 0.795
9 drug_6hr 0.911
10 Control_6 0.985
# … with 126 more rows
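To make the link to aes() concrete, here is a minimal sketch on a made-up two-column data frame. The name `toy` and its contents are hypothetical and only stand in for DF. After reshaping, the column of condition names (`var`) is what goes on x and the column of numbers (`values`) is what goes on y.

```r
library(tidyr)

# Hypothetical toy data: two conditions of unequal length, padded with NA
toy <- data.frame(
  drug_6hr  = c(0.876, 0.084, 0.911),
  Control_6 = c(0.707, 0.275, NA)
)

# Each column name becomes an entry in `var`;
# each cell becomes one row in `values`
toy_long <- pivot_longer(toy, everything(),
                         names_to = "var", values_to = "values")

toy_long
```

The result has one row per original cell, so a 3 x 2 wide table becomes 6 rows, including the row that holds the NA.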
Then, you can add the plotting step to the pipe (%>%) sequence by passing your dataframe to ggplot with the appropriate aes arguments and using the geom_boxplot and geom_jitter functions:
library(tidyr)
library(dplyr)
library(ggplot2)
DF %>% pivot_longer(everything(), names_to = "var", values_to = "values") %>%
  ggplot(aes(x = var, y = values, fill = var, color = var)) +
  geom_boxplot(alpha = 0.2) +
  geom_jitter()
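One possible refinement, offered as a sketch rather than part of the original answer: because geom_jitter() already draws every observation, the outlier points drawn by the boxplot can be hidden with `outlier.shape = NA`, and setting `height = 0` in geom_jitter() keeps the plotted y values exact. The data frame `toy` below is hypothetical and only stands in for DF.

```r
library(tidyr)
library(ggplot2)

# Hypothetical small data set standing in for DF
toy <- data.frame(
  drug_6hr  = c(0.876, 0.084, 0.911, 0.503),
  Control_6 = c(0.707, 0.275, 0.985, 0.584)
)
toy_long <- pivot_longer(toy, everything(),
                         names_to = "var", values_to = "values")

p <- ggplot(toy_long, aes(x = var, y = values, fill = var, color = var)) +
  # Hide the boxplot's own outlier points so they are not drawn twice
  geom_boxplot(alpha = 0.2, outlier.shape = NA) +
  # Spread points horizontally only; height = 0 leaves y values unchanged
  geom_jitter(width = 0.2, height = 0)

p
```

The `width = 0.2` value is an arbitrary choice; a smaller number keeps the points closer to the centre of each box.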
Alternatively, to get rid of the warning messages caused by the NA values, you can filter them out by adding a filter call between pivot_longer and ggplot:
DF %>% pivot_longer(everything(), names_to = "var", values_to = "values") %>%
  filter(!is.na(values)) %>%
  ggplot(aes(x = var, y = values, fill = var, color = var)) +
  geom_boxplot(alpha = 0.2) +
  geom_jitter()
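As a variant of the same idea, pivot_longer itself accepts a `values_drop_na` argument, which drops the NA rows during the reshape so that no separate filter step is needed. A minimal sketch on hypothetical toy data (`toy` is not the original DF):

```r
library(tidyr)

# Hypothetical toy data with one missing value
toy <- data.frame(
  drug_6hr  = c(0.876, 0.084, 0.911),
  Control_6 = c(0.707, 0.275, NA)
)

toy_long <- pivot_longer(
  toy, everything(),
  names_to = "var", values_to = "values",
  values_drop_na = TRUE  # drop rows whose value is NA while reshaping
)

toy_long
```

Whether to use this or an explicit filter is mostly a matter of taste; the explicit filter makes the removal more visible when reading the pipeline.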
Does it answer your question?
Reproducible example
I edited your example to make it easier to read into R. I also modified the column names, as pointed out by @akrun:
DF <- structure(list(drug_6hr = c(0.876, 0.084, 0.911, 0.503, 0.363,
0.219, 0.041, 0.206, 0.787, 0.896, 0.396, 0.25, 0.094, 0.75,
0.026, 0.109, 0.78, 0.71, 0.676, 0.61, 0.17, 0.025, 0.022, 0.039,
0.034, 0.402, 0.649, 0.564, 0.19, 0.844, 0.92, 0.744, 0.871,
0.565), Control_6 = c(0.707, 0.275, 0.985, 0.584, 0.102, 0.888,
0.277, 0.974, 0.725, 0.873, 0.641, 0.184, 0.453, 0.748, 0.263,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA), drug_24hr = c(0.709, 0.468, 0.565, 0.693, 0.775,
0.724, 0.877, 0.771, 0.671, 0.443, 0.525, 0.467, 0.641, 0.634,
0.069, 0.227, 0.811, 0.568, 0.114, 0.26, 0.728, 0.815, 0.329,
0.714, 0.096, 0.988, NA, NA, NA, NA, NA, NA, NA, NA), Control_24 = c(0.521,
0.795, 0.15, 0.766, 0.64, 0.516, 0.216, 0.434, 0.916, 0.693,
0.471, 0.537, 0.91, 0.007, 0.725, 0.535, 0.241, 0.029, 0.237,
0.241, 0.405, 0.914, 0.766, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA)), row.names = c(NA, -34L), class = c("data.table", "data.frame"
))