How to plot columns of CSV data in R using boxplots
I have a sample dataframe that is 3600 rows long by 6 columns wide. I want to create a plot in R that shows six boxplots, one for each of the 6 columns of data. I am using ggplot. I can create them in Excel easily enough (shown below), but I want to be able to do it in R because my future dataframes are going to be much larger and R seems to handle large datasets much more easily.
Using the code below I can plot the first column fine, but I can't figure out how to add the data from the other 5 columns.
ggplot(data = df) +
  geom_boxplot(aes(x = "Label", y = col1))
Using geom_boxplot from ggplot2
To get a boxplot for each of your 6 columns with ggplot2, you first need to reshape your dataframe into a longer format so that it matches the grammar of ggplot2 (one column for x values, one column for y values, and one or more columns for categorical values). Then you can use ggplot2 with the geom_boxplot function.
Here is an example using the built-in iris dataset:
> head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
Using the pivot_longer function from the tidyr package, you can reshape the first 4 columns of this dataset into a longer format:
library(tidyr)
library(dplyr)
# Stack the four measurement columns into a name column (Var) and a value column (val)
iris2 <- iris %>%
  pivot_longer(cols = Sepal.Length:Petal.Width, names_to = "Var", values_to = "val")
iris2
# A tibble: 600 x 3
   Species Var            val
   <fct>   <chr>        <dbl>
 1 setosa  Sepal.Length   5.1
 2 setosa  Sepal.Width    3.5
 3 setosa  Petal.Length   1.4
 4 setosa  Petal.Width    0.2
 5 setosa  Sepal.Length   4.9
 6 setosa  Sepal.Width    3
 7 setosa  Petal.Length   1.4
 8 setosa  Petal.Width    0.2
 9 setosa  Sepal.Length   4.7
10 setosa  Sepal.Width    3.2
# … with 590 more rows
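As a quick sanity check on the reshape (a minimal sketch, not part of the original answer): each of the 150 rows of iris contributes one row per pivoted column, so pivoting 4 columns should give 150 x 4 = 600 rows, which matches the tibble header. The name iris_long is only an illustrative choice so that the snippet runs on its own.

```r
library(tidyr)

# Same reshape as in the answer, repeated so this snippet is self-contained.
# iris_long is an illustrative name, not one used in the original answer.
iris_long <- pivot_longer(iris, cols = Sepal.Length:Petal.Width,
                          names_to = "Var", values_to = "val")

# One output row per (original row, pivoted column) pair
nrow(iris_long) == nrow(iris) * 4

# Each original column becomes one value of Var, with 150 observations each
table(iris_long$Var)
```

If the row count is not the number of original rows times the number of pivoted columns, the cols selection probably picked up more or fewer columns than intended.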
Then you can use this new dataset in ggplot2 to get a boxplot for each value of Var:
library(ggplot2)
ggplot(iris2, aes(x = Var, y = val, fill = Var)) +
  geom_boxplot()
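The same two steps can be sketched for a dataframe shaped like the one in the question (3600 rows by 6 columns). The data here is simulated and the column names col1 to col6 are assumptions, since the real file is not shown. With real data you would load it first, for example with read.csv() on your own file path.

```r
library(tidyr)
library(ggplot2)

# Simulated stand-in for the asker's data: 3600 rows, 6 numeric columns.
# The names col1..col6 and the random values are assumptions for illustration.
set.seed(1)
df <- as.data.frame(matrix(rnorm(3600 * 6), ncol = 6))
names(df) <- paste0("col", 1:6)

# Reshape every column into two: the original column name and its value
df_long <- pivot_longer(df, cols = everything(),
                        names_to = "Var", values_to = "val")

# One box per original column
p <- ggplot(df_long, aes(x = Var, y = val, fill = Var)) +
  geom_boxplot()
print(p)
```

Using everything() assumes all columns are numeric measurements. If the dataframe also holds an ID or label column, restrict cols to the measurement columns instead.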
Alternative using base R
Without reshaping your dataframe, you can get the boxplots directly by using the boxplot function in base R:
boxplot(iris[,c(1:4)], col = c("red","green","blue","orange"))
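Applied to a six-column dataframe like the one in the question, the base R route might look like the sketch below. The data is simulated and the names col1 to col6 are assumptions. boxplot() also returns, invisibly, the statistics it drew, which can be handy for checking the plot against numbers.

```r
# Simulated stand-in for a 6-column dataframe; col1..col6 are assumed names
set.seed(1)
df <- as.data.frame(matrix(rnorm(3600 * 6), ncol = 6))
names(df) <- paste0("col", 1:6)

# Passing a dataframe draws one box per column;
# the return value holds the statistics used for the plot
bp <- boxplot(df, col = rainbow(ncol(df)), ylab = "value")

bp$names  # column labels, in plotting order
bp$stats  # one column per box: whisker ends, hinges, and median
```

This assumes every column is numeric. For mixed dataframes, select the numeric columns first, as the answer does with iris[, 1:4].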
Does this answer your question?