
How to plot columns of csv data in R using boxplots

I have a sample dataframe that is 3600 rows long by 6 columns wide. I want to create a plot in R that shows six boxplots, one for each of the 6 columns of data. I am using ggplot. I can create them in Excel easily enough (shown below), but I want to be able to do it in R because my future dataframes are going to be much larger and R seems to handle large datasets a lot more easily.

[Image: Excel plot]

Using the code below I can plot the first column fine, but can't figure out how to add the data from the other 5 columns.

ggplot(data=df)+
 geom_boxplot(aes(x="Label", y=col1))

Using geom_boxplot from ggplot2

To get a boxplot for each of your 6 columns with ggplot2, you first need to reshape your dataframe into a longer format that matches the grammar of ggplot2 (one column for x values, one column for y values, and one or more columns for categorical values). Then you can use ggplot() with geom_boxplot().

Here is an example using the built-in iris dataset:

> head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

Using the pivot_longer function from the tidyr package, you can reshape the first 4 columns of this dataset into a longer format:

library(tidyr)
library(dplyr)
iris2 <- iris %>%
  pivot_longer(cols = Sepal.Length:Petal.Width,
               names_to = "Var", values_to = "val")

# A tibble: 600 x 3
   Species Var            val
   <fct>   <chr>        <dbl>
 1 setosa  Sepal.Length   5.1
 2 setosa  Sepal.Width    3.5
 3 setosa  Petal.Length   1.4
 4 setosa  Petal.Width    0.2
 5 setosa  Sepal.Length   4.9
 6 setosa  Sepal.Width    3  
 7 setosa  Petal.Length   1.4
 8 setosa  Petal.Width    0.2
 9 setosa  Sepal.Length   4.7
10 setosa  Sepal.Width    3.2
# … with 590 more rows
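
As a quick sanity check on the reshape, the sketch below counts the rows per value of Var. It rebuilds iris2 so that it runs on its own; the name counts is only illustrative. Each of the 4 measurement columns should contribute one row per original observation, which is where the 600 rows come from.

```r
library(tidyr)
library(dplyr)

# Rebuild the long-format data from the built-in iris dataset
iris2 <- iris %>%
  pivot_longer(cols = Sepal.Length:Petal.Width,
               names_to = "Var", values_to = "val")

# One row per (observation, variable) pair: 150 observations x 4 variables
counts <- count(iris2, Var)
print(counts)
```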

Then you can use this new dataset in ggplot2 to get one boxplot for each value of Var:

library(ggplot2)
ggplot(iris2, aes(x = Var, y = val, fill = Var)) +
  geom_boxplot()
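
The same approach should carry over to a dataframe like the one in the question. The sketch below is only an illustration under assumptions: the real data is not shown, so df is simulated here with 3600 rows and 6 numeric columns, and the names col2 to col6 are hypothetical (only col1 appears in the question). Because every column is to be plotted, everything() selects all of them for reshaping.

```r
library(tidyr)
library(ggplot2)

# Simulated stand-in for the asker's data (assumption: 6 numeric columns)
set.seed(1)
df <- as.data.frame(matrix(rnorm(3600 * 6), ncol = 6))
names(df) <- paste0("col", 1:6)

# Reshape all 6 columns into two: the column name and its value
df_long <- pivot_longer(df, cols = everything(),
                        names_to = "Var", values_to = "val")

# One boxplot per original column
p <- ggplot(df_long, aes(x = Var, y = val, fill = Var)) +
  geom_boxplot()
print(p)
```

If the real dataframe also contains non-numeric columns, the cols argument would need to list only the columns to plot.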


Alternative using base R

You can also get the boxplots right away, without reshaping your dataframe, by using the boxplot function in base R:

boxplot(iris[, 1:4], col = c("red", "green", "blue", "orange"))
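
Since the data in the question comes from a CSV file, here is a minimal sketch of the full path from file to plot in base R. The file is a temporary one written with random numbers purely for illustration (csv_path and its contents are assumptions, not the asker's data); with a real file you would pass its path to read.csv instead.

```r
# Write a small illustrative CSV (assumption: 6 numeric columns)
set.seed(1)
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(matrix(rnorm(60), ncol = 6)), csv_path,
          row.names = FALSE)

# Read it back and keep only the numeric columns
df <- read.csv(csv_path)
num_cols <- df[sapply(df, is.numeric)]

# A data frame is a list of columns, so boxplot() draws one box per column
bp <- boxplot(num_cols, col = rainbow(ncol(num_cols)), ylab = "Value")
```

boxplot() invisibly returns the statistics it plotted; bp$stats holds the lower whisker, lower hinge, median, upper hinge, and upper whisker for each column.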

Does this answer your question?
