How to boxplot the summary of a data frame in R

I have a data frame containing 80 different features. Using summary(data) I can see the min, max and mean of each column. What I would now like to do is visualise these with a plot.

I would like to see the min-max range of each column as well as the mean. The goal is to be able to visually spot outliers and the range of the data. I tried using a box plot to do so, but I am unable to find the right way to plot it.

Any help is appreciated. I already have the summary in a data frame, built as follows:

summary <- as.data.frame(apply(data[,2:(ncol(data)-1)],2,summary))

Preview of the Data:

    f1  f2  f3  f4
1   1   0   0   0
2   0   0   0   0
3   0   0   0   0
4   1   0   0   1
5   0   0   0   0
6   2   1   0   0
7   2   0   0   0
8   0   0   0   0
9   0   0   0   0
10  0   0   0   0

structure(list(feat_1 = c(1L, 0L, 0L, 1L, 0L, 2L, 2L, 0L, 0L, 
0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), feat_2 = c(0L, 0L, 
0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L), feat_3 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 
2L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), feat_4 = c(0L, 0L, 0L, 1L, 
0L, 0L, 0L, 0L, 0L, 0L, 2L, 1L, 0L, 1L, 0L, 2L, 0L, 0L, 0L, 0L
)), row.names = c(NA, 20L), class = "data.frame")

This was my attempt using reshape2 and ggplot2:

df <- structure(list(feat_1 = c(1L, 0L, 0L, 1L, 0L, 2L, 2L, 0L, 0L, 
0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), feat_2 = c(0L, 0L, 
0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L), feat_3 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 
2L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), feat_4 = c(0L, 0L, 0L, 1L, 
0L, 0L, 0L, 0L, 0L, 0L, 2L, 1L, 0L, 1L, 0L, 2L, 0L, 0L, 0L, 0L
)), row.names = c(NA, 20L), class = "data.frame")

Load the packages:

library(reshape2)
library(ggplot2)

Melt the data frame into long format (one row per feature/value pair):

df.melted <- melt(df)
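To make the reshaping step less opaque, here is a minimal base R sketch of roughly what the long format looks like. It uses stack() instead of melt(), which is a swapped-in technique and not the code from the post; the name long is only illustrative, and the columns are renamed by hand to match what melt() calls them.

```r
# Same sample data as above, written out so the block runs on its own
df <- data.frame(
  feat_1 = c(1L, 0L, 0L, 1L, 0L, 2L, 2L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
  feat_2 = c(0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
  feat_3 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 2L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
  feat_4 = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 1L, 0L, 1L, 0L, 2L, 0L, 0L, 0L, 0L)
)

# stack() puts all columns on top of each other: one row per observation,
# with a factor recording which column each value came from
long <- stack(df)

# stack() names its columns "values" and "ind"; rename them to the
# "value"/"variable" names that melt() uses
names(long) <- c("value", "variable")

str(long)
head(long)
```

Each of the 20 rows in every feature column becomes its own row, so four features give 80 rows. That shape is what lets ggplot draw one box per level of variable.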

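Use ggplot. Here alpha sets the transparency of the box fill, and stat_summary with fun = mean adds the mean of each feature as a point (the fun argument needs ggplot2 3.3.0 or later; older versions call it fun.y):

ggplot(df.melted, aes(factor(variable), value, fill = variable)) +
  geom_boxplot(alpha = 0.6) +
  stat_summary(fun = mean, geom = "point", shape = 21, size = 3)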

Output:

[Image: boxplots of feat_1 to feat_4, with each mean drawn as a point]
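If the numbers behind the plot are needed as well, one possible sketch is to call base R's boxplot() with plot = FALSE, which returns the whisker, hinge and median values plus the points flagged as outliers. This is an addition, not part of the original answer; the names bp, box_stats and outliers are illustrative, and base R's hinges can differ slightly from the quartiles ggplot2 uses.

```r
# Same sample data as above, written out so the block runs on its own
df <- data.frame(
  feat_1 = c(1L, 0L, 0L, 1L, 0L, 2L, 2L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
  feat_2 = c(0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
  feat_3 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 2L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
  feat_4 = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 1L, 0L, 1L, 0L, 2L, 0L, 0L, 0L, 0L)
)

# Compute the boxplot statistics without drawing anything
bp <- boxplot(df, plot = FALSE)

# bp$stats has one column per feature and five rows; append the means
box_stats <- rbind(bp$stats, colMeans(df))
rownames(box_stats) <- c("lower whisker", "lower hinge", "median",
                         "upper hinge", "upper whisker", "mean")
colnames(box_stats) <- bp$names
print(box_stats)

# bp$out holds the outlying values and bp$group the column index of each,
# so splitting by feature name lists the outliers per feature
outliers <- split(bp$out, bp$names[bp$group])
print(outliers)
```

With count data that is mostly zeros, as here, the box collapses onto zero for some features, so every non-zero value is reported as an outlier. That is expected behaviour of the 1.5 * IQR rule and not a plotting error.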

See ?ggplot, ?geom_boxplot and ?stat_summary for more details.
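For comparison, a similar picture can probably be drawn without any extra packages. The following is a rough base graphics sketch and not the approach from the post: boxplot() accepts a data frame directly and draws one box per column, and points() overlays the column means. The colours and point style are arbitrary choices.

```r
# Same sample data as above, written out so the block runs on its own
df <- data.frame(
  feat_1 = c(1L, 0L, 0L, 1L, 0L, 2L, 2L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
  feat_2 = c(0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
  feat_3 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 2L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
  feat_4 = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 1L, 0L, 1L, 0L, 2L, 0L, 0L, 0L, 0L)
)

# One box per column; no reshaping is needed for base graphics
bp <- boxplot(df, col = "grey90", ylab = "value")

# Boxes sit at x = 1, 2, ..., ncol(df), so the means go at the same positions
points(seq_along(df), colMeans(df), pch = 21, bg = "white", cex = 1.5)
```

With 80 features the axis labels get crowded; passing las = 2 to boxplot() rotates them, which may help.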
