box-plot for multiple columns with normalized x-axis values

I have the following data (in a CSV file):

 product  release_after_issue  release_before_issue
 P1                            40
 P1       100
 P1                            10
 P2       50
 P2       300
 P2                            200
 P3       10
 P3       20
 P3       300

I would like to use a box plot to show the distribution of days for each product release (P1, P2, etc.) based on release_after_issue and release_before_issue. The x-axis is the product names and the y-axis is days.

The issues I am facing now are the empty values in each column and the large numbers of days.

How could I normalize the days on the y-axis to months (easier to read)? I would also like each product (P1, P2, ...) to have its own box plot based on each column's data (release_after_issue or release_before_issue).

I tried to omit the NA values and plot a test example, but it did not work:

data <- read.csv("commons-fileupload.csv")
ggplot(data[!is.na(data$release_after_issue),],aes(x=product,y=release_after_issue))
+ geom_point()

Any help would be appreciated!

I'm not sure what fails in your code; the dummy data below works fine for me. Also, ggplot2 removes the NAs for you (it only prints a warning about the dropped rows).

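library(ggplot2)

# Dummy data with one NA; geom_boxplot() drops the NA row itself
data <- data.frame(product = c("P1", "P2", "P1", "P1", "P2"),
                   release_after_issue = c(100, NA, 50, 10, 30))
ggplot(data, aes(x = product, y = release_after_issue)) + geom_boxplot()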
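One likely culprit, assuming the question's code was run exactly as it is laid out: the `+` sits at the start of its own line. R treats `ggplot(...)` as a complete expression and then tries to evaluate `+ geom_point()` separately, which typically ends in an error. Keeping the `+` at the end of the previous line avoids that. A minimal sketch, using a small made-up data frame in place of the CSV:

```r
library(ggplot2)

# Small stand-in for the CSV from the question (values are illustrative)
data <- data.frame(
  product = c("P1", "P2", "P2", "P3"),
  release_after_issue = c(100, 50, 300, NA)
)

# The trailing "+" tells R that the expression continues on the next line
p <- ggplot(data[!is.na(data$release_after_issue), ],
            aes(x = product, y = release_after_issue)) +
  geom_point()
p
```

The manual `!is.na()` filter is optional here, since ggplot2 would drop the NA row anyway.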

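For the other two parts of the question (months on the y-axis, and one box per product for each column), one possible approach is to stack the two columns into a long format and divide the days by 30. This is only a sketch: the 30-day month is my own approximation, and the names `long`, `type`, and `months` are made up for the example. The data frame is typed in from the table in the question, with NA for the empty cells.

```r
library(ggplot2)

# Data from the question; NA marks the empty cells in the CSV
data <- data.frame(
  product = c("P1", "P1", "P1", "P2", "P2", "P2", "P3", "P3", "P3"),
  release_after_issue  = c(NA, 100, NA, 50, 300, NA, 10, 20, 300),
  release_before_issue = c(40, NA, 10, NA, NA, 200, NA, NA, NA)
)

# Long format: one row per (product, column name, days) observation
long <- rbind(
  data.frame(product = data$product,
             type = "release_after_issue",
             days = data$release_after_issue),
  data.frame(product = data$product,
             type = "release_before_issue",
             days = data$release_before_issue)
)
long <- long[!is.na(long$days), ]  # drop the empty cells explicitly

# Assumption: 1 month is approximated as 30 days
long$months <- long$days / 30

# One box per product and column; fill separates the two columns
p <- ggplot(long, aes(x = product, y = months, fill = type)) +
  geom_boxplot() +
  labs(x = "Product", y = "Months (days / 30)", fill = "Column")
p
```

With so few values per group, some boxes collapse to a single line (P3 has no release_before_issue values at all, so it gets only one box). Replacing `fill = type` with `facet_wrap(~ type)` would instead put each column in its own panel.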
