R: use “identify” to find the column names in a boxplot

In R, I'm drawing a rather large boxplot from a data.frame with approximately 150 columns. I know that there are some "anomalous" columns whose distribution is very different from the rest of the data set, and I want to identify precisely which ones.

Unsurprisingly, there is not enough room for the labels, and even if there were, it would probably be inconvenient to check them by hand. So I thought I could use R's identify function to locate the offending columns. That function needs x and y coordinates, however, and so far I have been unable to get it to work.

I tried

boxplot(dd.noctr$TGS, outline=F)
identify(xy.coords(dd.noctr$TGS)$x, y=xy.coords(dd.noctr$TGS)$y)

where dd.noctr$TGS is my data (a matrix or data.frame), only to get the warning

warning: no point within 0.25 inches

meaning that no point was identified.

Is there an alternative solution to identify column names (not single points)?

This solution is a bit clunky, so there is probably a better one.

  1. Set up some example data with three columns:

     TGS = data.frame(A = rnorm(100), B = rnorm(100), C = rnorm(100))

  2. Next, draw the boxplot:

     boxplot(TGS, outline = FALSE)

  3. Now call identify, giving every observation the x position of its column's box (1, 2, 3, ...):

     identify(x = rep(1:ncol(TGS), each = nrow(TGS)), y = as.vector(unlist(TGS)), labels = rep(colnames(TGS), each = nrow(TGS)))

     The labels are the column names. This only works if you click near the vertical centre line of a box, because that is where the points passed to identify are located.
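
A possibly less clunky variant, sketched here under the assumption that one clickable point per box is enough: boxplot invisibly returns the statistics it draws, so the median of each box (row 3 of `stats`) can serve as a single target labelled with the column name. The names `bp`, `med_x` and `med_y` are introduced only for this illustration.

```r
set.seed(1)
TGS <- data.frame(A = rnorm(100), B = rnorm(100), C = rnorm(100))

# boxplot() returns a list with the five-number summary of each box in
# $stats (row 3 is the median) and the box labels in $names.
# Only draw the plot when running interactively.
bp <- boxplot(TGS, outline = FALSE, plot = interactive())

med_x <- seq_along(bp$names)  # boxes are drawn at x = 1, 2, 3, ...
med_y <- bp$stats[3, ]        # median of each column

# identify() needs an open plot and mouse clicks, so guard the call.
if (interactive()) {
  identify(x = med_x, y = med_y, labels = bp$names)
}
```

With this version a click near the median line of a box should print that column's name, instead of requiring a click near one of the individual observations.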


If you want to get a list of the outliers, you can use the 'out' component of the value returned by boxplot.

Example: create a data frame with a few random values with mean 20, and add some extreme values. This code will display the outliers.

 df1 = data.frame(A = c(rnorm(15,20,3),7,8,35,32))   #15 rnorm and 4 extreme values
 bplot=boxplot(df1)
 bplot$out
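
For a data frame with several columns, the returned list also has a `group` component giving the box each outlier belongs to, which can be mapped back to column names without clicking on the plot. The following is a minimal sketch with made-up data; `df2`, `bp` and `out_by_col` are names introduced only for this example, and "column has outliers" is just one possible definition of "anomalous".

```r
set.seed(42)
# Hypothetical data: columns A and B are plain normal samples,
# column C has four extreme values appended.
df2 <- data.frame(
  A = rnorm(50),
  B = rnorm(50),
  C = c(rnorm(46), -9, -8, 9, 10)
)

# plot = FALSE computes the boxplot statistics without drawing anything.
bp <- boxplot(df2, plot = FALSE)

# bp$group[i] is the index of the box that outlier bp$out[i] belongs to,
# so indexing bp$names with it yields the column name of each outlier.
out_by_col <- split(bp$out, bp$names[bp$group])

# Number of outliers per column that has at least one.
lengths(out_by_col)
```

Sorting the result of `lengths(out_by_col)` gives a quick ranking of the columns with the most outliers, which may be more practical than reading 150 labels off a plot.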
