
Plot correlation matrix in R

I have a CSV file containing a matrix:

version getSize() length() ... power
0         23000    23421        0.8
0           ..      ..           ..
1           ..      ..           ..
1           ..      ..           ..

I want to aggregate the rows that share the same version by applying the mean function to the columns. There are too many columns to write out by hand. I also want to calculate the correlation matrix and bind the power column to the sides of the plot. My code is this:

matrix <- read.csv("/home/francesco/University/UoA/matrix.csv", header=TRUE, sep=",", fileEncoding="windows-1252")
power <- matrix[,"power"]
binded <- cbind(matrix,power)
aggregated <- aggregate(. ~ version, data = binded, mean)
corMatrix <- cor(aggregated, method="spearman")
library(lattice)
levelplot(corMatrix)

The plot is pretty confusing and I get this warning:

Warning message:
In cor(aggregated, method = "spearman") : standard deviation is zero

A short extract of matrix.csv is:

version,native_drawBitmap,nPrepareDirty,nDrawDisplayList,startGC,power
00083,8,88,308,12,0.8967960131052847
00083,0,176,404,1,0.867644513259528
00084,8,88,307,10,0.8980234065469381
00084,0,181,408,1,0.871799879659241

Does anyone know what I'm doing wrong?

Thanks in advance

Well, with your sample data, the native_drawBitmap column becomes all 4's after aggregation. Since it then has no variance, you can't calculate a pairwise correlation between it and any other variable, so cor() returns NA for those entries and you get the warning. If you leave out this column, it will work. Here is an example.

#sample data in friendly copy/paste-able format
mm<-data.frame(
    version = c(83, 83, 84, 84), 
    native_drawBitmap = c(8, 0, 8, 0),
    nPrepareDirty = c(88, 176, 88, 181), 
    nDrawDisplayList = c(308, 404, 307, 408), 
    startGC = c(12, 1, 10, 1), 
    power = c(0.896796013105285, 0.867644513259528, 
        0.898023406546938, 0.871799879659241)
)

# these are not needed and don't make sense. Why are you
# trying to re-add the power column from mm back onto mm?
# power <- mm[,"power"]
# binded <- cbind(mm,power)
aggregated <- aggregate(. ~ version, data = mm, mean)

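# gives the "standard deviation is zero" warning (NA entries for native_drawBitmap)
corMatrix <- cor(aggregated, method="spearman")
# no warning: column 2 (native_drawBitmap) is dropped
corMatrix <- cor(aggregated[,-2], method="spearman")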

You may have other columns in your data that have no variability after aggregation. Be sure to find these and remove them.
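
One way to find such columns without checking each by hand is sketched below. It assumes every column is numeric after aggregation, as in the sample data. The names sds and constant are only illustrative; they do not come from your code.

```r
# Sample data from the example above
mm <- data.frame(
    version = c(83, 83, 84, 84),
    native_drawBitmap = c(8, 0, 8, 0),
    nPrepareDirty = c(88, 176, 88, 181),
    nDrawDisplayList = c(308, 404, 307, 408),
    startGC = c(12, 1, 10, 1),
    power = c(0.896796013105285, 0.867644513259528,
        0.898023406546938, 0.871799879659241)
)
aggregated <- aggregate(. ~ version, data = mm, mean)

# standard deviation of every column after aggregation
sds <- sapply(aggregated, sd)

# flag columns with zero (or undefined) standard deviation
constant <- is.na(sds) | sds == 0
names(aggregated)[constant]

# keep only the columns that vary, then compute the correlations
corMatrix <- cor(aggregated[, !constant, drop = FALSE], method = "spearman")
```

With your real data, the flagged names tell you which columns to drop before calling cor(). The resulting corMatrix can then be passed to levelplot() as in your original code.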
