
Heatmap visualization does not accurately portray binary values in matrix in R

I have a binary adjacency matrix in a csv file here, where 0 = not friends and 1 = friends. Using Nathan Yau's quick and easy heatmap tutorial, I tried to make a heatmap visualization with only two colors. I used the code below.

> test <- read.csv("/Users/Cindy/Desktop/untitled.csv", sep=",")
> row.names(test) <- test$name
> test <- test[,2:108]
> test_matrix <- data.matrix(test)
> dim(test)
[1] 107 107
> test_heatmap <- heatmap(test_matrix, Rowv=NA, Colv=NA, col = cm.colors(2), scale="column", margins=c(10,10))

For some reason, this happens: see image.

Judging by the csv file, there should be many more purple squares in the visualization, and there are also confusing white lines in it.

If someone could help me figure out what is wrong, I would very much appreciate it!

The problem is that with so few colors, most of the pairs whose entry is just "1" do not reach the threshold needed to be colored. To look at the distribution of values, try:

table(test_matrix)
test_matrix
   0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15 
9434 2016    1    1    1    1    1    1    1    1    1    1    1    1    1    1 
  16   17   18   19   20   21   22   23   24   25   26   27   28   29   30   31 
   1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1 
  32   33   34   35   36   37   38   39   40   41   42   43   44   45   46   47 
   1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1 
  48   49   50   51   52   53   54   55   56   57   58   59   60   61   62   63 
   1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1 
  64   65   66   67   68   69   70   71   72   73   74   75   76   77   78   79 
   1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1 
  80   81   82   83   84   85   86   87   88   89   90   91   92   93   94   95 
   1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1 
  96   97   98   99  100  101  102  103  104  105  106  107 
   1    1    1    1    1    1    1    1    1    1    1    1 
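In the table above, the values 2 through 107 each occur exactly once, which suggests (the csv itself is not shown here, so this is only a guess) that a column of row numbers may have ended up in test_matrix alongside the 0/1 data. A minimal sketch for spotting such columns, using a made-up toy matrix with hypothetical column names rather than the original file:

```r
# Hypothetical stand-in for test_matrix: a stray index column "X"
# next to two genuine 0/1 columns.
toy <- cbind(X = 1:4, alice = c(0, 1, 0, 1), bob = c(1, 0, 0, 0))

# Flag every column that contains anything other than 0 or 1.
non_binary <- apply(toy, 2, function(v) any(!v %in% c(0, 1)))
print(names(which(non_binary)))

# Keep only the strictly binary columns before plotting.
clean <- toy[, !non_binary, drop = FALSE]
print(table(clean))
```

Running the same check on the real matrix should show whether any column needs to be dropped before drawing the heatmap.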

Increasing the number of colors is more informative:

test_heatmap <- heatmap(test_matrix, Rowv=NA, Colv=NA, 
                col = cm.colors(100), scale="column", margins=c(10,10))
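To illustrate why two colors can hide many of the 1s when scale="column" is used, here is a small sketch on a made-up matrix (the column names are hypothetical). It assumes that column scaling amounts to column-wise z-scores, which is what scale() computes, and that two colors split the finite range of the scaled values at its midpoint:

```r
# Toy 0/1 matrix: "rare" has a single 1, "common" has ten, "empty" has none.
m <- cbind(rare   = c(1, rep(0, 19)),
           common = rep(c(1, 0), each = 10),
           empty  = rep(0, 20))

# Column-wise centering and scaling (z-scores).
z <- scale(m)

# With two colors, the boundary sits at the midpoint of the finite range.
mid <- mean(range(z, finite = TRUE))
gets_second_color <- z > mid

# How many of the original 1s in each column land in the second color?
print(colSums(gets_second_color & m == 1, na.rm = TRUE))

# A constant column has zero standard deviation, so its z-scores are NaN.
print(all(is.nan(z[, "empty"])))
```

In this toy case only the 1 in the sparse column crosses the midpoint, while the ten 1s in the dense column do not. The all-zero column becomes NaN, and such cells are not drawn, which may be what produces the white lines in the question.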

If you have the gplots package installed, try the heatmap.2() function. It supports much the same syntax, but you also get a color key/legend, which might tell you more about how the values are broken down into colors.

A useful thing to do would be to create your own "color palette" for those 2 colors. In your case it's simple: just pass col = c("violet", "turquoise1").
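A minimal sketch of the two-color palette on a made-up 0/1 matrix (the names p1 to p4 are hypothetical). Note that scale = "none" is an addition here, not part of the answer above: without scaling, the values stay 0 and 1, so the first color is used for 0 and the second for 1.

```r
# Toy adjacency matrix with only 0s and 1s.
toy <- matrix(c(0, 1, 1, 0,
                1, 0, 0, 1,
                1, 0, 0, 0,
                0, 1, 0, 0),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("p", 1:4), paste0("p", 1:4)))

two_colors <- c("violet", "turquoise1")  # 0 -> violet, 1 -> turquoise1

pdf(NULL)  # draw to a null device; drop this line to see the plot on screen
hm <- heatmap(toy, Rowv = NA, Colv = NA, scale = "none",
              col = two_colors, margins = c(5, 5))
invisible(dev.off())
```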

It would also be useful to create an additional heat map with cell notes (the cellnote argument of heatmap.2()). This is just a heat map that displays the values inside the cells. Then you can see whether the 0s and 1s are assigned correctly, and what is going on in those white cells.

It would look somewhat like this: [image]
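A rough sketch of such a cell-note heat map, assuming the gplots package is installed (the code is skipped otherwise). The toy matrix and its labels are made up, and choices such as notecol = "black", trace = "none", and scale = "none" are illustrative rather than taken from the answer:

```r
# Toy 0/1 matrix standing in for the real adjacency matrix.
toy <- matrix(c(0, 1, 1, 0,
                1, 0, 0, 1,
                1, 0, 0, 0,
                0, 1, 0, 0),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("p", 1:4), paste0("p", 1:4)))

if (requireNamespace("gplots", quietly = TRUE)) {
  pdf(NULL)  # null device; drop this line to see the plot on screen
  gplots::heatmap.2(toy,
                    Rowv = FALSE, Colv = FALSE, dendrogram = "none",
                    scale = "none",
                    col = c("violet", "turquoise1"),
                    cellnote = toy,     # print each value in its cell
                    notecol = "black",
                    trace = "none",
                    margins = c(5, 5))
  invisible(dev.off())
}
```

Comparing the printed values with the cell colors should make it clear whether any 1s are being drawn in the color meant for 0.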

