
How to plot a confusion matrix using heatmaps in R?

I have a confusion matrix such that:

  a b c d e f g h i j
a 5 4 0 0 0 0 0 0 0 0
b 0 0 0 0 0 0 0 0 0 0
c 0 0 4 0 0 0 0 0 0 0
d 0 0 0 0 0 0 0 0 0 0
e 2 0 0 0 2 0 0 0 0 0
f 1 0 0 0 0 2 0 0 0 0
g 0 0 0 0 0 0 0 0 0 0
h 0 0 0 0 0 0 0 0 0 0 
i 0 0 0 0 0 0 0 0 0 0 
j 0 0 0 0 0 0 0 0 0 0 

where the letters denote the class labels.

I just need to plot the confusion matrix. I looked at a few tools, and heatmaps in R look like what I need. Since I don't know R, it is hard for me to adapt the examples. If anybody could briefly show me how to draw it, I would really appreciate it. Suggestions other than heatmaps are welcome as well. I know there are plenty of examples of this, but I still cannot manage to draw one with my own data.

You can achieve a nice result using ggplot2, but for that you need a data.frame with 3 columns: x, y and the value to plot.

Using gather from the tidyr package, it is very easy to reformat your data (newer tidyr versions recommend pivot_longer for the same job, but gather still works):

library("dplyr")
library("tidyr")

# Loading your example. Row names should get their own column (here `y`).
hm <- readr::read_delim("y a b c d e f g h i j
a 5 4 0 0 0 0 0 0 0 0
b 0 0 0 0 0 0 0 0 0 0
c 0 0 4 0 0 0 0 0 0 0
d 0 0 0 0 0 0 0 0 0 0
e 2 0 0 0 2 0 0 0 0 0
f 1 0 0 0 0 2 0 0 0 0
g 0 0 0 0 0 0 0 0 0 0
h 0 0 0 0 0 0 0 0 0 0
i 0 0 0 0 0 0 0 0 0 0
j 0 0 0 0 0 0 0 0 0 0", delim=" ")

# Gathering columns a to j
hm <- hm %>% gather(x, value, a:j)

# hm now looks like:
# # A tibble: 100 x 3
# y     x     value
# <chr> <chr> <dbl>
# 1 a     a         5
# 2 b     a         0
# 3 c     a         0
# 4 d     a         0
# 5 e     a         2
# # ... with 95 more rows
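
If the confusion matrix already lives in R as a plain matrix, the same long format can be produced without extra packages. This is only a sketch: the names classes, cm and hm_base are made up for illustration, and the matrix is filled in by hand from the table in the question.

```r
# Hypothetical starting point: the confusion matrix as a base R matrix
classes <- letters[1:10]
cm <- matrix(0, nrow = 10, ncol = 10, dimnames = list(y = classes, x = classes))
# Fill in the non-zero cells (row label, column label) from the question
cm[cbind(c("a", "a", "c", "e", "e", "f", "f"),
         c("a", "b", "c", "a", "e", "a", "f"))] <- c(5, 4, 4, 2, 2, 1, 2)

# as.table() + as.data.frame() yields one row per cell, with columns y, x, value
hm_base <- as.data.frame(as.table(cm), responseName = "value",
                         stringsAsFactors = FALSE)
head(hm_base)
```

The resulting data frame has the same three columns as hm above, so it should be usable in the plotting code in the same way.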

Perfect! Let's get plotting. The basic geom for a heatmap in ggplot2 is geom_tile, to which we'll provide the aesthetics x, y and fill.

library("ggplot2")
ggplot(hm, aes(x=x, y=y, fill=value)) + geom_tile() 

(Image: first attempt at a heatmap)

OK not too bad but we can do much better. First we probably want to reverse the y axis. The trick is to provide x and y as factors with the levels ordered as we want them.

hm <- hm %>%
  mutate(x = factor(x), # alphabetical order by default
         y = factor(y, levels = rev(unique(y)))) # force reverse alphabetical order
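
To see what the reordering does, here is a small sketch on a toy vector (the name y_demo is made up for illustration). ggplot2 draws the first factor level at the bottom of a discrete y axis, so reversing the levels puts "a" at the top:

```r
y_demo <- c("a", "b", "c", "a", "b", "c")

# Default: levels are sorted alphabetically
lv_default <- levels(factor(y_demo))

# Explicit levels: reverse of the order of first appearance
lv_reversed <- levels(factor(y_demo, levels = rev(unique(y_demo))))

print(lv_default)
print(lv_reversed)
```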

Then I like the black & white theme theme_bw() which gets rid of the grey background. I also like to use a palette from RColorBrewer (with direction = 1 to get the darker colors for higher values).

Since you're plotting the same thing on the x and y axis, you probably want equal axis scales: coord_equal() will give you a square plot.

ggplot(hm, aes(x=x, y=y, fill=value)) +
  geom_tile() + theme_bw() + coord_equal() +
  scale_fill_distiller(palette="Greens", direction=1) 
# Other valid palettes: Reds, Blues, Spectral, RdYlBu (red-yellow-blue), ...

(Image: a better heatmap)

The finishing touch: printing the values on top of the tiles and removing the legend, since it is no longer useful. Obviously this is all optional, but it gives you material to build from. Note that geom_text inherits the x and y aesthetics since they were passed to ggplot.

ggplot(hm, aes(x=x, y=y, fill=value)) +
  geom_tile() + theme_bw() + coord_equal() +
  scale_fill_distiller(palette="Greens", direction=1) +
  guides(fill = "none") + # removing legend for `fill` (fill=F is deprecated in recent ggplot2)
  labs(title = "Value distribution") + # using a title instead
  geom_text(aes(label=value), color="black") # printing values

(Image: final heatmap)

You could also pass color="black" to geom_tile to draw black lines around the tiles. Here is a final plot with the RdYlBu color scheme (see RColorBrewer::display.brewer.all() for a list of available palettes).

(Image: showing more options)
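
The code for that last variant is not shown, so here is a sketch of how it might look. It is self-contained and rebuilds the data in base R. The names classes, cm, hm2 and p are made up for illustration, and the palette direction is left at the ggplot2 default, which is an assumption about the original figure.

```r
library(ggplot2)

# Rebuild the confusion matrix from the question (hypothetical helper objects)
classes <- letters[1:10]
cm <- matrix(0, nrow = 10, ncol = 10, dimnames = list(y = classes, x = classes))
cm[cbind(c("a", "a", "c", "e", "e", "f", "f"),
         c("a", "b", "c", "a", "e", "a", "f"))] <- c(5, 4, 4, 2, 2, 1, 2)

# Long format, with y levels reversed so that "a" ends up at the top
hm2 <- as.data.frame(as.table(cm), responseName = "value")
hm2$y <- factor(hm2$y, levels = rev(classes))

p <- ggplot(hm2, aes(x = x, y = y, fill = value)) +
  geom_tile(color = "black") +               # black lines around the tiles
  theme_bw() + coord_equal() +
  scale_fill_distiller(palette = "RdYlBu") + # diverging palette from RColorBrewer
  guides(fill = "none") +
  geom_text(aes(label = value), color = "black")
print(p)
```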

As Greg mentioned, image is probably the way to go:

z = c(5,4,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,
0,0,4,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,
2,0,0,0,2,0,0,0,0,0,
1,0,0,0,0,2,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0)

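# matrix() fills column by column, so z holds the transpose of the table above;
# the image() and heatmap() calls below account for that
z = matrix(z, ncol=10)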
colnames(z) = c("a","b","c","d","e","f","g","h","i", "j")
rownames(z) = c("a","b","c","d","e","f","g","h","i", "j")

##To get the correct image plot rotation
##We need to flip the plot
image(z[,ncol(z):1], axes=FALSE)

##Add in the y-axis labels. Similar idea for x-axis.
##The columns were reversed above, so the labels are reversed too.
axis(2, at = seq(0, 1, length=length(colnames(z))), labels=rev(colnames(z)))
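
For completeness, a sketch that labels both axes. It assumes the confusion matrix is stored the same way round as the table in the question (rows are the row labels, columns are the column labels). The names classes, cm, img and at are made up for illustration.

```r
# Confusion matrix from the question, built directly (hypothetical objects)
classes <- letters[1:10]
cm <- matrix(0, nrow = 10, ncol = 10, dimnames = list(classes, classes))
cm[cbind(c("a", "a", "c", "e", "e", "f", "f"),
         c("a", "b", "c", "a", "e", "a", "f"))] <- c(5, 4, 4, 2, 2, 1, 2)

# image() puts matrix rows along x and matrix columns along y (bottom to top),
# so transpose and reverse to show row "a" at the top
img <- t(cm)[, nrow(cm):1]
image(img, axes = FALSE)

at <- seq(0, 1, length.out = length(classes))
axis(1, at = at, labels = rownames(img))           # x: columns a..j, left to right
axis(2, at = at, labels = colnames(img), las = 1)  # y: rows j..a, bottom to top
```

Taking the labels from the dimnames of the matrix that is actually plotted keeps them in step with any flipping.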

You may also want to look at the heatmap function:

heatmap(t(z)[ncol(z):1,], Rowv=NA,
               Colv=NA, col = heat.colors(256))
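
One thing to be aware of: heatmap standardizes each row by default (scale = "row"), so the colors no longer reflect the raw counts. A sketch with scaling switched off, assuming the confusion matrix is stored the same way round as the table in the question (the names classes, cm and res are made up for illustration):

```r
# Confusion matrix from the question, built directly (hypothetical objects)
classes <- letters[1:10]
cm <- matrix(0, nrow = 10, ncol = 10, dimnames = list(classes, classes))
cm[cbind(c("a", "a", "c", "e", "e", "f", "f"),
         c("a", "b", "c", "a", "e", "a", "f"))] <- c(5, 4, 4, 2, 2, 1, 2)

# Reverse the rows so "a" is drawn at the top; no dendrograms, no row scaling
res <- heatmap(cm[nrow(cm):1, ], Rowv = NA, Colv = NA,
               scale = "none", col = heat.colors(256))
```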

The image function in R will take a matrix and plot a regular grid with colors based on the values in the matrix. You can set a lot of options, but just calling image with your matrix as the only argument will create a basic plot. Sounds like that would be a good place to start.

Unfortunately, the image function suggested in another answer cannot be used as is, because it mirrors the data, so you'll get it the wrong way round. With a small transformation you can write a function that plots it correctly:

set.seed(1)
d = data.frame(Y_label=rpois(100,1), pred=rpois(100,1))
Show = function(df, ...) {image(t(df[nrow(df):1,]), ...)}
Show(table(d), main="my confusion matrix")


As a next step, you can add axis labels, customize the plot, and so on.
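
For instance, a sketch that takes the labels from the dimnames of the table. The variable names tab and img are made up for illustration; the data is the same random example as above.

```r
set.seed(1)
d <- data.frame(Y_label = rpois(100, 1), pred = rpois(100, 1))
tab <- table(d)

# Same transform as Show(): reverse the rows, then transpose
img <- t(tab[nrow(tab):1, ])
image(img, axes = FALSE, main = "my confusion matrix",
      xlab = "pred", ylab = "Y_label")

# x runs over pred values; y runs over Y_label values, largest at the bottom
axis(1, at = seq(0, 1, length.out = nrow(img)), labels = rownames(img))
axis(2, at = seq(0, 1, length.out = ncol(img)), labels = colnames(img), las = 1)
box()
```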

