How to produce a heatmap with ggplot2?
I am trying to produce a heat map using ggplot2. I found this example, which I am essentially trying to replicate with my data, but I am having difficulty. My data is a simple .csv file that looks like this:
people,apple,orange,peach
mike,1,0,6
sue,0,0,1
bill,3,3,1
ted,1,1,0
I would like to produce a simple heat map where the name of the fruit is on the x-axis and the person is on the y-axis. The graph should depict squares where the color of each square represents the number of fruit consumed. The square corresponding to mike:peach should be the darkest.
Here is the code I am using to try to produce the heatmap:
data <- read.csv("/Users/bunsen/Desktop/fruit.txt", head=TRUE, sep=",")
fruit <- c(apple,orange,peach)
people <- data[,1]
(p <- ggplot(data, aes(fruit, people)) + geom_tile(aes(fill = rescale), colour = "white") + scale_fill_gradient(low = "white", high = "steelblue"))
When I plot this data I get the number of fruit on the x-axis and people on the y-axis. I also do not get color gradients representing the number of fruit. How can I get the names of the fruits on the x-axis, with the number of fruit eaten by each person displayed as a heat map? The current output I am getting in R looks like this:
To be honest @dr.bunsen - your example above was poorly reproducible and you didn't read the first part of the tutorial that you linked. Here is probably what you are looking for:
library(reshape)   # melt()
library(plyr)      # ddply(); loaded explicitly in case reshape does not attach it
library(ggplot2)
library(scales)    # rescale()
data <- structure(list(people = structure(c(2L, 3L, 1L, 4L),
                                          .Label = c("bill", "mike", "sue", "ted"),
                                          class = "factor"),
                       apple = c(1L, 0L, 3L, 1L),
                       orange = c(0L, 0L, 3L, 1L),
                       peach = c(6L, 1L, 1L, 0L)),
                  .Names = c("people", "apple", "orange", "peach"),
                  class = "data.frame",
                  row.names = c(NA, -4L))
# Reshape to long format: one row per (people, variable) pair
data.m <- melt(data)
# Rescale the counts to [0, 1] within each fruit
data.m <- ddply(data.m, .(variable), transform, rescale = rescale(value))
p <- ggplot(data.m, aes(variable, people)) +
  geom_tile(aes(fill = rescale), colour = "white")
p + scale_fill_gradient(low = "white", high = "steelblue")
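One thing worth noting about the code above: because the rescaling is done per fruit (`.(variable)`), the fill reflects each count relative to the other counts for the same fruit, not the raw number eaten. The following is a minimal base-R sketch of that same per-column rescaling, written out by hand; `rescale01` and `counts` are names made up for this illustration, and the formula is assumed to match the default behaviour of `scales::rescale`.

```r
# Counts from the question, one column per fruit
counts <- data.frame(
  apple  = c(1, 0, 3, 1),
  orange = c(0, 0, 3, 1),
  peach  = c(6, 1, 1, 0),
  row.names = c("mike", "sue", "bill", "ted")
)

# Hypothetical helper: linear rescaling of a vector to [0, 1]
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

# Apply the rescaling to each fruit column separately
rescaled <- counts
rescaled[] <- lapply(counts, rescale01)

print(rescaled)
```

With this scheme bill:apple and mike:peach both end up at 1 and would be drawn equally dark. If the raw counts should drive the colour instead, mapping `fill = value` in `geom_tile` rather than `fill = rescale` is one option.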
Seven (!) years later, the best way to format your data correctly is to use tidyr rather than reshape.
Using gather from tidyr, it is very easy to reformat your data to get the expected 3 columns (people for the y-axis, fruit for the x-axis and count for the values):
library("dplyr")
library("tidyr")
hm <- readr::read_csv("people,apple,orange,peach
mike,1,0,6
sue,0,0,1
bill,3,3,1
ted,1,1,0")
# syntax: key column (to create), value column (to create),
# columns to gather (will become (key, value) pairs)
hm <- hm %>%
  gather(fruit, count, apple:peach)
The data now looks like:
# A tibble: 12 x 3
people fruit count
<chr> <chr> <dbl>
1 mike apple 1
2 sue apple 0
3 bill apple 3
4 ted apple 1
5 mike orange 0
6 sue orange 0
7 bill orange 3
8 ted orange 1
9 mike peach 6
10 sue peach 1
11 bill peach 1
12 ted peach 0
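As an aside, `gather` still works but has been superseded in tidyr 1.0.0 and later by `pivot_longer`. Below is a sketch of the same reshaping with the newer function; the data frame is built inline (as `hm_wide`, a name chosen for this example) so the snippet does not depend on readr.

```r
library(tidyr)

# Same data as in the question, in wide format
hm_wide <- data.frame(
  people = c("mike", "sue", "bill", "ted"),
  apple  = c(1, 0, 3, 1),
  orange = c(0, 0, 3, 1),
  peach  = c(6, 1, 1, 0)
)

# names_to / values_to play the role of gather()'s key / value columns
hm_long <- pivot_longer(
  hm_wide,
  cols      = apple:peach,
  names_to  = "fruit",
  values_to = "count"
)

print(hm_long)
```

The result has the same three columns and twelve rows, although the rows come out ordered by person rather than by fruit, which makes no difference for plotting.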
Perfect! Let's get plotting. The basic geom to do a heatmap with ggplot2 is geom_tile, to which we'll provide the aesthetics x, y and fill.
library("ggplot2")
# map the columns created above: fruit, people and count
ggplot(hm, aes(x=fruit, y=people, fill=count)) + geom_tile()
OK not too bad but we can do much better.
First, theme_bw() gets rid of the grey background. I also like to use a palette from RColorBrewer (with direction = 1 to get the darker colors for higher values, or -1 otherwise). There are a lot of available palettes: Reds, Blues, Spectral, RdYlBu (red-yellow-blue), RdBu (red-blue), etc. Below I use "Greens". Run RColorBrewer::display.brewer.all() to see what the palettes look like.
If you want the tiles to be square, simply use coord_equal().
I often find the legend is not useful but it depends on your particular use case. You can hide the fill legend with guides(fill = "none") (the older spelling guides(fill=F) is deprecated as of ggplot2 3.3.4).
You can print the values on top of the tiles using geom_text (or geom_label). It takes the aesthetics x, y and label, but in our case x and y are inherited. You can also print higher values bigger by passing size=count as an aesthetic; in that case you will also want to pass size = "none" to guides to hide the size legend.
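The size variant described above is not part of the combined plot further down, so here is a minimal sketch of it. It assumes the long-format data shown earlier, rebuilt inline so the snippet runs on its own.

```r
library(ggplot2)

# Long-format data, same values as the tibble printed earlier
hm <- data.frame(
  people = rep(c("mike", "sue", "bill", "ted"), times = 3),
  fruit  = rep(c("apple", "orange", "peach"), each = 4),
  count  = c(1, 0, 3, 1,  0, 0, 3, 1,  6, 1, 1, 0)
)

p <- ggplot(hm, aes(x = fruit, y = people, fill = count)) +
  geom_tile(color = "black") +
  # x and y are inherited; label and size are both mapped to count
  geom_text(aes(label = count, size = count), color = "black") +
  # hide the size legend, keep the fill legend
  guides(size = "none")

p
```

The range of text sizes can be tuned with `scale_size(range = c(...))` if the smallest labels turn out hard to read.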
You can draw lines around the tiles by passing a color to geom_tile.
Putting it all together:
ggplot(hm, aes(x=fruit, y=people, fill=count)) +
  # tile with black contour
  geom_tile(color="black") +
  # B&W theme, no grey background
  theme_bw() +
  # square tiles
  coord_equal() +
  # Green color theme for `fill`
  scale_fill_distiller(palette="Greens", direction=1) +
  # printing values in black
  geom_text(aes(label=count), color="black") +
  # removing legend for `fill` since we're already printing values
  # ("none" replaces the deprecated fill=F)
  guides(fill="none") +
  # since there is no legend, adding a title
  labs(title = "Count of fruits per person")
To remove any of these elements, simply remove the corresponding line.