Heatmap in R with ggplot2

I have a data frame T_mod with 150 observations and 2920 variables, containing subsurface temperature values in °C over one year. It looks like this:

> T_mod[1:10, 1:6]
      t=-24548400 t=-24537600 t=-24526800 t=-24516000 t=-24505200 t=-24494400
z=0.1    9.000187    9.004622    9.009004    9.013332    9.017607    9.021829
z=0.2    8.587763    8.592795    8.597776    8.602705    8.607583    8.612410
z=0.3    8.179728    8.185313    8.190848    8.196334    8.201770    8.207157
z=0.4    7.776561    7.782655    7.788702    7.794702    7.800653    7.806558
z=0.5    7.378704    7.385267    7.391785    7.398256    7.404682    7.411062
z=0.6    6.986564    6.993556    7.000504    7.007408    7.014268    7.021084
z=0.7    6.600512    6.607894    6.615235    6.622533    6.629789    6.637003
z=0.8    6.220886    6.228623    6.236319    6.243975    6.251591    6.259166
z=0.9    5.847995    5.856050    5.864068    5.872046    5.879986    5.887887
z=1      5.482113    5.490454    5.498759    5.507026    5.515257    5.523450

The row names give depth, in 10 cm increments from 0.1 m to 15 m below ground. The column names give time in elapsed seconds. The cell values are temperatures in °C at a given depth for each point in time.

I want to create a heatmap showing temperature, with time on the x-axis and depth on the y-axis. The plot below was created with the image.plot function (from the fields package, which builds on R base graphics) using the following code:

image.plot(z = t(as.matrix(T_mod[150:1,])), legend.lab = "Temperature (°C)",
           ylab = "Depth (m)", xlab = "Time")

The x axis represents time (one year in 3 h intervals) and the y axis represents depth (0 to 15 m in 10 cm increments). Z values are temperatures for a given point in time and a specific depth. Obviously, the axis ticks and tick labels make little sense as of now. The problem is that the image and image.plot functions are somewhat rigid and do not make it easy to adjust axis ticks, labels, etc.

Now, someone has pointed me towards ggplot2 for greater flexibility in adjusting plot parameters, but I have not used ggplot so far. Consequently, the code below does not work.

ggplot(T_mod, aes(x=time, y=Depth, z=Temperature)) +
  geom_tile(aes(fill=Temperature)) +
  theme(panel.background = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.line = element_line(colour = "black"),
        panel.border = element_rect(colour = "black", fill=NA, size=2))+
        # possibly use stat_contour(binwidth = 0.1,aes(colour = ..level..),size=0.1) +
        # ... and scale_fill_gradient(low = "red", high = "Green") +
        # ... and scale_colour_gradient(low = "black", high = "black",guide = "none")+
  scale_y_continuous(expand = c(0,0),breaks=seq(20, 140, 20),limits=c(20,140),labels=lbl_y)+ 
  scale_x_continuous(expand = c(0,0),breaks=seq(124, 2796, 240),limits=c(124,2796),labels=lbl_x)+
  theme(axis.text.x = element_text(size = 15),axis.text.y = element_text(size = 15),axis.title.x = element_text(size = 15),axis.title.y = element_text(size = 15),plot.title = element_text(size=15))+
  ggtitle("Main title")

> lbl_y
[1]  -2  -4  -6  -8 -10 -12 -14
> lbl_x
 [1] "01 Sep" "01 Okt" "01 Nov" "01 Dez" "01 Jan" "01 Feb" "01 Mrz" "01 Apr" "01 Mai"
[10] "01 Jun" "01 Jul" "01 Aug"

The basic issue, I believe, is that I do not know how to assign depth, time, and temperature from the data frame to the aes() call in the first row. Other examples use columns to specify that, but each column in my data frame holds the temperatures at one point in time, and as fill I want all temperatures plotted. Any suggestions on how to plot this with ggplot2, or on how to change the image.plot call above so that the axes can be set, are greatly appreciated.

As I mentioned in a comment, I think you need to gather your data, at least if it is laid out as shown, with time in columns and depth in rows. ggplot2 is designed to work with tidy data, where each row is an observation and each variable is stored in its own column. Here, that means you want just three columns, one each for depth, temp and time, so that each row is a single measurement. You can do this with the code below.

  1. Use gather to combine all the time columns into a single one
  2. Use separate to split the depth and time labels so that only the numeric part remains
  3. Use select to drop unneeded variables
  4. Use mutate_at to convert the values stored as strings into numbers

Then, ggplot becomes easy to use. geom_tile is designed for three main aesthetics: x, y, and fill. We just call geom_tile, map its aesthetics to the variables we want, and produce the plot below. I include scale_fill_viridis (from the viridis package), which changes the colours to perceptually uniform ones, but that isn't necessary. You might not need all these steps if your data isn't stored exactly as shown.

As far as the axis ticks go, you probably do want scale_x_continuous, but I am not sure what units your time values are in right now.
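If the time values are elapsed seconds relative to a known reference instant, one option is to convert them to date-times and let scale_x_datetime format the tick labels. The following is only a minimal sketch on synthetic data: the origin is a made-up placeholder (the real reference instant is not stated in the question), and the demo data frame and its temperature values are invented for illustration.

```r
library(ggplot2)

# ASSUMPTION: hypothetical reference instant for t = 0; replace with the real one
origin <- as.POSIXct("2019-01-01 00:00:00", tz = "UTC")

# Synthetic long-format data: 40 time steps of 3 h, 10 depths of 10 cm
demo <- expand.grid(
  time  = seq(-24548400, by = 10800, length.out = 40),  # elapsed seconds
  depth = (1:10) / 10                                   # metres
)
demo$temp <- 9 - 4 * demo$depth   # made-up temperatures, not real data

# POSIXct + numeric adds seconds, so this turns elapsed seconds into date-times
demo$datetime <- origin + demo$time

p_time <- ggplot(demo, aes(x = datetime, y = depth, fill = temp)) +
  geom_tile() +
  scale_x_datetime(date_labels = "%d %b", expand = c(0, 0)) +
  labs(x = "Time", y = "Depth (m)", fill = "Temp (°C)")
p_time
```

For a full year of data, adding date_breaks = "1 month" to scale_x_datetime should give roughly one tick per month, similar to the lbl_x labels in the question.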

For more info on tidy data and on ggplot, try these chapters.

library(tidyverse)  # readr, tidyr, dplyr, ggplot2
library(viridis)    # scale_fill_viridis()

# read_table2() is deprecated in readr >= 2.0, where read_table() replaces it
tbl <- read_table2(
  "depth   t=-24548400 t=-24537600 t=-24526800 t=-24516000 t=-24505200 t=-24494400
  z=0.1    9.000187    9.004622    9.009004    9.013332    9.017607    9.021829
  z=0.2    8.587763    8.592795    8.597776    8.602705    8.607583    8.612410
  z=0.3    8.179728    8.185313    8.190848    8.196334    8.201770    8.207157
  z=0.4    7.776561    7.782655    7.788702    7.794702    7.800653    7.806558
  z=0.5    7.378704    7.385267    7.391785    7.398256    7.404682    7.411062
  z=0.6    6.986564    6.993556    7.000504    7.007408    7.014268    7.021084
  z=0.7    6.600512    6.607894    6.615235    6.622533    6.629789    6.637003
  z=0.8    6.220886    6.228623    6.236319    6.243975    6.251591    6.259166
  z=0.9    5.847995    5.856050    5.864068    5.872046    5.879986    5.887887
  z=1      5.482113    5.490454    5.498759    5.507026    5.515257    5.523450"
)

tidy_tbl <- tbl %>%
  gather(key = "time", value = "temp", starts_with("t=")) %>%
  separate(depth, c("z", "depth"), sep = "=") %>%
  # split on "=" rather than "-" so the sign of the time value is kept
  # and non-negative times are parsed too
  separate(time, c("t", "time"), sep = "=") %>%
  select(-z, -t) %>%
  mutate_at(vars(depth, time), as.numeric)

tidy_tbl
# A tibble: 60 x 3
#    depth      time  temp
#    <dbl>     <dbl> <dbl>
#  1 0.100 -24548400  9.00
#  2 0.200 -24548400  8.59
#  3 0.300 -24548400  8.18
#  4 0.400 -24548400  7.78
#  5 0.500 -24548400  7.38
#  6 0.600 -24548400  6.99
#  7 0.700 -24548400  6.60
#  8 0.800 -24548400  6.22
#  9 0.900 -24548400  5.85
# 10 1.00  -24548400  5.48
# ... with 50 more rows

ggplot(data = tidy_tbl) +
  theme_bw() +
  geom_tile(aes(x = time, y = depth, fill = temp)) +
  scale_fill_viridis(name = "Temp") + 
  labs(x = "Time", y = "Depth")
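In the question the depths are stored as row names rather than in a column, so the reshaping needs one extra step. Below is a minimal sketch of the same idea using rownames_to_column from tibble and the newer pivot_longer from tidyr in place of gather. The three-row T_mod built here is only a toy stand-in for the real data frame, and tidy_T is a name chosen for illustration.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Toy stand-in for T_mod: depths in the row names, elapsed seconds in the column names
T_mod <- data.frame(
  c(9.000187, 8.587763, 8.179728),
  c(9.004622, 8.592795, 8.185313),
  row.names = c("z=0.1", "z=0.2", "z=0.3")
)
names(T_mod) <- c("t=-24548400", "t=-24537600")

tidy_T <- T_mod %>%
  rownames_to_column("depth") %>%                                # row names -> column
  pivot_longer(-depth, names_to = "time", values_to = "temp") %>%  # wide -> long
  mutate(
    depth = as.numeric(sub("^z=", "", depth)),  # "z=0.1" -> 0.1
    time  = as.numeric(sub("^t=", "", time))    # "t=-24548400" -> -24548400
  )

tidy_T
```

The result has one row per depth and time combination with columns depth, time and temp, which is the shape the geom_tile call above expects.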


I agree with the other answers that the data need to be reshaped into tidy format. I just wanted to add that geom_raster() rather than geom_tile() is generally the better option for large heatmaps. It is optimized for large raster datasets and is much faster. An example follows below (using the built-in volcano data, since I don't have your dataset).


library(ggplot2)

# create tidy version of volcano data (volcano is an 87 x 61 matrix)
nx <- 87
ny <- 61
volcano_data <- data.frame(height = c(volcano), x = rep(1:nx, ny), y = rep(1:ny, each = nx))

# take a look at the dataset. it's indeed tidy.
head(volcano_data)
#   height x y
# 1    100 1 1
# 2    101 2 1
# 3    102 3 1
# 4    103 4 1
# 5    104 5 1
# 6    105 6 1

# plot
ggplot(volcano_data, aes(x, y, fill=height)) + 
  geom_raster() + 
  coord_fixed(expand = FALSE)


geom_raster() also allows you to interpolate between adjacent colors for a smoother appearance. This may or may not be useful to you:

ggplot(volcano_data, aes(x, y, fill=height)) + 
  geom_raster(interpolate = TRUE) + 
  coord_fixed(expand = FALSE)
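For a depth-by-time grid like the one in the question, the same approach might look as follows. This is only a sketch on synthetic data: the grid data frame and its temperature formula are invented, and scale_y_reverse is my own addition to put shallow depths at the top. geom_raster assumes evenly spaced x and y values, which holds for a regular 10 cm by 3 h grid.

```r
library(ggplot2)

# Synthetic long-format grid: 150 depths x 200 time steps (made-up temperatures)
grid <- expand.grid(
  time  = seq(0, by = 10800, length.out = 200),  # elapsed seconds, 3 h steps
  depth = (1:150) / 10                           # metres below surface, 10 cm steps
)
grid$temp <- 8 + 6 * exp(-grid$depth / 3) * sin(2 * pi * grid$time / (200 * 10800))

p_raster <- ggplot(grid, aes(x = time, y = depth, fill = temp)) +
  geom_raster() +
  scale_y_reverse(expand = c(0, 0)) +     # shallow depths at the top
  scale_x_continuous(expand = c(0, 0)) +
  labs(x = "Elapsed time (s)", y = "Depth (m)", fill = "Temp (°C)")
p_raster
```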


