Heatmap in R with ggplot2
I have a data frame T_mod with 150 observations and 2920 variables, containing subsurface temperature values in °C over one year. It looks like this:
> T_mod[1:10, 1:6]
t=-24548400 t=-24537600 t=-24526800 t=-24516000 t=-24505200 t=-24494400
z=0.1 9.000187 9.004622 9.009004 9.013332 9.017607 9.021829
z=0.2 8.587763 8.592795 8.597776 8.602705 8.607583 8.612410
z=0.3 8.179728 8.185313 8.190848 8.196334 8.201770 8.207157
z=0.4 7.776561 7.782655 7.788702 7.794702 7.800653 7.806558
z=0.5 7.378704 7.385267 7.391785 7.398256 7.404682 7.411062
z=0.6 6.986564 6.993556 7.000504 7.007408 7.014268 7.021084
z=0.7 6.600512 6.607894 6.615235 6.622533 6.629789 6.637003
z=0.8 6.220886 6.228623 6.236319 6.243975 6.251591 6.259166
z=0.9 5.847995 5.856050 5.864068 5.872046 5.879986 5.887887
z=1 5.482113 5.490454 5.498759 5.507026 5.515257 5.523450
The row names stand for depth, in 10 cm increments from 0.1 m to 15 m underground. The column names indicate time in elapsed seconds. The cell values are temperatures in °C, one for each point in time at a given depth.
I want to create a heatmap showing temperature, with time on the x-axis and depth on the y-axis. The plot below is created with the image.plot function (from the fields package, which builds on R base graphics) using the following code:
image.plot(z = t(as.matrix(T_mod[150:1,])), legend.lab = "Temperature (°C)",
ylab = "Depth (m)", xlab = "Time")
The x axis represents time (one year in 3 h intervals) and the y axis represents depth (0 to 15 m in 10 cm increments). Z values are temperatures for a given point in time and a specific depth. Obviously, the axis ticks and tick labels make little sense as of now. The problem is that the image and image.plot functions are somewhat rigid and do not make it easy to adjust axis ticks, labels, etc.
Now, someone has pointed me towards ggplot2 for greater flexibility in adjusting plot parameters, but I have not used ggplot so far. Consequently, the code below does not work.
ggplot(T_mod, aes(x=time, y=Depth, z=Temperature)) +
geom_tile(aes(fill=Temperature)) +
theme(panel.background = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.line = element_line(colour = "black"),
panel.border = element_rect(colour = "black", fill=NA, size=2))+
ylab("Depth")+
xlab("Time")+
# possibly use stat_contour(binwidth = 0.1,aes(colour = ..level..),size=0.1) +
# ... and scale_fill_gradient(low = "red", high = "Green") +
# ... and scale_colour_gradient(low = "black", high = "black",guide = "none")+
scale_y_continuous(expand = c(0,0),breaks=seq(20, 140, 20),limits=c(20,140),labels=lbl_y)+
scale_x_continuous(expand = c(0,0),breaks=seq(124, 2796, 240),limits=c(124,2796),labels=lbl_x)+
coord_cartesian(ylim=c(1,150),xlim=c(1,2920))+
theme(axis.text.x = element_text(size = 15),axis.text.y = element_text(size = 15),axis.title.x = element_text(size = 15),axis.title.y = element_text(size = 15),plot.title = element_text(size=15))+
ggtitle("Main title")
> lbl_y
[1] -2 -4 -6 -8 -10 -12 -14
> lbl_x
[1] "01 Sep" "01 Okt" "01 Nov" "01 Dez" "01 Jan" "01 Feb" "01 Mrz" "01 Apr" "01 Mai"
[10] "01 Jun" "01 Jul" "01 Aug"
The basic issue, I believe, is that I do not know how to assign depth, time, and temperature from the data frame to the aes() call in the first row. Other examples use columns to specify that, but the columns in my data frame each hold the temperatures at one point in time, and as fill I want all temperatures plotted. Any suggestions on how to plot this with ggplot2, or on how to change the image.plot call above so that the axes can be set, are greatly appreciated.
I mentioned in the comment that I think you need to gather your data, at least if it is stored as shown, with time in columns and depth in rows. ggplot2 is designed to work with tidy data, where each row is an observation and variables are stored in columns. Here, that means you want just three columns, one for each of depth, temp and time, and each row is then a single measurement. You can do this with the code below.
- gather to combine all the time columns into a single one
- separate to split the time and row labels and keep just the numeric part
- select to drop unneeded variables
- mutate_at to convert the values stored as strings into numbers

Then ggplot becomes easy to use. geom_tile is designed for three main aesthetics: x, y, and fill. We just call geom_tile and map its aesthetics to the variables we want, which produces the plot below. I include scale_fill_viridis, which changes the colours to perceptually uniform ones, but that isn't necessary. You might not need all these steps if your data isn't stored exactly as shown.
As far as the axis ticks go, you probably do want scale_x_continuous, but I am not sure what units your time values are in right now.
For more info on tidy data and on ggplot, try these chapters.
library(tidyverse)
library(viridis)
tbl <- read_table2(
"depth t=-24548400 t=-24537600 t=-24526800 t=-24516000 t=-24505200 t=-24494400
z=0.1 9.000187 9.004622 9.009004 9.013332 9.017607 9.021829
z=0.2 8.587763 8.592795 8.597776 8.602705 8.607583 8.612410
z=0.3 8.179728 8.185313 8.190848 8.196334 8.201770 8.207157
z=0.4 7.776561 7.782655 7.788702 7.794702 7.800653 7.806558
z=0.5 7.378704 7.385267 7.391785 7.398256 7.404682 7.411062
z=0.6 6.986564 6.993556 7.000504 7.007408 7.014268 7.021084
z=0.7 6.600512 6.607894 6.615235 6.622533 6.629789 6.637003
z=0.8 6.220886 6.228623 6.236319 6.243975 6.251591 6.259166
z=0.9 5.847995 5.856050 5.864068 5.872046 5.879986 5.887887
z=1 5.482113 5.490454 5.498759 5.507026 5.515257 5.523450"
)
tidy_tbl <- tbl %>%
  # stack all "t=..." columns into a time/temp pair of columns
  gather(key = "time", value = "temp", starts_with("t=")) %>%
  # split "z=0.1" into "z" and "0.1"
  separate(depth, c("z", "depth"), sep = "=") %>%
  # split "t=-24548400" on "=" (not on "-") so the minus sign of the time is kept
  separate(time, c("t", "time"), sep = "=") %>%
  select(-z, -t) %>%
  mutate_at(vars(depth, time), as.numeric) %>%
  print()
# A tibble: 60 x 3
depth time temp
<dbl> <dbl> <dbl>
1 0.100 -24548400 9.00
2 0.200 -24548400 8.59
3 0.300 -24548400 8.18
4 0.400 -24548400 7.78
5 0.500 -24548400 7.38
6 0.600 -24548400 6.99
7 0.700 -24548400 6.60
8 0.800 -24548400 6.22
9 0.900 -24548400 5.85
10 1.00 -24548400 5.48
# ... with 50 more rows
ggplot(data = tidy_tbl) +
theme_bw() +
geom_tile(aes(x = time, y = depth, fill = temp)) +
scale_fill_viridis(name = "Temp") +
labs(x = "Time", y = "Depth")
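The example above reads depth as an ordinary column, whereas the question's T_mod appears to keep the depths in the row names. A minimal base-R sketch of the same reshaping under that assumption follows; the three-by-two T_mod built here is only a toy stand-in using values from the printout in the question, and the prefix patterns "z=" and "t=" are taken from that printout.

```r
# Toy stand-in for the question's T_mod: depths in row names, times in column names.
T_mod <- data.frame(
  "t=-24548400" = c(9.000187, 8.587763, 8.179728),
  "t=-24537600" = c(9.004622, 8.592795, 8.185313),
  row.names = c("z=0.1", "z=0.2", "z=0.3"),
  check.names = FALSE
)

# Strip the "z=" / "t=" prefixes; the minus sign of the time values is preserved.
depth <- as.numeric(sub("^z=", "", rownames(T_mod)))
time  <- as.numeric(sub("^t=", "", colnames(T_mod)))

# as.matrix() is flattened column by column, so depth cycles fastest
# and each time value is repeated once per depth.
tidy_tbl <- data.frame(
  depth = rep(depth, times = length(time)),
  time  = rep(time, each = length(depth)),
  temp  = as.vector(as.matrix(T_mod))
)

print(tidy_tbl)
```

The resulting data frame has the same three columns (depth, time, temp) as the tidyverse version, so it can be passed to the same ggplot call.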
I agree with the other statements that the data need to be reshaped into tidy format. I just wanted to add that geom_raster() rather than geom_tile() is generally the better option for large heatmaps. It is optimized for large raster datasets in which all tiles have the same size, and it is much faster. An example follows below (using the built-in volcano data, since I don't have your dataset).
library(ggplot2)
library(viridis)
# create tidy version of volcano data
nx <- nrow(volcano)  # 87
ny <- ncol(volcano)  # 61
volcano_data <- data.frame(height = c(volcano), x = rep(1:nx, ny), y = rep(1:ny, each = nx))
# take a look at the dataset. it's indeed tidy.
head(volcano_data)
# height x y
# 1 100 1 1
# 2 101 2 1
# 3 102 3 1
# 4 103 4 1
# 5 104 5 1
# 6 105 6 1
# plot
ggplot(volcano_data, aes(x, y, fill=height)) +
geom_raster() +
coord_fixed(expand = FALSE) +
scale_fill_viridis()
geom_raster() also allows you to interpolate between adjacent colors for a smoother appearance. This may or may not be useful to you:
ggplot(volcano_data, aes(x, y, fill=height)) +
geom_raster(interpolate = TRUE) +
coord_fixed(expand = FALSE) +
scale_fill_viridis()
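For data laid out like the question's (a regular grid of 150 depths and 2920 three-hourly time steps), the ideas from both answers might be combined roughly as in the sketch below. Several things here are assumptions and not taken from the thread: the temperatures are synthetic, the origin date used to turn elapsed seconds into calendar dates is hypothetical (the question does not say what t = 0 refers to), and scale_fill_viridis_c() from ggplot2 stands in for scale_fill_viridis() so that only one package is needed.

```r
library(ggplot2)

# Grid matching the question's description: 3 h steps over one year, 10 cm depth steps.
time_s  <- seq(-24548400, by = 10800, length.out = 2920)
depth_m <- (1:150) / 10

# Synthetic temperatures (made up for illustration only).
tidy_tbl <- expand.grid(depth = depth_m, time = time_s)
tidy_tbl$temp <- with(
  tidy_tbl,
  8 + 10 * exp(-depth / 3) * sin(2 * pi * time / (365 * 86400) - depth / 3)
)

# ASSUMPTION: hypothetical reference date for t = 0; replace with the real one.
origin <- as.POSIXct("2000-01-01 00:00:00", tz = "UTC")
first  <- origin + min(time_s)
last   <- origin + max(time_s)

# First day of every month inside the data range, expressed in elapsed seconds.
month_starts <- seq(as.POSIXct(format(first, "%Y-%m-01"), tz = "UTC"), last, by = "month")
month_starts <- month_starts[month_starts >= first]
breaks_s     <- as.numeric(difftime(month_starts, origin, units = "secs"))
break_labels <- format(month_starts, "%d %b")

p <- ggplot(tidy_tbl, aes(x = time, y = depth, fill = temp)) +
  geom_raster() +
  scale_x_continuous(breaks = breaks_s, labels = break_labels, expand = c(0, 0)) +
  scale_y_reverse(expand = c(0, 0)) +  # depth increases downwards
  scale_fill_viridis_c(name = "Temperature (°C)") +
  labs(x = "Time", y = "Depth (m)") +
  theme_bw()

print(p)
```

Keeping time numeric and supplying breaks and labels by hand stays close to the scale_x_continuous suggestion above; the month names in the labels depend on the session's locale.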