I am working with some temperature data where I have temperatures at certain depths, e.g. 0.9 m, 2.5 m and 5 m. I would like to interpolate these values so that I obtain the temperature at each whole meter, e.g. 1 m, 2 m and 3 m. The original data looks like this:
df
# A tibble: 5 x 3
  date                d_0.9 d_2.5
  <dttm>              <dbl> <dbl>
1 2004-01-05 03:00:00  7        8
2 2004-01-05 04:00:00  7.5      9
3 2004-01-05 05:00:00  7        8
4 2004-01-05 06:00:00  6.92    NA
What I would like to get is something like:
df_int
# A tibble: 5 x 5
  date                d_0.9     d_1     d_2 d_2.5
  <dttm>              <dbl>   <dbl>   <dbl> <dbl>
1 2004-01-05 03:00:00  7    7.0625  7.6875      8
2 2004-01-05 04:00:00  7.5  7.59375 8.53125     9
3 2004-01-05 05:00:00  7    7.0625  7.6875      8
4 2004-01-05 06:00:00  6.92 NA      NA         NA
I have to do this for a very large data frame. Is there an efficient way of doing it?
Many thanks in advance
One option is to convert the data to long format, use a join to add rows for the depths we want to interpolate at, and then use approx for the interpolation:
library(tidyverse)

# Data
df = tibble(date = seq(as.POSIXct("2004-01-05 03:00:00"),
                       as.POSIXct("2004-01-05 06:00:00"),
                       by = "1 hour"),
            d_0.9 = c(7, 7.5, 7, 6.92),
            d_2.5 = c(8, NA, 8, NA),
            d_5.0 = c(10, 10.5, 9.4, NA))

# Create a data frame with all of the times and depths we want to interpolate at
depths = sort(unique(c(c(0.9, 2.5, 5), seq(ceiling(0.9), floor(5), 1))))
depths = crossing(date = unique(df$date), depth = depths)

# Convert data to long format, join to add interpolation depths, then interpolate
df.interp = df %>%
  gather(depth, value, -date) %>%
  mutate(depth = as.numeric(gsub("d_", "", depth))) %>%
  full_join(depths, by = c("date", "depth")) %>%
  arrange(date, depth) %>%
  group_by(date) %>%
  # approx needs at least two non-missing values, so leave other dates as they are
  mutate(value.interp = if (length(na.omit(value)) > 1) {
    approx(depth, value, xout = depth)$y
  } else {
    value
  })
In the code above, the if statement is included to prevent approx throwing an error when a given date has fewer than two non-missing values.
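To make the role of that guard concrete, here is a minimal base R sketch using the 03:00 and 06:00 profiles from the example data. The variable names (depth, temp, interp, by_hand, single) are illustrative only and are not part of the answer's code.

```r
# Depths (m) and temperatures for the 03:00 profile from the example data
depth <- c(0.9, 2.5, 5)
temp  <- c(7, 8, 10)

# Linear interpolation at the whole-meter depths 1, 2, 3 and 4
interp <- approx(depth, temp, xout = 1:4)$y

# The value at 1 m computed by hand from the two bracketing measurements
by_hand <- 7 + (1 - 0.9) / (2.5 - 0.9) * (8 - 7)

# The 06:00 profile has a single non-missing value; approx() stops with an
# error in that case, which is captured here as a message string
single <- tryCatch(
  approx(depth, c(6.92, NA, NA), xout = 1:4)$y,
  error = function(e) conditionMessage(e)
)
```

Here interp[1] and by_hand should agree, and single holds the error message rather than interpolated values, which is the situation the if/else branch sidesteps by returning value unchanged.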
df.interp
                  date depth value value.interp
1  2004-01-05 03:00:00   0.9  7.00     7.000000
2  2004-01-05 03:00:00   1.0    NA     7.062500
3  2004-01-05 03:00:00   2.0    NA     7.687500
4  2004-01-05 03:00:00   2.5  8.00     8.000000
5  2004-01-05 03:00:00   3.0    NA     8.400000
6  2004-01-05 03:00:00   4.0    NA     9.200000
7  2004-01-05 03:00:00   5.0 10.00    10.000000
8  2004-01-05 04:00:00   0.9  7.50     7.500000
9  2004-01-05 04:00:00   1.0    NA     7.573171
10 2004-01-05 04:00:00   2.0    NA     8.304878
11 2004-01-05 04:00:00   2.5    NA     8.670732
12 2004-01-05 04:00:00   3.0    NA     9.036585
13 2004-01-05 04:00:00   4.0    NA     9.768293
14 2004-01-05 04:00:00   5.0 10.50    10.500000
15 2004-01-05 05:00:00   0.9  7.00     7.000000
16 2004-01-05 05:00:00   1.0    NA     7.062500
17 2004-01-05 05:00:00   2.0    NA     7.687500
18 2004-01-05 05:00:00   2.5  8.00     8.000000
19 2004-01-05 05:00:00   3.0    NA     8.280000
20 2004-01-05 05:00:00   4.0    NA     8.840000
21 2004-01-05 05:00:00   5.0  9.40     9.400000
22 2004-01-05 06:00:00   0.9  6.92     6.920000
23 2004-01-05 06:00:00   1.0    NA           NA
24 2004-01-05 06:00:00   2.0    NA           NA
25 2004-01-05 06:00:00   2.5    NA           NA
26 2004-01-05 06:00:00   3.0    NA           NA
27 2004-01-05 06:00:00   4.0    NA           NA
28 2004-01-05 06:00:00   5.0    NA           NA
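The question asks for a wide result with one column per depth, whereas the output above is in long format. One possible way to reshape it, sketched here under the assumption that tidyr's pivot_wider is available, is shown on a small stand-in table. The names long and wide are hypothetical, and the values are copied from the output above.

```r
library(dplyr)
library(tidyr)

# Small stand-in for the long-format result (two dates, three depths)
long <- tibble(
  date = rep(as.POSIXct(c("2004-01-05 03:00:00", "2004-01-05 06:00:00"),
                        tz = "UTC"), each = 3),
  depth = rep(c(0.9, 1, 2), times = 2),
  value.interp = c(7, 7.0625, 7.6875, 6.92, NA, NA)
)

# One row per date, one column per depth, with the "d_" prefix restored
wide <- long %>%
  select(date, depth, value.interp) %>%
  pivot_wider(names_from = depth,
              values_from = value.interp,
              names_prefix = "d_")
```

If this is applied to the grouped df.interp from the answer, calling ungroup() before select is probably a sensible precaution.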