R: using customised function in dplyr
Sample data:
library(tidyverse)
set.seed(123)
dat <- tibble(
  year = rep(1980:2015, each = 100),
  day = rep(200:299, times = 36),
  rain = sample(0:17, size = 100 * 36, replace = TRUE),
  PETc = sample(rnorm(100 * 36)),
  ini.t = rep(10:45, each = 100)
)
I have a function that operates on a data frame:
my.func <- function(df, initial, thres, upper.limit){
  df$paw <- rep(NA, nrow(df))
  df$aetc <- rep(NA, nrow(df))
  df$sw <- rep(NA, nrow(df))
  for(n in 1:nrow(df)){
    # water available today: rain plus the state carried over from the previous row
    df$paw[n] <- df$rain[n] + initial
    # scale PETc down when paw is below the threshold, then cap it at paw
    df$aetc[n] <- ifelse(df$paw[n] >= thres, df$PETc[n], (df$paw[n]/thres) * df$PETc[n])
    df$aetc[n] <- ifelse(df$aetc[n] > df$paw[n], df$paw[n], df$aetc[n])
    # new state, clamped to the range [0, upper.limit]
    df$sw[n] <- initial + df$rain[n] - df$aetc[n]
    df$sw[n] <- ifelse(df$sw[n] > upper.limit, upper.limit, ifelse(df$sw[n] < 0, 0, df$sw[n]))
    # carry the state forward to the next row
    initial <- df$sw[n]
  }
  return(df)
}
thres <- 110
upper.limit <- 200
Applying the above function to a single year:
dat.1980 <- dat[dat$year == 1980,]
my.func(dat.1980, initial = dat.1980$ini.t[1], thres, upper.limit)
How do I apply this function to each year? I thought of using dplyr, something like:
dat %>% group_by(year) %>% # run my function on each year
Also, since there are 36 years (1980 to 2015), 36 data frames will be returned. How do I bind these data frames row-wise?
You were on the right track. do() lets you apply a function to each group:
dat %>%
  group_by(year) %>%
  do(my.func(., initial = head(.$ini.t, 1), thres, upper.limit))
# Groups: year [36]
# year day rain PETc ini.t paw aetc sw
# <int> <int> <int> <dbl> <int> <dbl> <dbl> <dbl>
# 1 1980 200 5 0.968 10 15.0 0.132 14.9
# 2 1980 201 14 0.413 10 28.9 0.108 28.8
# 3 1980 202 7 -0.912 10 35.8 -0.296 36.1
# 4 1980 203 15 -0.337 10 51.1 -0.156 51.2
# 5 1980 204 16 0.412 10 67.2 0.252 67.0
# 6 1980 205 0 -0.923 10 67.0 -0.562 67.5
# 7 1980 206 9 1.17 10 76.5 0.813 75.7
# 8 1980 207 16 0.0542 10 91.7 0.0452 91.7
# 9 1980 208 9 -0.293 10 101 -0.268 101
# 10 1980 209 8 0.0788 10 109 0.0781 109
# ... with 3,590 more rows
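In more recent dplyr releases do() is marked as superseded, and group_modify() is one possible replacement. The sketch below is only an illustration under stated assumptions: toy and running.balance are hypothetical stand-ins for dat and my.func, kept small so the result can be checked by hand. Like my.func, the stand-in takes one group's rows plus a starting value and returns the same rows with an extra column.

```r
library(dplyr)

# Hypothetical toy data: two years, three days each (not the data from the question)
toy <- tibble(
  year  = rep(c(1980L, 1981L), each = 3),
  rain  = c(1, 2, 3, 4, 5, 6),
  ini.t = rep(c(10, 20), each = 3)
)

# Hypothetical stand-in for my.func: a running balance that starts from `initial`
running.balance <- function(df, initial) {
  df$sw <- initial + cumsum(df$rain)
  df
}

res <- toy %>%
  group_by(year) %>%
  # .x holds the rows of one group, without the grouping column
  group_modify(~ running.balance(.x, initial = .x$ini.t[1])) %>%
  ungroup()

res
```

The per-group results come back already bound row-wise, with year restored as a column. With the objects from the question, the analogous call would presumably be group_modify(~ my.func(.x, initial = .x$ini.t[1], thres, upper.limit)).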
The purrr::map functions are the method du jour, but I think in this case it's a stylistic choice.
We can split by 'year' and then use map to apply my.func to each of the split datasets in the list:
library(purrr)
dat %>%
  split(.$year) %>%
  map_df(~my.func(.x, initial = .x$ini.t[1], thres, upper.limit))
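To make the "list of data frames, then bind row-wise" part of the question explicit, the same idea can be written in two steps: map() to get one data frame per year, then dplyr::bind_rows() to stack them. This is a minimal sketch on hypothetical toy data; toy and running.balance are illustrative stand-ins for dat and my.func, not part of the original post.

```r
library(dplyr)
library(purrr)

# Hypothetical toy data: two years, three days each
toy <- tibble(
  year  = rep(c(1980L, 1981L), each = 3),
  rain  = c(1, 2, 3, 4, 5, 6),
  ini.t = rep(c(10, 20), each = 3)
)

# Hypothetical stand-in for my.func: a running balance that starts from `initial`
running.balance <- function(df, initial) {
  df$sw <- initial + cumsum(df$rain)
  df
}

# Step 1: one data frame per year, held in a named list
pieces <- toy %>%
  split(.$year) %>%
  map(~ running.balance(.x, initial = .x$ini.t[1]))

# Step 2: bind the list elements row-wise into a single data frame
combined <- bind_rows(pieces)

combined
```

Keeping the intermediate list around can be handy for inspecting a single year, for example pieces[["1980"]], before combining everything.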