Simulation in R with nested loops runs slowly
I am using R for an agent-based historical simulation. The code works, but slowly. It loops through timesteps, updating a data frame of agent attributes and another holding a summary of the overall state after each timestep (a generation). Looping above that are a few runs of each parameter setting. Although it begins with 100 agents, under extreme settings (high S, low A) the population can grow above a thousand after, e.g., five generations.
I read that updating a matrix is faster than updating a data frame, so I converted the summary to a matrix. But I also hear that vectorisation is best, so before I change the agents to a matrix, could anyone suggest a way to make the code more vectorised? Here is the code:
NextGeneration <- function(agent, N, S, A) {
  # N is number of agents.
  # S is probability that an agent with traditional fertility will have 2 sons surviving to the age of inheritance.
  # A is probability that an heir experiencing division of estate changes his fertility preference from traditional to planned.

  # find number of surviving heirs for each agent
  excess <- runif(N)   # get random numbers
  heir <- rep(1, N)    # everyone has at least 1 surviving heir
  # if agent has traditional fertility 2 heirs may survive to inherit
  heir[agent$fertility == "Trad" & excess < S] <- 2

  # next generation more numerous if spare heirs survive
  # new agents have vertical inheritance but also guided variation.
  # first append to build a vector, then combine into new agent dataframe
  nextgen.fertility <- NULL
  nextgen.lineage <- NULL
  for (i in 1:N) {
    if (heir[i] == 2) {
      # two agents inherit from one parent.
      for (j in 1:2) {
        # A is probability of inheritance division event affecting fertility preference in new generation.
        if (A > runif(1)) {
          nextgen.fertility <- c(nextgen.fertility, "Plan")
        } else {
          nextgen.fertility <- c(nextgen.fertility, agent$fertility[i])
        }
        nextgen.lineage <- c(nextgen.lineage, agent$lineage[i])
      }
    } else {
      nextgen.fertility <- c(nextgen.fertility, agent$fertility[i])
      nextgen.lineage <- c(nextgen.lineage, agent$lineage[i])
    }
  }

  # assemble new agent frame
  nextgen.agent <- data.frame(nextgen.fertility, nextgen.lineage, stringsAsFactors = FALSE)
  names(nextgen.agent) <- c("fertility", "lineage")
  nextgen.agent
}
So the agents begin like this (Trad = traditional):
ID  fertility  lineage
1   Trad       1
2   Trad       2
3   Trad       3
4   Trad       4
5   Trad       5
and after a few timesteps (generations) of random changes end up something like this:
ID  fertility  lineage
1   Plan       1
2   Plan       1
3   Trad       2
4   Plan       3
5   Trad       3
6   Trad       4
7   Plan       4
8   Plan       4
9   Plan       4
10  Plan       5
11  Trad       5
Indeed, it would be more efficient to encode fertility with 0 and 1, and you could even have an integer matrix.
Anyhow, the code as it stands can be simplified a lot, so here is a vectorized solution, still using your data.frame:
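NextGen <- function(agent, N, S, A) {
  excess <- runif(N)
  # rows of traditional agents whose estate is divided between two heirs
  v1 <- which(agent$fertility == "Trad" & excess < S)
  # keep every agent once and append one extra row per second heir
  nextgen.agent <- agent[c(1:N, v1), ]
  # both heirs of a divided estate switch to "Plan" with probability A;
  # N + seq_along(v1) is empty when no estate is divided
  heirs <- c(v1, N + seq_along(v1))
  nextgen.agent$fertility[heirs] <- ifelse(A > runif(length(heirs)), "Plan", "Trad")
  nextgen.agent
}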
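The integer-encoding idea mentioned at the start could look roughly like the sketch below. It is only an illustration under assumptions that are not in the original post: fertility is coded as 0L for "Trad" and 1L for "Plan", the name next_gen_matrix is made up, and the number of agents is taken from nrow(agent) instead of being passed in as N. It also uses rep() to repeat parent rows, which keeps siblings next to each other in the result.

```r
# Sketch only. Assumed encoding (not from the original post):
# fertility 0L = "Trad", 1L = "Plan"; agents are rows of an integer matrix.
next_gen_matrix <- function(agent, S, A) {
  n <- nrow(agent)
  # Traditional agents leave two heirs with probability S, everyone else one.
  two_heirs <- agent[, "fertility"] == 0L & runif(n) < S
  heirs <- ifelse(two_heirs, 2L, 1L)
  # Repeat each parent's row once per heir, so siblings stay adjacent.
  parent <- rep(seq_len(n), times = heirs)
  nextgen <- agent[parent, , drop = FALSE]
  # Heirs of a divided estate switch to planned fertility with probability A.
  divided <- two_heirs[parent]
  switched <- divided & (runif(length(parent)) < A)
  nextgen[switched, "fertility"] <- 1L
  nextgen
}

set.seed(1)
agent_mat <- cbind(fertility = rep(0L, 50), lineage = 1:50)
res <- next_gen_matrix(agent_mat, S = 0.8, A = 0.8)
```

Because every column is an integer, the whole population stays in one integer matrix, and the per-generation work is a handful of vector operations with no loop over agents.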
Testing with a sample agent DF as follows:
agentDF <- data.frame(fertility = "Trad", lineage = 1:50, stringsAsFactors = FALSE)
# use microbenchmark library to compare performance
microbenchmark::microbenchmark(
  base = {
    res1 <- NextGeneration(agentDF, 50, 0.8, 0.8) # note I fixed the two variable typos in your function
  },
  new = {
    res2 <- NextGen(agentDF, 50, 0.8, 0.8)
  },
  times = 100
)
## Unit: microseconds
##  expr      min        lq     mean    median       uq       max neval
##  base 1998.533 2163.8605 2446.561 2222.8200 2286.844 14413.173   100
##   new  282.032  304.1165  329.552  320.3255  348.488   467.217   100
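To show how a vectorised step might slot into the timestep loop described in the question, here is a rough sketch that pre-allocates the summary matrix once and fills one row per generation. The names step_generation and run_simulation, and the summary columns, are hypothetical and chosen only for this illustration; the step function is a self-contained stand-in that follows the same rules as NextGen.

```r
# Sketch only: function names and summary columns are assumptions for illustration.
step_generation <- function(agent, S, A) {
  n <- nrow(agent)
  # Traditional agents whose estate is divided between two heirs.
  divided <- which(agent$fertility == "Trad" & runif(n) < S)
  # Keep every agent once and append one extra row per second heir.
  nextgen <- agent[c(seq_len(n), divided), ]
  heirs <- c(divided, n + seq_along(divided))
  # Heirs of a divided estate switch to "Plan" with probability A.
  nextgen$fertility[heirs] <- ifelse(A > runif(length(heirs)), "Plan", "Trad")
  rownames(nextgen) <- NULL
  nextgen
}

run_simulation <- function(agent, generations, S, A) {
  # One summary row per generation, plus row 1 for the initial state.
  summary_mat <- matrix(NA_real_, nrow = generations + 1, ncol = 3,
                        dimnames = list(NULL, c("generation", "population", "prop_plan")))
  summary_mat[1, ] <- c(0, nrow(agent), mean(agent$fertility == "Plan"))
  for (g in seq_len(generations)) {
    agent <- step_generation(agent, S, A)
    summary_mat[g + 1, ] <- c(g, nrow(agent), mean(agent$fertility == "Plan"))
  }
  list(agent = agent, summary = summary_mat)
}

set.seed(42)
start <- data.frame(fertility = "Trad", lineage = 1:100, stringsAsFactors = FALSE)
out <- run_simulation(start, generations = 5, S = 0.8, A = 0.2)
out$summary
```

The loop over generations remains, since each generation depends on the previous one, but the work inside each timestep is vectorised and the summary matrix never has to grow.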