[英]Is there anyway to make this R code more efficient?
I'm applying a fertility model and to execute I need to save matrices for each childbearing intensity, which I called mati
, according to its order. 我正在应用生育力模型,要执行此操作,需要为每个生育强度保存矩阵,我将其称为mati
,根据其顺序。 In this case, i=(1, 2, 3,...n) The data frame bellow is an example of how my data is displayed. 在这种情况下,i =(1、2、3,... n)下面的数据框是如何显示我的数据的一个示例。 My real dataframe have 525 rows and 10 columns ( "AGE" "year" "mat1" "mat2" "mat3" "mat4" "mat5" "mat6" "mat7" "mat8"
). 我的实际数据帧有525行和10列( "AGE" "year" "mat1" "mat2" "mat3" "mat4" "mat5" "mat6" "mat7" "mat8"
)。
year <- c(rep(1998:2001, 4))
Age <- c(rep(15:18, 4))
mat1 <- c(rep(0.01, 16))
mat2 <- c(rep(0.012, 16))
mat3 <- c(rep(0.015, 16))
mat <- data.frame(year, Age, mat1, mat2, mat3)
mat
year Age mat1 mat2 mat3
1 1998 15 0.01 0.012 0.015
2 1999 16 0.01 0.012 0.015
3 2000 17 0.01 0.012 0.015
4 2001 18 0.01 0.012 0.015
5 1998 15 0.01 0.012 0.015
6 1999 16 0.01 0.012 0.015
7 2000 17 0.01 0.012 0.015
8 2001 18 0.01 0.012 0.015
9 1998 15 0.01 0.012 0.015
10 1999 16 0.01 0.012 0.015
11 2000 17 0.01 0.012 0.015
12 2001 18 0.01 0.012 0.015
13 1998 15 0.01 0.012 0.015
14 1999 16 0.01 0.012 0.015
15 2000 17 0.01 0.012 0.015
16 2001 18 0.01 0.012 0.015
To execute get my final numeric matrices I already executed the code below, but it takes a long time. 要执行以获得最终的数值矩阵,我已经执行了以下代码,但是需要很长时间。
##mat1###
library(dlyr)
library(tidyr)
mat1 <- #selecting just intensities of order 1 and creating matrices
select(mat, Age, year, mat1) %>%
spread(year, mat1)
names(mat1)[c(2:6)] <- paste0("year ", names(mat1[2:6])) #alter colnames
mat1[ ,1] <- paste0("age ", mat1[,1]) #alter the row from column "age"
mat_oe1 <- data.matrix(mat1[2:6])
dimnames(mat_oe1) <- list(c(mat1[,1]),
c(names(mat1[2:6])))
#Saving as txt to read i the model
write.table(mat_oe2, file = "mat_oe1.txt", sep = "\t",
row.names = T, col.names = T)
##mat2
mat2 <- #selecting just intensities of order 1 and creating matrices
select(mat, Age, year, mat2) %>%
spread(year, mat2)
names(mat2)[c(2:6)] <- paste0("year ", names(mat2[2:6])) #alter colnames
mat2[ ,1] <- paste0("age ", mat2[,1]) #alter the row from column "age"
mat_oe2 <- data.matrix(mat2[2:6])
dimnames(mat_oe2) <- list(c(mat1[,1]),
c(names(mat1[2:6])))
#Saving as txt to read i the model
write.table(mat_oe2, file = "mat_oe2.txt", sep = "\t",
row.names = T, col.names = T)
##mat3
mat3 <- #selecting just intensities of order 1 and creating matrices
select(mat, Age, year, mat3) %>%
spread(year, mat3)
names(mat3)[c(2:6)] <- paste0("year ", names(mat3[2:6])) #alter colnames
mat3[ ,1] <- paste0("age ", mat3[,1]) #alter the row from column "age"
mat_oe3 <- data.matrix(mat3[2:6])
dimnames(mat_oe3) <- list(c(mat3[,1]),
c(names(mat3[2:6])))
#Saving as txt to read i the model
write.table(mat_oe3, file = "mat_oe3.txt", sep = "\t",
row.names = T, col.names = T)
I'm using spread
because I need data with the format below: 我使用点spread
是因为我需要以下格式的数据:
mat1
1998 1999 2000 2001
15 0.01 0.01 0.01 0.01
16 0.01 0.01 0.01 0.01
17 0.01 0.01 0.01 0.01
18 0.01 0.01 0.01 0.01
I also started to write a loop, but I'm stuck in the first line already. 我也开始编写一个循环,但是我已经陷入了第一行。
mat_list <- list()
for(i in names(mat[,3:7])) {
mat_list[[i]] <- data.frame(
spread(
select(mat, AGE, year, mat[[paste0("mat",i)]]), year, mat[[paste0("mat", i)]]))
After applying the code above I've achieved the results below: 应用上面的代码后,我获得了以下结果:
view(mat1)
year 1998 year 1999 year 2000 year 2001
age 15 0.01 0.01 0.01 0.01
age 16 0.01 0.01 0.01 0.01
age 17 0.01 0.01 0.01 0.01
age 18 0.01 0.01 0.01 0.01
view(mat2)
year 1998 year 1999 year 2000 year 2001
age 15 0.012 0.012 0.012 0.012
age 16 0.012 0.012 0.012 0.012
age 17 0.012 0.012 0.012 0.012
age 18 0.012 0.012 0.012 0.012
view(mat3)
year 1998 year 1999 year 2000 year 2001
age 15 0.015 0.015 0.015 0.015
age 16 0.015 0.015 0.015 0.015
age 17 0.015 0.015 0.015 0.015
age 18 0.015 0.015 0.015 0.015
I believe you want to gather
and then spread
the data. 我相信您想gather
然后spread
数据。 This allows you to do everything in two steps. 这使您可以分两步执行所有操作。
library(dplyr)
library(tidyr)
mat %>%
gather(key, value, -year, -Age)%>%
spread(year, value)%>%
group_split(key)
[[1]]
# A tibble: 4 x 6
Age key `1998` `1999` `2000` `2001`
<int> <chr> <dbl> <dbl> <dbl> <dbl>
1 15 mat1 0.01 0.01 0.01 0.01
2 16 mat1 0.01 0.01 0.01 0.01
3 17 mat1 0.01 0.01 0.01 0.01
4 18 mat1 0.01 0.01 0.01 0.01
[[2]]
# A tibble: 4 x 6
Age key `1998` `1999` `2000` `2001`
<int> <chr> <dbl> <dbl> <dbl> <dbl>
1 15 mat2 0.012 0.012 0.012 0.012
2 16 mat2 0.012 0.012 0.012 0.012
3 17 mat2 0.012 0.012 0.012 0.012
4 18 mat2 0.012 0.012 0.012 0.012
[[3]]
# A tibble: 4 x 6
Age key `1998` `1999` `2000` `2001`
<int> <chr> <dbl> <dbl> <dbl> <dbl>
1 15 mat3 0.015 0.015 0.015 0.015
2 16 mat3 0.015 0.015 0.015 0.015
3 17 mat3 0.015 0.015 0.015 0.015
4 18 mat3 0.015 0.015 0.015 0.015
Or you can do it in base: 或者,您可以在基础中执行此操作:
mats <- reshape(data = data.frame(year = mat$year,Age = mat$Age, stack(mat, select = c('mat1', 'mat2', 'mat3')))
, idvar = c('Age', 'ind'), timevar = c('year'), direction = 'wide')
mat_list <- split(mats, mats$ind)
mat_list
$mat1
Age ind values.1998 values.1999 values.2000 values.2001
1 15 mat1 0.01 0.01 0.01 0.01
2 16 mat1 0.01 0.01 0.01 0.01
3 17 mat1 0.01 0.01 0.01 0.01
4 18 mat1 0.01 0.01 0.01 0.01
$mat2
Age ind values.1998 values.1999 values.2000 values.2001
17 15 mat2 0.012 0.012 0.012 0.012
18 16 mat2 0.012 0.012 0.012 0.012
19 17 mat2 0.012 0.012 0.012 0.012
20 18 mat2 0.012 0.012 0.012 0.012
$mat3
Age ind values.1998 values.1999 values.2000 values.2001
33 15 mat3 0.015 0.015 0.015 0.015
34 16 mat3 0.015 0.015 0.015 0.015
35 17 mat3 0.015 0.015 0.015 0.015
36 18 mat3 0.015 0.015 0.015 0.015
data I changed your data slightly so that each ID combination was unique. 数据我略微更改了您的数据,以便每个ID组合都是唯一的。
year <- rep(1998:2001, each = 4) #each was the change.
Age <- rep(15:18, 4)
mat1 <- rep(0.01, 16)
mat2 <- rep(0.012, 16)
mat3 <- rep(0.015, 16)
mat <- data.frame(year, Age, mat1, mat2, mat3)
To expand on Cole's answer. 扩大科尔的答案。
mat %>%
gather("mat", "val", -year, -Age) %>%
mutate(Age=paste("age",Age), year=paste("year",year)) %>%
group_by(mat) %>%
group_map(~spread(., year, val))
purrr::group_map applies a function to each group, and returns a list where each list element is the result of the function applied to each group. purrr :: group_map将函数应用于每个组,并返回一个列表,其中每个列表元素是应用于每个组的函数的结果。
# A tibble: 4 x 5
Age `year 1998` `year 1999` `year 2000` `year 2001`
<chr> <dbl> <dbl> <dbl> <dbl>
1 age 15 0.01 0.01 0.01 0.01
2 age 16 0.01 0.01 0.01 0.01
3 age 17 0.01 0.01 0.01 0.01
4 age 18 0.01 0.01 0.01 0.01
[[2]]
# A tibble: 4 x 5
Age `year 1998` `year 1999` `year 2000` `year 2001`
<chr> <dbl> <dbl> <dbl> <dbl>
1 age 15 0.012 0.012 0.012 0.012
2 age 16 0.012 0.012 0.012 0.012
3 age 17 0.012 0.012 0.012 0.012
4 age 18 0.012 0.012 0.012 0.012
[[3]]
# A tibble: 4 x 5
Age `year 1998` `year 1999` `year 2000` `year 2001`
<chr> <dbl> <dbl> <dbl> <dbl>
1 age 15 0.015 0.015 0.015 0.015
2 age 16 0.015 0.015 0.015 0.015
3 age 17 0.015 0.015 0.015 0.015
4 age 18 0.015 0.015 0.015 0.015
This was using the data slightly modified by Cole. 这是使用Cole稍作修改的数据。
year <- rep(1998:2001, each = 4) #each was the change.
Age <- rep(15:18, 4)
mat1 <- rep(0.01, 16)
mat2 <- rep(0.012, 16)
mat3 <- rep(0.015, 16)
mat <- data.frame(year, Age, mat1, mat2, mat3)
First reshape to long 第一次重塑到长
#add unique id to your data
mat$id=1:nrow(mat)
#reshape to long by mat
long1 = reshape_toLong(data = mat,id = "id",j = "all123",value.var.prefix = "mat")
#delet id column
long2=long1[,-1]
Second reshape to wide 第二次重塑
#reshape wide by year
wide=reshape_toWide(data = long2,id = "all123",j = "year",value.var.prefix = "mat")
Last get data 最后获取数据
mat1 垫1
wide[wide$all123==1,]
Age all123 mat1998 mat1999 mat2000 mat2001
1 15 1 0.01 0.01 0.01 0.01
4 16 1 0.01 0.01 0.01 0.01
8 17 1 0.01 0.01 0.01 0.01
12 18 1 0.01 0.01 0.01 0.01
mat2 垫2
wide[wide$all123==2,]
Age all123 mat1998 mat1999 mat2000 mat2001
3 15 2 0.012 0.012 0.012 0.012
5 16 2 0.012 0.012 0.012 0.012
7 17 2 0.012 0.012 0.012 0.012
11 18 2 0.012 0.012 0.012 0.012
mat3 垫3
wide[wide$all123==3,]
Age all123 mat1998 mat1999 mat2000 mat2001
2 15 3 0.015 0.015 0.015 0.015
6 16 3 0.015 0.015 0.015 0.015
9 17 3 0.015 0.015 0.015 0.015
10 18 3 0.015 0.015 0.015 0.015
before use reshape_toLong
and reshape_toWide
functions, you need to install onetree
package from my github yikeshu0611
using command below 在使用reshape_toLong
和reshape_toWide
函数之前,您需要使用以下命令从我的github yikeshu0611
安装onetree
包
devtools::install_github("yikeshu0611/onetree")
library(onetree)
Attention : The data you provided has problem, so I use data changed by Cole 注意 :您提供的数据有问题,所以我使用由科尔更改的数据
year <- rep(1998:2001, each = 4) #each was the change.
Age <- rep(15:18, 4)
mat1 <- rep(0.01, 16)
mat2 <- rep(0.012, 16)
mat3 <- rep(0.015, 16)
mat <- data.frame(year, Age, mat1, mat2, mat3)
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.