简体   繁体   English

在R中,将数据框对角线转换为行

[英]In R, convert data frame diagonals to rows

I'm developing a model that forecasts completed fertility for an age cohort. 我正在开发一个模型,预测一个年龄组的生育能力。 I currently have a data frame like this, where the rows are ages and the columns are years. 我目前有一个这样的数据框,其中行是年龄,列是年。 The value in each cell is age-specific fertility for that year: 每个细胞的价值是该年度的特定年龄生育率:

> df1
   iso3    sex age fert1953 fert1954 fert1955
14  AUS female  13    0.000  0.00000  0.00000
15  AUS female  14    0.000  0.00000  0.00000
16  AUS female  15   13.108 13.42733 13.74667
17  AUS female  16   26.216 26.85467 27.49333
18  AUS female  17   39.324 40.28200 41.24000

However, what I want is each row to be a cohort. 但是,我想要的是每一行都是一个队列。 Because the rows and columns represent individual years, the cohort data can be obtained by getting the diagonal. 因为行和列表示各个年份,所以可以通过获得对角线来获得群组数据。 I'm looking for a result like this: 我正在寻找这样的结果:

> df2
   iso3    sex ageIn1953 fert1953  fert1954  fert1955
14  AUS female        13    0.000   0.00000  13.74667
15  AUS female        14    0.000  13.42733  27.49333
16  AUS female        15   13.108  26.85467  41.24000
17  AUS female        16   26.216  40.28200  [data..] 
18  AUS female        17   39.324  [data..]  [data..] 

Here's the df1 data frame: 这是df1数据框:

df1 <- structure(list(iso3 = c("AUS", "AUS", "AUS", "AUS", "AUS"), sex = c("female", 
"female", "female", "female", "female"), age = c(13, 14, 15, 
16, 17), fert1953 = c(0, 0, 13.108, 26.216, 39.324), fert1954 = c(0, 
0, 13.4273333333333, 26.8546666666667, 40.282), fert1955 = c(0, 
0, 13.7466666666667, 27.4933333333333, 41.24)), .Names = c("iso3", 
"sex", "age", "fert1953", "fert1954", "fert1955"), class = "data.frame", row.names = 14:18)

EDIT: 编辑:

Here's the solution I ultimately used. 这是我最终使用的解决方案。 It's based on David's answer, but I needed to do this for each level of iso3 . 它基于David的答案,但我需要为iso3每个级别执行此iso3

df.ls <- lapply(split(f3, f = f3$iso3), FUN = function(df1) {
  n <- ncol(df1) - 4
  temp <- mapply(function(x, y) lead(x, n = y), df1[, -seq_len(4)], seq_len(n))
  return(cbind(df1[seq_len(4)], temp))
})
f4 <- do.call("rbind", df.ls)

I haven't tested for speed, but data.table v1.9.5 , recently implemented a new (written in C) lead/lag function called shift 我没有测试的速度,但data.table v1.9.5 ,最近实施了一项新(用C语言编写)超前/滞后函数调用的shift

So for the columns you want to shift, you could potentially use it combined with mapply , for example 因此,对于要移动的列,您可以将其与mapply结合使用,例如

library(data.table)
n <- ncol(df1) - 4 # the number of years - 1
temp <- mapply(function(x, y) shift(x, n = y, type = "lead"), df1[, -seq_len(4)], seq_len(n))
cbind(df1[seq_len(4)], temp) # combining back with the unchanged columns
#    iso3    sex age fert1953 fert1954 fert1955
# 14  AUS female  13    0.000  0.00000 13.74667
# 15  AUS female  14    0.000 13.42733 27.49333
# 16  AUS female  15   13.108 26.85467 41.24000
# 17  AUS female  16   26.216 40.28200       NA
# 18  AUS female  17   39.324       NA       NA

Edit: You can easily install the development version of data.table from GitHub using 编辑:您可以使用GitHub轻松安装data.table的开发版本

library(devtools) 
install_github("Rdatatable/data.table", build_vignettes = FALSE)

Either way, if you want dplyr , here goes 无论哪种方式,如果你想要dplyr ,这里就是

library(dplyr)
n <- ncol(df1) - 4 # the number of years - 1
temp <- mapply(function(x, y) lead(x, n = y), df1[, -seq_len(4)], seq_len(n))
cbind(df1[seq_len(4)], temp)
#    iso3    sex age fert1953 fert1954 fert1955
# 14  AUS female  13    0.000  0.00000 13.74667
# 15  AUS female  14    0.000 13.42733 27.49333
# 16  AUS female  15   13.108 26.85467 41.24000
# 17  AUS female  16   26.216 40.28200       NA
# 18  AUS female  17   39.324       NA       NA

Here is a base R approach: 这是一个基础R方法:

df1[,5:ncol(df1)] <- mapply(function(x, y) {vec.list <- df1[-1:-y, x]
                       length(vec.list) <- nrow(df1)
                       vec.list},
                       x=5:ncol(df1), y=1:(ncol(df1)-4))
df1
#   iso3    sex age fert1953 fert1954 fert1955
#14  AUS female  13    0.000  0.00000 13.74667
#15  AUS female  14    0.000 13.42733 27.49333
#16  AUS female  15   13.108 26.85467 41.24000
#17  AUS female  16   26.216 40.28200       NA
#18  AUS female  17   39.324       NA       NA

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM