
How do I turn a table into a list of observations with various attributes in R?

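I'm working on a project that models the risk of death in prison for Indigenous and non-Indigenous people in Australia between 1990 and 1995. I have a table of data, but I don't know how to convert it into a list of all the observations.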

Here is my current code:

#DATA
years <- c(1990, 1991, 1992, 1993, 1994, 1995)
ind_pris <- c(2041, 2166, 2223, 2416, 2742, 2907)
ind_deaths <- c(6, 8, 2, 7, 11, 17)
ind_pop <- c(168317, 172462, 176827, 181341, 185836, 190438)
nonind_pris <- c(12264, 12855, 13336, 13450, 14302, 14501)
nonind_deaths <- c(27, 31, 34, 42, 42, 42)
nonind_pop <- c(13141817, 13326044, 13501987, 13649262, 13810095, 13995940)
all_data <- data.frame(years, ind_pris, ind_deaths, ind_pop, nonind_pris, nonind_deaths, nonind_pop)

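How can I make it so that I have 6 lists (one for each year, 1990-1995), each containing all the observations for that year? For example, the total population in 1990 was 13310134 (the Indigenous population of 168317 plus the non-Indigenous population of 13141817), and for each person 3 attributes should be recorded: 1) their Indigenous status, 2) whether they were in prison, and 3) whether they died in prison.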

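Using by, for each year you can create two zero-filled matrices, one for the Indigenous and one for the non-Indigenous group, with nrow set to the respective population. Then it is really just a matter of filling the first rows of each column with 1s using seq_len, according to the case counts, assuming that the deaths in the table refer to deaths in prison. The result is a "by" object, which is very similar to a list. (Note that I renamed your 'years' variable with names(all_data)[1] <- "year", which reads better.)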

# Rename 'years' to 'year', as mentioned above
names(all_data)[1] <- "year"

# One data frame per year; requires R >= 4.1 for the \(x) lambda syntax
all_list <- by(all_data, all_data$year, \(x) {
  M1 <- matrix(0, x$ind_pop, 3, dimnames=list(NULL, c('ind', 'prison', 'died')))
  M1[, 'ind'] <- 1
  M1[seq_len(x$ind_pris), 'prison'] <- 1
  M1[seq_len(x$ind_deaths), 'died'] <- 1
  M2 <- matrix(0, x$nonind_pop, 3, dimnames=list(NULL, c('ind', 'prison', 'died')))
  M2[, 'ind'] <- 0
  M2[seq_len(x$nonind_pris), 'prison'] <- 1
  M2[seq_len(x$nonind_deaths), 'died'] <- 1
  rbind.data.frame(M1, M2)
})

This gives:

str(all_list)
# List of 6
# $ 1990:'data.frame':  13310134 obs. of  3 variables:
# ..$ ind   : num [1:13310134] 1 1 1 1 1 1 1 1 1 1 ...
# ..$ prison: num [1:13310134] 1 1 1 1 1 1 1 1 1 1 ...
# ..$ died  : num [1:13310134] 1 1 1 1 1 1 0 0 0 0 ...
# $ 1991:'data.frame':  13498506 obs. of  3 variables:
# ..$ ind   : num [1:13498506] 1 1 1 1 1 1 1 1 1 1 ...
# ..$ prison: num [1:13498506] 1 1 1 1 1 1 1 1 1 1 ...
# ..$ died  : num [1:13498506] 1 1 1 1 1 1 1 1 0 0 ...
# $ 1992:'data.frame':  13678814 obs. of  3 variables:
# ..$ ind   : num [1:13678814] 1 1 1 1 1 1 1 1 1 1 ...
# ..$ prison: num [1:13678814] 1 1 1 1 1 1 1 1 1 1 ...
# ..$ died  : num [1:13678814] 1 1 0 0 0 0 0 0 0 0 ...
# $ 1993:'data.frame':  13830603 obs. of  3 variables:
# ..$ ind   : num [1:13830603] 1 1 1 1 1 1 1 1 1 1 ...
# ..$ prison: num [1:13830603] 1 1 1 1 1 1 1 1 1 1 ...
# ..$ died  : num [1:13830603] 1 1 1 1 1 1 1 0 0 0 ...
# $ 1994:'data.frame':  13995931 obs. of  3 variables:
# ..$ ind   : num [1:13995931] 1 1 1 1 1 1 1 1 1 1 ...
# ..$ prison: num [1:13995931] 1 1 1 1 1 1 1 1 1 1 ...
# ..$ died  : num [1:13995931] 1 1 1 1 1 1 1 1 1 1 ...
# $ 1995:'data.frame':  14186378 obs. of  3 variables:
# ..$ ind   : num [1:14186378] 1 1 1 1 1 1 1 1 1 1 ...
# ..$ prison: num [1:14186378] 1 1 1 1 1 1 1 1 1 1 ...
# ..$ died  : num [1:14186378] 1 1 1 1 1 1 1 1 1 1 ...
# - attr(*, "dim")= int 6
# - attr(*, "dimnames")=List of 1
# ..$ all_data$year: chr [1:6] "1990" "1991" "1992" "1993" ...
# - attr(*, "call")= language by.data.frame(data = all_data, INDICES = all_data$year, FUN = function(x) {     M1 <- matrix(0, x$ind_pop, 3, dim| __truncated__ ...
# - attr(*, "class")= chr "by"
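
Since a "by" object is only very similar to a list, it may help to see how a single year is pulled out and how the extra attributes can be dropped. The following is a minimal sketch on a made-up miniature table (the `toy` data frame and its values are assumptions for illustration, not the real data); it uses the same fill-by-seq_len idea for one group only.

```r
# Toy table with the same kind of columns as all_data, but tiny populations
# (values are made up for illustration only).
toy <- data.frame(year = c(1990, 1991),
                  ind_pop = c(4, 5), ind_pris = c(2, 3), ind_deaths = c(1, 0))

toy_list <- by(toy, toy$year, function(x) {
  M <- matrix(0, x$ind_pop, 3, dimnames = list(NULL, c("ind", "prison", "died")))
  M[, "ind"] <- 1
  M[seq_len(x$ind_pris), "prison"] <- 1
  M[seq_len(x$ind_deaths), "died"] <- 1
  as.data.frame(M)
})

# Elements are addressed by the year as a character string
first_year <- toy_list[["1990"]]

# Copy the elements into an ordinary named list, without the "by" class
plain_list <- setNames(lapply(seq_along(toy_list), function(i) toy_list[[i]]),
                       names(toy_list))
```

The same indexing should apply to all_list, for example all_list[["1990"]], although each element there has over 13 million rows.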

object.size(all_list)
# 1980038856 bytes
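
That is nearly 2 GB, which is consistent with about 8 bytes per cell: the six years add up to 82500366 rows of 3 numeric (double) columns. If memory is a concern, one option is to store the 0/1 flags as integers instead. The sketch below is only an illustration of that idea on a much smaller matrix (the row count `n` is an arbitrary assumption), not something taken from the answer above.

```r
n <- 1e5  # illustrative row count, far smaller than the real populations
cols <- c("ind", "prison", "died")

M_double  <- matrix(0,  n, 3, dimnames = list(NULL, cols))  # double: 8 bytes per cell
M_integer <- matrix(0L, n, 3, dimnames = list(NULL, cols))  # integer: 4 bytes per cell

# Filling with integer literals (1L) keeps the integer storage mode
M_integer[seq_len(10), "prison"] <- 1L

size_double  <- as.numeric(object.size(M_double))
size_integer <- as.numeric(object.size(M_integer))
size_ratio   <- size_integer / size_double  # close to 0.5
```

Assigning a plain 1 instead of 1L would silently convert the whole matrix back to double, so the literals matter here.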

Data:

all_data <- structure(list(years = c(1990, 1991, 1992, 1993, 1994, 1995), 
    ind_pris = c(2041, 2166, 2223, 2416, 2742, 2907), ind_deaths = c(6, 
    8, 2, 7, 11, 17), ind_pop = c(168317, 172462, 176827, 181341, 
    185836, 190438), nonind_pris = c(12264, 12855, 13336, 13450, 
    14302, 14501), nonind_deaths = c(27, 31, 34, 42, 42, 42), 
    nonind_pop = c(13141817, 13326044, 13501987, 13649262, 13810095, 
    13995940)), class = "data.frame", row.names = c(NA, -6L))
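
Because the expanded data frames are too large to inspect by eye, a quick way to convince yourself that the fill logic is right is to run it at a scale that can be printed. This is a sketch under stated assumptions: `expand_group` is a hypothetical helper written for this check, and the counts are made up rather than taken from the table.

```r
# Hypothetical helper: expand one group's counts into one row per person
expand_group <- function(pop, pris, deaths, ind) {
  M <- matrix(0, pop, 3, dimnames = list(NULL, c("ind", "prison", "died")))
  M[, "ind"] <- ind
  M[seq_len(pris), "prison"] <- 1   # first 'pris' rows are prisoners
  M[seq_len(deaths), "died"] <- 1   # first 'deaths' rows died (in prison)
  M
}

# Made-up miniature counts, not taken from the real table
toy_year <- rbind(expand_group(pop = 6,  pris = 3, deaths = 1, ind = 1),
                  expand_group(pop = 10, pris = 4, deaths = 2, ind = 0))

# Column sums recover the input counts: ind 6, prison 7, died 3
totals <- colSums(toy_year)

# Everyone flagged as died is also flagged as in prison
deaths_in_prison <- all(toy_year[toy_year[, "died"] == 1, "prison"] == 1)
```

The second check only holds because deaths never exceed prisoners in each group, which is also true of every year in the real table.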

