How to rbind, arrange and format data in a list of matrices resulting from a group split

Question

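I have a list of matrices showing the results of a descriptive analysis that was run on the output of a previous group_split().

What I would like to do is stack the corresponding matrices with rbind(), using a function-based solution that iterates over the corresponding matrices, rbinds them and formats them (i.e. sets row names, column names and an individual row order). The last step is to print the matrices containing the descriptive results using kableExtra.

My problem: using rbind() in a for loop to iterate over the corresponding triplets of matrices and bind them produces the desired output only for the last triplet, not for all of them. Maybe one of you can see where I went wrong. I have looked at similar questions here but did not find anything that solves my problem.

Here is an example using the tidyverse and kableExtra packages: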

library(tidyverse)   # dplyr, stringr and the %>% pipe
library(kableExtra)  # table styling for the final step

# Some random data for an initial df
city <- factor(rep(1:3, each = 4)) # this is the splitting variable
gender <- factor(rep(c("m", "f"), times = 6)) # this is a factor for a later subgrouping analysis
age <- c(32, 54, 67, 35, 19, 84, 34, 46, 67, 41, 20, 75)
working_yrs <- c(16, 27, 39, 16, 2, 50, 16, 23, 48, 21, 0, 57)
income <- working_yrs * 50

df <- data.frame(city, gender, age, working_yrs, income)

cities <- levels(city) # vector needed later for a for loop


# Group splits by city (df -> lists of data frames, one per city)
# Note: dplyr >= 1.0.0 names this argument .keep; older versions used keep
df1 <- select(df, -gender) %>%
  group_split(city, .keep = FALSE)

df2 <- select(df, -income) %>%
  filter(str_detect(gender, "m")) %>%
  select(city, age, working_yrs) %>%
  group_split(city, .keep = FALSE)

df3 <- select(df, -income) %>%
  filter(str_detect(gender, "f")) %>%
  select(city, age, working_yrs) %>%
  group_split(city, .keep = FALSE)

LOL <- c(df1, df2, df3) # one flat list: all (cities 1-3), male (1-3), female (1-3)


# Define function for descriptive analysis (list of data frames -> list of matrices)
fun_descr <- function(x) {
  c(n = sum(!is.na(x)),                 # number of valid values
    Percent = mean(!is.na(x)) * 100,    # share of valid values in percent
    Mean = mean(x, na.rm = TRUE),
    SD = sd(x, na.rm = TRUE),
    Median = median(x, na.rm = TRUE),
    Quantile = quantile(x, 0.25, na.rm = TRUE),
    Quantile = quantile(x, 0.75, na.rm = TRUE))
}

# Apply fun_descr to every column, then transpose so variables become rows
LOM <- lapply(LOL, function(x) {
  t(apply(x, 2, fun_descr)) %>% round(digits = 1)
})

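So far so good; now to the problem. My approach to rbind() the corresponding triplets of matrices belonging to the same city returns the correct result only for the last city.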


for (i in 1:length(cities)) {
  bindcity <- rbind(LOM[[i]], LOM[[i+length(cities)]], LOM[[i+(length(cities)*2)]])
}

bindcity

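If the for loop or an lapply solution worked properly and returned a list of rbound matrices, I would like to format the rows and columns of the resulting list of matrices as follows. Unfortunately, since the previous step does not work as expected, I cannot test this yet. I am also still trying to find a first line for this function that sorts the rows of each matrix in the row order 1, 4, 6, 2, 5, 7, 3, so that the data match the row names shown below.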

nicematrices <- lapply (bindcity, function (x) {
  rownames(x) <- paste(list("Age", "Working years", "Age (male)", "Working years (male)", "Age (female)", "Working years (female)", "Income"))
  colnames(x) <- paste(list("n (valid)", "% (valid)", "Mean", "SD", "Median", "25% Quantile", "75% Quantile"))
  return(x)
})

Last step: print the matrices using kableExtra

for (i in 1:length(nicematrices)) {
print(
  kable(nicematrices[[i]], caption = "Title") %>%
    column_spec(1, bold = T) %>%
    kable_styling("striped", bootstrap_options = "hover", full_width = TRUE)
)}

Answer 1

I'm not sure I understood correctly, but have you tried adding your i index to bindcity?

for (i in 1:length(cities)) {
  bindcity[[i]] <- rbind(LOM[[i]], LOM[[i+length(cities)]], LOM[[i+(length(cities)*2)]])
}

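Your problem is probably that the loop does run through all iterations, but unless you store the output for each i, only the last iteration is kept. If you go this route, you also need to initialize bindcity before the loop. In full: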

bindcity <- vector("list", length(cities)) # empty list with one slot per city

for (i in 1:length(cities)) {
  bindcity[[i]] <- rbind(LOM[[i]], LOM[[i+length(cities)]], LOM[[i+(length(cities)*2)]])
}

Here is what the above returns:

> bindcity

[[1]]
            n Percent   Mean    SD Median Quantile.25% Quantile.75%
age         4     100   47.0  16.5   44.5         34.2         57.2
working_yrs 4     100   24.5  11.0   21.5         16.0         30.0
income      4     100 1225.0 548.5 1075.0        800.0       1500.0
age         2     100   49.5  24.7   49.5         40.8         58.2
working_yrs 2     100   27.5  16.3   27.5         21.8         33.2
age         2     100   44.5  13.4   44.5         39.8         49.2
working_yrs 2     100   21.5   7.8   21.5         18.8         24.2

[[2]]
            n Percent   Mean     SD Median Quantile.25% Quantile.75%
age         4     100   45.8   27.8   40.0         30.2         55.5
working_yrs 4     100   22.8   20.2   19.5         12.5         29.8
income      4     100 1137.5 1007.8  975.0        625.0       1487.5
age         2     100   26.5   10.6   26.5         22.8         30.2
working_yrs 2     100    9.0    9.9    9.0          5.5         12.5
age         2     100   65.0   26.9   65.0         55.5         74.5
working_yrs 2     100   36.5   19.1   36.5         29.8         43.2

[[3]]
            n Percent   Mean     SD Median Quantile.25% Quantile.75%
age         4     100   50.8   25.1   54.0         35.8         69.0
working_yrs 4     100   31.5   26.0   34.5         15.8         50.2
income      4     100 1575.0 1299.0 1725.0        787.5       2512.5
age         2     100   43.5   33.2   43.5         31.8         55.2
working_yrs 2     100   24.0   33.9   24.0         12.0         36.0
age         2     100   58.0   24.0   58.0         49.5         66.5
working_yrs 2     100   39.0   25.5   39.0         30.0         48.0
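
The overwrite behaviour described in this answer can be reproduced in isolation. The following is only a minimal sketch: `toy` and `n_groups` are hypothetical stand-ins for LOM and length(cities), and the 1x1 matrices are made-up data chosen so the result is easy to read.

```r
# Hypothetical stand-in for LOM: nine 1x1 matrices holding their own position
toy <- lapply(1:9, function(k) matrix(k, nrow = 1))
n_groups <- 3

# Overwriting: the same name is reassigned on every pass of the loop
last_only <- NULL
for (i in seq_len(n_groups)) {
  last_only <- rbind(toy[[i]], toy[[i + n_groups]], toy[[i + 2 * n_groups]])
}

# Collecting: every pass writes into its own slot of a pre-allocated list
collected <- vector("list", n_groups)
for (i in seq_len(n_groups)) {
  collected[[i]] <- rbind(toy[[i]], toy[[i + n_groups]], toy[[i + 2 * n_groups]])
}

is.matrix(last_only)  # a single matrix, left over from the final iteration
length(collected)     # one bound matrix per group
```

The first loop ends with only the triplet (3, 6, 9), while the second keeps all three triplets.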

Answer 2

The following uses lapply loops to get the desired bound matrices and the kable output.

bindcity <- lapply(seq_along(cities), function(i){
  rbind(LOM[[i]], LOM[[i+length(cities)]], LOM[[i+(length(cities)*2)]])
})

nicematrices <- lapply(bindcity, function (x) {
  rownames(x) <- c("Age", "Working years", "Income", "Age (male)", "Working years (male)", "Age (female)", "Working years (female)")
  colnames(x) <- c("n (valid)", "% (valid)", "Mean", "SD", "Median", "25% Quantile", "75% Quantile")
  x
})

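The two loops above can be combined into one. However, the following lapply loop does not create the bindcity list. That only matters if the list is used afterwards, which is not clear from the question. It is not needed to create the kable tables.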

nicematrices <- lapply(seq_along(cities), function (i) {
  x <- rbind(LOM[[i]], LOM[[i+length(cities)]], LOM[[i+(length(cities)*2)]])
  rownames(x) <- c("Age", "Working years", "Income", "Age (male)", "Working years (male)", "Age (female)", "Working years (female)")
  colnames(x) <- c("n (valid)", "% (valid)", "Mean", "SD", "Median", "25% Quantile", "75% Quantile")
  x
})
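
The question also asked for the rows to be rearranged in the order 1, 4, 6, 2, 5, 7, 3. One possible way, sketched here on a hypothetical stand-in matrix `x` with made-up values, is to assign the row names in the rbind() order first and then index the rows, so that each label travels with its data.

```r
# Hypothetical stand-in for one bound matrix: 7 rows in the rbind() order
x <- matrix(seq_len(14), nrow = 7,
            dimnames = list(NULL, c("n (valid)", "Mean")))
rownames(x) <- c("Age", "Working years", "Income",
                 "Age (male)", "Working years (male)",
                 "Age (female)", "Working years (female)")

# Row order taken from the question; labels are already attached,
# so reordering cannot mismatch names and data
row_order <- c(1, 4, 6, 2, 5, 7, 3)
x_sorted <- x[row_order, , drop = FALSE]

rownames(x_sorted)
```

Inside the lapply call, the same indexing expression could replace the final x so that every city's matrix is returned in the new order.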

Now for the kable tables.

library(kableExtra)

kbl_list <- lapply(nicematrices, function(x){
  kbl <- kable(x, caption = "Title") %>%
    column_spec(1, bold = TRUE) %>%
    kable_styling("striped", 
                  bootstrap_options = "hover",
                  full_width = TRUE)
  print(kbl)
})
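
As a possible alternative to the index arithmetic used to build bindcity, the matrices can be grouped by city with base R's split() and then stacked with do.call(rbind, ...). This is only a sketch under the assumption that the list is ordered as in the question (all cities, then male, then female); `toy`, `n_cities` and `city_id` are hypothetical names and the data are made up.

```r
# Hypothetical stand-in for LOM: nine labelled 1x1 matrices, ordered as
# (all: city 1..3, male: city 1..3, female: city 1..3)
n_cities <- 3
toy <- lapply(1:9, function(k) {
  matrix(k, nrow = 1, dimnames = list(paste0("m", k), "value"))
})

# Tag each list element with the city it belongs to, then group by that tag
city_id <- rep(seq_len(n_cities), times = length(toy) / n_cities)
by_city <- split(toy, city_id)

# Stack the matrices within each city
bound <- lapply(by_city, function(mats) do.call(rbind, mats))

rownames(bound[["1"]])
```

This variant does not hard-code the number of subgroups, so it would keep working if a fourth block of matrices were appended to the list.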

Solution 1
2 2020-04-16 16:02:48

Solution 2
2 Accepted 2020-04-16 17:04:52