How to rbind, arrange and format data in a list of matrices resulting from a group split
I have a list of matrices holding the results of a descriptive analysis that was run on the output of a previous group_split() by a factor.
What I'd like to do is stack corresponding matrices with rbind(), using a functional solution that iterates over the corresponding matrices, binds them, and formats them (i.e. sets rownames, colnames, and an individual row order). The final step is to print the matrices containing the descriptive results using kableExtra.
My problem: using rbind() within a for loop to iterate over the corresponding matrix triplets and bind them only produces the desired output for the last triplet, not for all of them. Maybe one of you can see where I'm going wrong. I have consulted similar questions here but have not found a solution to my problem.
Here is an example using the tidyverse and kableExtra packages:
library(tidyverse)
library(kableExtra)

# Some random data for an initial df
city <- factor(rep(1:3, each = 4)) # this is the splitting variable
gender <- factor(rep(c("m", "f"), times = 6)) # this is a factor for a later subgrouping analysis
age <- c(32, 54, 67, 35, 19, 84, 34, 46, 67, 41, 20, 75)
working_yrs <- c(16, 27, 39, 16, 2, 50, 16, 23, 48, 21, 0, 57)
income <- working_yrs * 50
df <- data.frame(city, gender, age, working_yrs, income)
cities <- levels(city) # vector needed later for a for loop

# Group splits by city (dfs -> list of lists)
# `.keep` is the current name of the deprecated `keep` argument
df1 <- select(df, -gender) %>%
  group_split(city, .keep = FALSE)
df2 <- select(df, -income) %>%
  filter(str_detect(gender, "m")) %>%
  select(city, age, working_yrs) %>%
  group_split(city, .keep = FALSE)
df3 <- select(df, -income) %>%
  filter(str_detect(gender, "f")) %>%
  select(city, age, working_yrs) %>%
  group_split(city, .keep = FALSE)
LOL <- c(df1, df2, df3) # list of lists: all, male, female (one element per city each)
# Define function for descriptive analysis (list of lists -> list of matrices)
fun_descr <- function(x) {
  c(n = sum(!is.na(x)),
    Percent = sum(!is.na(x)) / (sum(!is.na(x)) + sum(is.na(x))) * 100,
    Mean = mean(x, na.rm = TRUE),
    SD = sd(x, na.rm = TRUE),
    Median = median(x, na.rm = TRUE),
    Quantile = quantile(x, 0.25, na.rm = TRUE),
    Quantile = quantile(x, 0.75, na.rm = TRUE))
}
LOM <- lapply(LOL, function(x) {
  t(apply(x, 2, fun_descr)) %>% round(digits = 1)
})
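To make the summary function easier to check in isolation, here is a minimal sketch that applies an equivalent function to a small vector. The vector x is a hypothetical toy example, not part of the question's data. It also shows where the column names Quantile.25% and Quantile.75% come from: c() joins the outer name with the name that quantile() attaches to its result.

```r
# Hypothetical toy vector with one missing value (not from the question's data)
x <- c(10, 20, NA, 40)

# Equivalent to fun_descr in the question, repeated so this block runs on its own
fun_descr <- function(x) {
  c(n        = sum(!is.na(x)),
    Percent  = mean(!is.na(x)) * 100,   # share of non-missing values
    Mean     = mean(x, na.rm = TRUE),
    SD       = sd(x, na.rm = TRUE),
    Median   = median(x, na.rm = TRUE),
    Quantile = quantile(x, 0.25, na.rm = TRUE),
    Quantile = quantile(x, 0.75, na.rm = TRUE))
}

res <- fun_descr(x)
names(res) # the last two names combine "Quantile" with "25%" and "75%"
```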
So far so good, now here's the problem. My approach to rbind() the corresponding matrix triplets belonging to the same city returns proper results for the last city only.
for (i in 1:length(cities)) {
  bindcity <- rbind(LOM[[i]], LOM[[i+length(cities)]], LOM[[i+(length(cities)*2)]])
}
bindcity
If the for loop or an lapply solution worked correctly and returned a list of row-bound matrices, I would expect to format the rows and columns of the resulting list of matrices as follows. Unfortunately, since the previous step doesn't work as expected, I couldn't test it yet. I'm still struggling to find a first line for this function that sorts each matrix's rows in the row order 1, 4, 6, 2, 5, 7, 3, so that the data match the rownames shown below.
nicematrices <- lapply(bindcity, function(x) {
  rownames(x) <- paste(list("Age", "Working years", "Age (male)", "Working years (male)", "Age (female)", "Working years (female)", "Income"))
  colnames(x) <- paste(list("n (valid)", "% (valid)", "Mean", "SD", "Median", "25% Quantile", "75% Quantile"))
  return(x)
})
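For the row sorting, one possible approach is to label the rows while they are still in the order rbind() produced, and only then reorder them by position, so each label stays attached to its data. The sketch below uses a hypothetical toy matrix m in place of one bound city matrix. Note that the order 1, 4, 6, 2, 5, 7, 3 groups the rows by variable (all age rows, then all working-years rows, then income), so the resulting labels differ from the rowname order in the function above.

```r
# Toy 7 x 2 matrix in the order rbind() produces:
# overall (3 rows), male (2 rows), female (2 rows)
m <- matrix(1:14, nrow = 7,
            dimnames = list(c("age", "working_yrs", "income",
                              "age", "working_yrs", "age", "working_yrs"),
                            c("n", "Mean")))

# Step 1: assign labels in the current (rbind) row order
rownames(m) <- c("Age", "Working years", "Income",
                 "Age (male)", "Working years (male)",
                 "Age (female)", "Working years (female)")

# Step 2: reorder rows by position; labels move together with their data
row_order <- c(1, 4, 6, 2, 5, 7, 3)
m_sorted <- m[row_order, , drop = FALSE]
rownames(m_sorted)
```

Inside the lapply() call, the same two steps would be applied to x before returning it.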
Final step: print the matrices using kableExtra.
for (i in 1:length(nicematrices)) {
  print(
    kable(nicematrices[[i]], caption = "Title") %>%
      column_spec(1, bold = TRUE) %>%
      kable_styling("striped", bootstrap_options = "hover", full_width = TRUE)
  )
}
I don't know if I understand correctly, but have you tried indexing bindcity with i?
for (i in 1:length(cities)) {
  bindcity[[i]] <- rbind(LOM[[i]], LOM[[i+length(cities)]], LOM[[i+(length(cities)*2)]])
}
The likely problem is that your loop does go through all the iterations, but because every pass assigns to the same bindcity object, each iteration overwrites the previous one and only the last result survives. Each i has to save its output in its own element. You also need to initialize bindcity before the loop if you go this way. Overall:
bindcity <- vector("list", length(cities)) # empty list with one slot per city
for (i in seq_along(cities)) {
  bindcity[[i]] <- rbind(LOM[[i]], LOM[[i+length(cities)]], LOM[[i+(length(cities)*2)]])
}
Here's what the above returns:
> bindcity
[[1]]
            n Percent   Mean    SD Median Quantile.25% Quantile.75%
age         4     100   47.0  16.5   44.5         34.2         57.2
working_yrs 4     100   24.5  11.0   21.5         16.0         30.0
income      4     100 1225.0 548.5 1075.0        800.0       1500.0
age         2     100   49.5  24.7   49.5         40.8         58.2
working_yrs 2     100   27.5  16.3   27.5         21.8         33.2
age         2     100   44.5  13.4   44.5         39.8         49.2
working_yrs 2     100   21.5   7.8   21.5         18.8         24.2

[[2]]
            n Percent   Mean     SD Median Quantile.25% Quantile.75%
age         4     100   45.8   27.8   40.0         30.2         55.5
working_yrs 4     100   22.8   20.2   19.5         12.5         29.8
income      4     100 1137.5 1007.8  975.0        625.0       1487.5
age         2     100   26.5   10.6   26.5         22.8         30.2
working_yrs 2     100    9.0    9.9    9.0          5.5         12.5
age         2     100   65.0   26.9   65.0         55.5         74.5
working_yrs 2     100   36.5   19.1   36.5         29.8         43.2

[[3]]
            n Percent   Mean     SD Median Quantile.25% Quantile.75%
age         4     100   50.8   25.1   54.0         35.8         69.0
working_yrs 4     100   31.5   26.0   34.5         15.8         50.2
income      4     100 1575.0 1299.0 1725.0        787.5       2512.5
age         2     100   43.5   33.2   43.5         31.8         55.2
working_yrs 2     100   24.0   33.9   24.0         12.0         36.0
age         2     100   58.0   24.0   58.0         49.5         66.5
working_yrs 2     100   39.0   25.5   39.0         30.0         48.0
The following uses lapply loops to get the desired bound matrices and the kable output.
bindcity <- lapply(seq_along(cities), function(i) {
  rbind(LOM[[i]], LOM[[i+length(cities)]], LOM[[i+(length(cities)*2)]])
})
nicematrices <- lapply(bindcity, function(x) {
  rownames(x) <- c("Age", "Working years", "Income", "Age (male)", "Working years (male)", "Age (female)", "Working years (female)")
  colnames(x) <- c("n (valid)", "% (valid)", "Mean", "SD", "Median", "25% Quantile", "75% Quantile")
  x
})
The two loops above can be combined into one. However, the following lapply loop does not create the bindcity list. This only matters if that list is used afterwards, which is not clear from the question; it is not needed to create the kable tables.
nicematrices <- lapply(seq_along(cities), function(i) {
  x <- rbind(LOM[[i]], LOM[[i+length(cities)]], LOM[[i+(length(cities)*2)]])
  rownames(x) <- c("Age", "Working years", "Income", "Age (male)", "Working years (male)", "Age (female)", "Working years (female)")
  colnames(x) <- c("n (valid)", "% (valid)", "Mean", "SD", "Median", "25% Quantile", "75% Quantile")
  x
})
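As an aside, the manual index arithmetic (i, i + length(cities), i + 2 * length(cities)) could probably be avoided by grouping the list elements by city and binding each group with do.call(rbind, ...). The following is only a sketch under the assumption that LOM is ordered as in the question (all cities overall, then all cities male, then all cities female). The toy LOM and the helper name city_index are hypothetical stand-ins for illustration.

```r
cities <- c("1", "2", "3")

# Toy stand-in for LOM: nine one-row matrices, ordered as
# overall (cities 1-3), male (cities 1-3), female (cities 1-3)
LOM <- lapply(1:9, function(k) {
  matrix(k, nrow = 1, dimnames = list(paste0("m", k), "value"))
})

# City position of each list element: 1 2 3 1 2 3 1 2 3
city_index <- rep(seq_along(cities), times = length(LOM) / length(cities))

# Group the matrices by city, then bind each group row-wise
bindcity <- lapply(split(LOM, city_index), function(mats) do.call(rbind, mats))
bindcity[[1]] # rows m1, m4, m7
```

This also keeps working if a fourth subgroup is appended to LOM, as long as the per-city ordering is preserved.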
Now for the kable tables.
library(kableExtra)
kbl_list <- lapply(nicematrices, function(x) {
  kbl <- kable(x, caption = "Title") %>%
    column_spec(1, bold = TRUE) %>%
    kable_styling("striped",
                  bootstrap_options = "hover",
                  full_width = TRUE)
  print(kbl)
})