Descriptive tables - how to create a table containing both numeric and categorical variables
I can't find a really intuitive way of doing the most basic thing: creating a summary table with my base variables. The best method I've found so far uses tapply:
set.seed(200)
my_stats <- function(x){
  if (is.factor(x)){
    a <- table(x, useNA="no")
    b <- round(a*100/sum(a),2)
    # If binary
    if (length(a) == 2){
      ret <- paste(a[1], " (", b[1], " %)", sep="")
    }
    return(ret)
  }else{
    ret <- mean(x, na.rm=TRUE)
    if (ret < 1){
      ret <- round(ret, 2)
    }else{
      ret <- round(ret)
    }
    return(ret)
  }
}
library(rms)
groups <- factor(sample(c("Group A","Group B"), size=51, replace=T))
a <- 3:53
b <- rnorm(51)
c <- factor(sample(c("male","female"), size=51, replace=T))
res <- rbind(a=tapply(a, groups, my_stats),
             b=tapply(b, groups, my_stats),
             c=tapply(c, groups, my_stats))
latex(latexTranslate(res))
The res contains:
> res
  Group A     Group B
a "28"        "28"
b "-0.08"     "-0.21"
c "14 (56 %)" "14 (53.85 %)"
Now this works, but it seems very complex and not the most elegant solution. I've tried searching for how to create descriptive tables, but the results all focus on table(), prop.table() and summary() for just a single variable or for variables of the same kind.
My question: Is there a package/function that allows an easy way of creating a good-looking LaTeX table? If so, please give a hint of how to get the above result.
Thanks!
If you would like to create a summary table with both categorical and continuous variables, you should look into the package 'tableone'.
Here is an example of what it can do: https://rpubs.com/kaz_yos/tableone-vignette
And here is the PDF documentation: https://cran.r-project.org/web/packages/tableone/tableone.pdf
I hope this helps.
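As a rough sketch of how that might look with the kind of data in the question (this assumes the tableone and xtable packages are installed; the data frame name dat is just an illustration, and the exact formatting of the cells is whatever tableone chooses by default):

```r
library(tableone)

# Recreate data shaped like the question's example
set.seed(200)
dat <- data.frame(
  groups = factor(sample(c("Group A", "Group B"), size = 51, replace = TRUE)),
  a = 3:53,
  b = rnorm(51),
  c = factor(sample(c("male", "female"), size = 51, replace = TRUE))
)

# Summarise a, b and c, stratified by group; c is treated as categorical
tab <- CreateTableOne(vars = c("a", "b", "c"), strata = "groups",
                      data = dat, factorVars = "c")

# printToggle = FALSE returns the formatted table as a character matrix
# instead of printing it; test = FALSE drops the p-value columns
mat <- print(tab, printToggle = FALSE, test = FALSE)

# Any matrix-to-LaTeX converter can take it from here; xtable is one option
print(xtable::xtable(mat))
```

By default the continuous variables are reported as mean (SD) and the categorical ones as count (%), which is close to what the question's my_stats produces by hand.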
What you're asking is a tad open-ended, since there's the distinct possibility that you will disagree with me on what constitutes a "good-looking LaTeX table".
For instance, I would probably prefer to organize this by row, rather than by column:
require(plyr)
require(xtable)
dat <- data.frame(a,b,c,groups)
xtable(ddply(dat,.(groups),summarise,a = my_stats(a),
                                     b = my_stats(b),
                                     c = my_stats(c)))
\begin{table}[ht]
\begin{center}
\begin{tabular}{rlrrl}
\hline
& groups & a & b & c \\
\hline
1 & Group A & 28.00 & 0.14 & 13 (52 \%) \\
2 & Group B & 28.00 & -0.00 & 13 (50 \%) \\
\hline
\end{tabular}
\end{center}
\end{table}
And of course, much of that is customizable if you look at ?xtable and also ?print.xtable.
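To give an idea of the kind of customization available, here is a small sketch using a few xtable() and print.xtable() arguments. The data frame summary_df is a hypothetical stand-in for the ddply result, with illustrative values only, and the caption and label are made up. Note that booktabs = TRUE assumes your LaTeX document loads the booktabs package.

```r
library(xtable)

# Hypothetical stand-in for the grouped summary (values are illustrative only)
summary_df <- data.frame(
  groups = c("Group A", "Group B"),
  a = c(28, 28),
  b = c(0.14, -0.002),
  c = c("13 (52 %)", "13 (50 %)")
)

xt <- xtable(summary_df,
             caption = "Summary by group",
             label = "tab:summary",
             # one entry for the row names, then one per column
             digits = c(0, 0, 0, 2, 0))

latex_code <- print(xt,
                    include.rownames = FALSE,  # drop the 1, 2, ... row labels
                    booktabs = TRUE,           # \toprule etc. instead of \hline
                    table.placement = "ht",
                    print.results = FALSE)     # return the LaTeX as a string

cat(latex_code)
```

Because print.xtable returns the LaTeX source as a string, you can also write it to a file with its file argument or post-process it before including it in a document.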
If you rewrite your function so that it always returns a string (it sometimes returns a string, sometimes a number, sometimes NULL), you can call ddply on the data.frame, without having to specify all the columns.
f <- function(u) {
  res <- "?"
  if(is.factor(u) || is.character(u)) {
    u <- table(u, useNA = "no")
    if (length(u) == 0 || sum(u) == 0) { res <- "NA" }
    else { res <- sprintf( "%0.0f%%", 100 * u[1] / sum(u) ) }
  } else {
    u <- mean(u, na.rm=TRUE)
    if(is.na(u)) { res <- "NA" }
    else { res <- sprintf( ifelse( abs(u) < 1, "%0.2f", "%0.0f" ), u ) }
  }
  return( res )
}
# Same function, for data.frames
g <- function(d) do.call( data.frame, lapply(d, f) )
library(plyr)
ddply(data.frame(a,b,c), .(groups), g)
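If you would rather not depend on plyr, the same idea can be sketched in base R with split() and vapply(). This is only an illustration under my own assumptions: summarise_col is a hypothetical variant of the function above that also keeps the count next to the percentage, as in the question's output.

```r
# Hypothetical helper: summarise one column as a single string
summarise_col <- function(u) {
  if (is.factor(u) || is.character(u)) {
    counts <- table(u, useNA = "no")
    if (sum(counts) == 0) return("NA")
    # Count and share of the first level
    return(sprintf("%d (%.0f%%)", counts[[1]], 100 * counts[[1]] / sum(counts)))
  }
  m <- mean(u, na.rm = TRUE)
  if (is.na(m)) return("NA")
  # Two decimals for small means, none otherwise
  sprintf(if (abs(m) < 1) "%.2f" else "%.0f", m)
}

# Data shaped like the question's example
set.seed(200)
groups <- factor(sample(c("Group A", "Group B"), size = 51, replace = TRUE))
dat <- data.frame(
  a = 3:53,
  b = rnorm(51),
  c = factor(sample(c("male", "female"), size = 51, replace = TRUE))
)

# One row of strings per group, one column per variable
by_group <- lapply(split(dat, groups),
                   function(d) vapply(d, summarise_col, character(1)))
res <- do.call(rbind, by_group)
print(res)
```

The result is a character matrix with one row per group, so it can be passed to xtable() or latex() in the same way as the ddply output.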
Since you want LaTeX tables, you may also want to try the following, which does not group the data, but adds sparkline histograms for the numeric variables.
library(Hmisc)
# d is the data.frame to describe, e.g. d <- data.frame(a, b, c)
latex(describe(d), file="")
Have a look at the tables package, which may make this process simpler.