Descriptive tables - how to create a table containing both numeric and categorical variables

I can't find a really intuitive way of doing the most basic thing: creating a summary table with my base variables. The best method I've found so far uses tapply:

set.seed(200)
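my_stats <- function(x){
    # Summarise one variable: "count (percent %)" of the first level for a
    # binary factor, the rounded mean for a numeric vector.
    if (is.factor(x)){
        a <- table(x, useNA="no")
        b <- round(a*100/sum(a),2)

        # Non-binary factors are not handled and yield NULL
        ret <- NULL
        # If binary
        if (length(a) == 2){
            ret <- paste(a[1], " (", b[1], " %)", sep="")
        }
        return(ret)
    }else{
        ret <- mean(x, na.rm=TRUE)
        # Keep two decimals for small means, whole numbers otherwise
        if (ret < 1){
            ret <- round(ret, 2)
        }else{
            ret <- round(ret)
        }
        return(ret)
    }
}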

library(rms)  # also loads Hmisc, which provides latex() and latexTranslate()
groups <- factor(sample(c("Group A","Group B"), size=51, replace=TRUE))
a <- 3:53
b <- rnorm(51)
c <- factor(sample(c("male","female"), size=51, replace=TRUE))

res <- rbind(a=tapply(a, groups, my_stats),
      b=tapply(b, groups, my_stats),
      c=tapply(c, groups, my_stats))
latex(latexTranslate(res))

The res contains:

> res
  Group A     Group B       
a "28"        "28"          
b "-0.08"     "-0.21"       
c "14 (56 %)" "14 (53.85 %)"

Now this works, but it seems very complex and is not the most elegant solution. I've tried to search for how to create descriptive tables, but the results all focus on table(), prop.table() and summary() for just a single variable or for variables of the same kind.

My question: Is there a package/function that allows an easy way of creating a good-looking LaTeX table? If so, please give a hint of how to get the above result.

Thanks!

If you would like to create a summary table with both categorical and continuous variables, you should look into the package 'tableone'.

Here is an example of what it can do: https://rpubs.com/kaz_yos/tableone-vignette. And here is the PDF documentation: https://cran.r-project.org/web/packages/tableone/tableone.pdf

I hope this helps.

  • Mike
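
As a rough sketch of how that might look for the data in the question: the block below rebuilds the example variables in a data frame (the name dat is only for illustration) and passes them to CreateTableOne, stratified by groups. It assumes the tableone package is installed and skips the table step otherwise; test = FALSE is chosen here just to leave out the p-value column.

```r
# Rebuild the question's example data as one data frame (name 'dat' is illustrative).
set.seed(200)
dat <- data.frame(
  groups = factor(sample(c("Group A", "Group B"), size = 51, replace = TRUE)),
  a = 3:53,
  b = rnorm(51),
  c = factor(sample(c("male", "female"), size = 51, replace = TRUE))
)

# Only run the tableone part if the package is available.
if (requireNamespace("tableone", quietly = TRUE)) {
  tab <- tableone::CreateTableOne(
    vars = c("a", "b", "c"),   # variables to summarise
    strata = "groups",         # one column per group
    data = dat,
    factorVars = "c",          # treat c as categorical
    test = FALSE               # no p-value column
  )
  # printToggle = FALSE returns the formatted character matrix without printing it.
  tab_matrix <- print(tab, printToggle = FALSE)
  print(tab_matrix, quote = FALSE)
}
```

The returned character matrix could then be handed to a LaTeX exporter such as xtable.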

What you're asking is a tad open-ended, since there's the distinct possibility that you will disagree with me on what constitutes a "good-looking LaTeX table".

For instance, I would probably prefer to organize this by row, rather than by column:

require(plyr)
require(xtable)
dat <- data.frame(a,b,c,groups)
xtable(ddply(dat,.(groups),summarise,a = my_stats(a),
                                     b = my_stats(b),
                                     c = my_stats(c)))


\begin{table}[ht]
\begin{center}
\begin{tabular}{rlrrl}
  \hline
 & groups & a & b & c \\ 
  \hline
1 & Group A & 28.00 & 0.14 & 13 (52 \%) \\ 
  2 & Group B & 28.00 & -0.00 & 13 (50 \%) \\ 
   \hline
\end{tabular}
\end{center}
\end{table}

And of course, much of that is customizable if you look at ?xtable and also ?print.xtable.
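
To give an idea of the kind of customization meant here, the sketch below uses a small hand-typed data frame (toy, copied from the output above, not recomputed) and a few print.xtable arguments: include.rownames, booktabs and caption.placement. The caption text is made up, and the block assumes the xtable package is installed.

```r
# Toy summary table, typed in by hand from the output shown above.
toy <- data.frame(
  groups = c("Group A", "Group B"),
  a = c(28, 28),
  c = c("13 (52 %)", "13 (50 %)"),
  stringsAsFactors = FALSE
)

if (requireNamespace("xtable", quietly = TRUE)) {
  # digits needs one entry per column plus one for the row names.
  xt <- xtable::xtable(toy, caption = "Summary by group", digits = c(0, 0, 0, 0))
  # print.results = FALSE returns the LaTeX as a string instead of printing it.
  tex <- print(xt,
               include.rownames = FALSE,   # drop the 1, 2 row labels
               booktabs = TRUE,            # \toprule / \midrule / \bottomrule
               caption.placement = "top",
               print.results = FALSE)
  cat(tex)
}
```

Note that booktabs = TRUE requires \usepackage{booktabs} in the LaTeX preamble.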

If you rewrite your function so that it always returns a string (it sometimes returns a string, sometimes a number, sometimes NULL), you can call ddply on the data.frame, without having to specify all the columns.

f <- function(u) {
  res <- "?" 
  if(is.factor(u) || is.character(u)) {
    u <- table(u, useNA = "no")
    if (length(u) == 0 || sum(u) == 0) { res <- "NA" }
    else { res <- sprintf( "%0.0f%%", 100 * u[1] / sum(u) ) }
  } else {
    u <- mean(u, na.rm=TRUE)
    if(is.na(u)) { res <- "NA" }
    else { res <- sprintf( ifelse( abs(u) < 1, "%0.2f", "%0.0f" ), u ) }
  }
  return( res )
}
# Same function, for data.frames
g <- function(d) do.call( data.frame, lapply(d, f) )

library(plyr)
ddply(data.frame(a,b,c), .(groups), g)
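
The same pattern can also be sketched in base R, without plyr: split the rows by group, then apply a summariser that always returns one string to every column of each piece. The names fmt_stat, toy and grp below are made up for this illustration, and the toy values are invented.

```r
# A summariser that always returns exactly one string.
fmt_stat <- function(u) {
  if (is.factor(u) || is.character(u)) {
    tab <- table(u, useNA = "no")
    if (length(tab) == 0 || sum(tab) == 0) return("NA")
    # Share of the first level, as a whole percentage
    return(sprintf("%0.0f%%", 100 * tab[[1]] / sum(tab)))
  }
  m <- mean(u, na.rm = TRUE)
  if (is.na(m)) return("NA")
  # Two decimals for small means, none otherwise
  sprintf(if (abs(m) < 1) "%0.2f" else "%0.0f", m)
}

# Toy data (invented values, for illustration only).
toy <- data.frame(
  x   = c(10, 12, 20, 24),
  y   = c(0.1, 0.3, -0.2, 0.4),
  sex = factor(c("female", "male", "male", "male"))
)
grp <- c("A", "A", "B", "B")

# Split rows by group, summarise every column, and put groups in rows.
pieces <- split(toy, grp)
res <- t(sapply(pieces, function(piece) vapply(piece, fmt_stat, character(1))))
print(res)
```

Because vapply insists on a single character value per column, a summariser that returned a number or NULL would fail immediately instead of silently producing a ragged result.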

Since you want LaTeX tables, you may also want to try the following, which does not group the data, but adds sparkline histograms for the numeric variables.

library(Hmisc)
d <- data.frame(a, b, c)  # assumption: d is not defined above; taken to be the example data
latex(describe(d), file="")

Take a look at the tables package, which may make this process simpler.
