
R: How to get a proper LaTeX regression table from a data frame?

Consider the following example:

inds <- c('var1','','var2','')
model1 <- c(10.2,0.00,0.02,0.3)
model2 <- c(11.2,0.01,0.02,0.023)

df <- data.frame(inds, model1, model2)
df
 inds model1 model2
 var1  10.20 11.200
        0.00  0.010
 var2   0.02  0.020
        0.30  0.023

Here you have the output of a custom regression model with coefficients and p-values (I could show other statistics instead if needed, say, the standard errors of the coefficients).

There are two variables, var1 and var2.

For instance, in model1, var1 has a coefficient of 10.2 and a p-value of 0.00, while var2 has a coefficient of 0.02 and a p-value of 0.30.

Is there a package that handles these (custom) tables automatically and can create a neat LaTeX table with stars for significance?

Thanks!

Here is a solution using texreg.

Note that texreg >= 1.36.18 is required.

The information you provide in the data frame (coefficients and p-values) could be arranged in arbitrary ways. Therefore, we need to write code that selects these data from the appropriate places in the data frame and uses them to create a texreg object. Since you are asking for a generic (and presumably reusable) solution, we should wrap the code in a reusable function. I'll call this function extractFromDataFrame. Here is the function, which extracts the information from the data frame and creates a list of texreg objects, one per model:

library("texreg")

extractFromDataFrame <- function (dataFrame) {
  # odd rows hold the coefficients, even rows hold the matching p-values
  coef.row.indices <- seq(1, nrow(dataFrame) - 1, 2)
  pval.row.indices <- seq(2, nrow(dataFrame), 2)
  texregObjects <- list()
  # the first column holds the labels; every other column is one model
  for (i in 2:ncol(dataFrame)) {
    coefs <- dataFrame[coef.row.indices, i]
    coefnames <- as.character(dataFrame[coef.row.indices, 1])
    pvalues <- dataFrame[pval.row.indices, i]
    # build a texreg object from the extracted values
    tr <- createTexreg(coef = coefs, coef.names = coefnames, pvalues = pvalues)
    texregObjects[[i - 1]] <- tr
  }
  return(texregObjects)
}

In this function, we first define which rows of the data frame hold the coefficients and which rows hold the p-values. Then we create an empty list in which to store the texreg objects. We iterate through all columns except the first, because the first one contains only the labels. For each of these model columns, we save the coefficients, their names, and the p-values, and then hand them over to the createTexreg constructor, a function that creates a texreg object from the data. We add the texreg object to the list. In the end, we return the list of texreg objects.
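
To make the row selection concrete, here is a minimal base-R sketch of the indexing step alone, applied to the example data from the question. The name regtab is only an illustrative stand-in for the data frame, and only the model1 column is extracted.

```r
# Example data frame from the question (named regtab here for illustration).
inds   <- c("var1", "", "var2", "")
model1 <- c(10.2, 0.00, 0.02, 0.3)
model2 <- c(11.2, 0.01, 0.02, 0.023)
regtab <- data.frame(inds, model1, model2)

# Odd rows hold coefficients, even rows hold the matching p-values.
coef.row.indices <- seq(1, nrow(regtab) - 1, 2)
pval.row.indices <- seq(2, nrow(regtab), 2)

# Pull out the labels and the values for one model column.
coefnames <- as.character(regtab[coef.row.indices, 1])
coefs     <- regtab[coef.row.indices, "model1"]
pvalues   <- regtab[pval.row.indices, "model1"]

print(data.frame(coefnames, coefs, pvalues))
```

If a data frame used a different layout (for example, three rows per variable), only the two seq() calls would need to change.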

We can now apply the function to any data frame that is laid out like the one in the question, with an arbitrary number of columns (> 1). After applying the function to the df object, we can print the contents of the list to make sure that everything worked:

tr <- extractFromDataFrame(df)
tr

And indeed, the results contain the relevant data:

[[1]]

No standard errors were defined for this texreg object.
No decimal places were defined for the GOF statistics.

     coef.   p
var1 10.20 0.0
var2  0.02 0.3

No GOF block defined.

[[2]]

No standard errors were defined for this texreg object.
No decimal places were defined for the GOF statistics.

     coef.     p
var1 11.20 0.010
var2  0.02 0.023

No GOF block defined.

Now we can simply hand the list of texreg objects over to screenreg, e.g., screenreg(tr), with the following result:

========================
      Model 1    Model 2
------------------------
var1  10.20 ***  11.20 *
var2   0.02       0.02 *
========================
*** p < 0.001, ** p < 0.01, * p < 0.05

The same list can be passed to htmlreg to create an HTML table or, as requested in the original question, to texreg to create a LaTeX table. The output of texreg(tr, single.row = TRUE) looks like this:

\begin{table}
\begin{center}
\begin{tabular}{l c c }
\hline
 & Model 1 & Model 2 \\
\hline
var1 & $10.20^{***}$ & $11.20^{*}$ \\
var2 & $0.02$        & $0.02^{*}$  \\
\hline
\multicolumn{3}{l}{\scriptsize{$^{***}p<0.001$, $^{**}p<0.01$, $^*p<0.05$}}
\end{tabular}
\caption{Statistical models}
\label{table:coefficients}
\end{center}
\end{table}

This solution can be modified to accommodate standard errors, confidence intervals, or goodness-of-fit statistics.
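
As a rough sketch of such a modification, createTexreg also accepts se, gof.names, gof, and gof.decimal arguments. The coefficients and p-values below are those of model1 from the question; the standard errors and the number of observations are invented placeholders, since the question does not supply them.

```r
library("texreg")

# Coefficients and p-values of model1 from the question; the standard
# errors and the number of observations are invented placeholders.
tr.se <- createTexreg(
  coef.names  = c("var1", "var2"),
  coef        = c(10.2, 0.02),
  se          = c(1.5, 0.019),   # placeholder values
  pvalues     = c(0.00, 0.30),
  gof.names   = "Num. obs.",
  gof         = 100,             # placeholder value
  gof.decimal = FALSE            # print this GOF value without decimals
)

screenreg(tr.se)
```

In a data-frame-based workflow, the extraction function would need an additional set of row indices for wherever the standard errors are stored.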

Various texreg arguments can be used to customize the output, for example the booktabs argument for rules from the booktabs package or the dcolumn argument for decimal alignment.
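
A small sketch of such customization, assuming the two models from the question are rebuilt by hand as texreg objects (the names m1, m2, and latex.table are illustrative):

```r
library("texreg")

# Two texreg objects built from the values in the question.
m1 <- createTexreg(coef.names = c("var1", "var2"),
                   coef = c(10.2, 0.02), pvalues = c(0.00, 0.30))
m2 <- createTexreg(coef.names = c("var1", "var2"),
                   coef = c(11.2, 0.02), pvalues = c(0.01, 0.023))

# booktabs = TRUE switches to booktabs rules; dcolumn = TRUE aligns the
# numbers at the decimal point. The LaTeX document then needs both packages.
latex.table <- texreg(
  list(m1, m2),
  booktabs           = TRUE,
  dcolumn            = TRUE,
  custom.model.names = c("Model 1", "Model 2"),
  caption            = "Statistical models",
  label              = "table:coefficients"
)
print(latex.table)
```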

Please note that you should not call your data frame df, because that name already belongs to a function in the stats package (the density of the F distribution).
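
A brief illustration of this naming point; regtab is just one possible alternative name, not something prescribed by texreg:

```r
# stats::df is the density function of the F distribution.
print(is.function(stats::df))

# A more descriptive, hypothetical name avoids masking that function.
regtab <- data.frame(
  inds   = c("var1", "", "var2", ""),
  model1 = c(10.2, 0.00, 0.02, 0.3),
  model2 = c(11.2, 0.01, 0.02, 0.023)
)
print(regtab)
```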
