# How to generate a summary statistics table with all relevant decimal places to appear in the resulting table in R?

I have an exceptionally large dataset (50+ sites, 100+ solutes) and I would like to quickly generate a summary table of descriptive statistics for the data and be able to export it as a .csv file.

Sample code (a very small subset of my data):

```r
library(dplyr)
library(tidyr)
library(rstatix)  # provides get_summary_stats()

Site <- c("SC2", "SC2", "SC2", "SC3", "SC3", "SC3", "SC4", "SC4", "SC4", "SC4", "SC4")
Aluminum <- as.numeric(c(0.0565, 0.0668, 0.0785, 0.0292, 0.0576, 0.075, 0.029, 0.088, 0.076, 0.007, 0.107))
Antimony <- as.numeric(c(0.0000578, 0.0000698, 0.0000215, 0.000025, 0.0000389, 0.0000785, 0.0000954, 0.00005447, 0.00007843, 0.000025, 0.0000124))

stats_data <- data.frame(Site, Aluminum, Antimony, stringsAsFactors = FALSE)

# Reshape to long format: one row per Site/Solute measurement
stats_data_gather <- stats_data %>% gather(Solute, value, -Site)

# Summary statistics per Site and Solute
table_test <- stats_data_gather %>%
  group_by(Site, Solute) %>%
  get_summary_stats(value, show = c("mean", "sd", "min", "q1", "median", "q3", "max"))
```

This results in a data frame that calculates the required statistics, BUT the results are truncated to only three decimal places (i.e. what should be something like 0.00000057 appears as 0.000).

I have tried variations of:

```r
options(digits = XX)
format(DF, format = "e", digits = 2)
format.data.frame(table_test, digits = 8)
```

I have tried these and other sample code found online, but none produces a summary data frame that keeps all the necessary zeros for small results (i.e. 0.00000057, not 0.000). I would even be fine with scientific notation, but I haven't found an example that works.

This is my first post. I hope I have provided enough detail for help! Thanks!

It does not work because `get_summary_stats` is hardcoded to round its results to 3 digits:

```r
get_summary_stats
function (data, ..., type = c("full", "common", "robust", "five_number",
    "quantile", "mean", "median", "min", "max"), show = NULL,
    probs = seq(0, 1, 0.25))
{
    .....
    dplyr::mutate_if(is.numeric, round, digits = 3)
    if (!is.null(show)) {
        show <- unique(c("variable", "n", show))
        results <- results %>% select(!!!syms(show))
    }
    results
}
```

You can either modify the source of `get_summary_stats` to drop that rounding step or, for what you are doing, use `summarise_all` as below:

```r
library(dplyr)
library(tidyr)

# mean and sd per group, plus the six summary() values kept in a list column
stats_data_gather %>%
  group_by(Site, Solute) %>%
  summarise_all(list(~mean(.), ~sd(.), ~list(c(summary(.))))) %>%
  unnest_wider(list)  # spread the list column into one column per statistic

# A tibble: 6 x 10
# Groups:   Site [3]
  Site  Solute    mean      sd    Min. `1st Qu.`  Median    Mean `3rd Qu.`
  <chr> <chr>    <dbl>   <dbl>   <dbl>     <dbl>   <dbl>   <dbl>     <dbl>
1 SC2   Alumi… 6.73e-2 1.10e-2 5.65e-2 0.0616    6.68e-2 6.73e-2 0.0726
2 SC2   Antim… 4.97e-5 2.51e-5 2.15e-5 0.0000396 5.78e-5 4.97e-5 0.0000638
3 SC3   Alumi… 5.39e-2 2.31e-2 2.92e-2 0.0434    5.76e-2 5.39e-2 0.0663
4 SC3   Antim… 4.75e-5 2.78e-5 2.50e-5 0.0000320 3.89e-5 4.75e-5 0.0000587
5 SC4   Alumi… 6.14e-2 4.19e-2 7.00e-3 0.029     7.60e-2 6.14e-2 0.088
6 SC4   Antim… 5.31e-5 3.49e-5 1.24e-5 0.000025  5.45e-5 5.31e-5 0.0000784
# … with 1 more variable: Max. <dbl>
```

The column names are a bit awkward, but you can easily rename them, for example `1st Qu.` and `3rd Qu.` to q1 and q3.
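A possible variation, offered as a sketch rather than a definitive answer: naming each statistic explicitly in `summarise()` avoids the renaming step, and `write.csv()` covers the export the question asks about. The `stats_long` data frame and the file name `solute_summary.csv` are placeholders introduced here for illustration, and the code assumes dplyr 1.0.0 or later for the `.groups` argument.

```r
library(dplyr)

# Small stand-in for the long-format data from the question (hypothetical name)
stats_long <- data.frame(
  Site   = c("SC2", "SC2", "SC2", "SC3", "SC3", "SC3"),
  Solute = "Antimony",
  value  = c(0.0000578, 0.0000698, 0.0000215, 0.000025, 0.0000389, 0.0000785)
)

# One named column per statistic; nothing is rounded along the way
summary_tbl <- stats_long %>%
  group_by(Site, Solute) %>%
  summarise(
    n      = n(),
    mean   = mean(value),
    sd     = sd(value),
    min    = min(value),
    q1     = unname(quantile(value, 0.25)),
    median = median(value),
    q3     = unname(quantile(value, 0.75)),
    max    = max(value),
    .groups = "drop"
  )

# write.csv() writes numbers with up to 15 significant digits,
# so small values are not collapsed to 0.000 in the file
out_file <- file.path(tempdir(), "solute_summary.csv")  # placeholder path
write.csv(summary_tbl, out_file, row.names = FALSE)
```

If scientific notation is preferred in the exported file, the numeric columns can be passed through `format(x, scientific = TRUE)` before writing, at the cost of turning them into character strings.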

You can use the `summary` function for the stats you are looking for:

```r
sum.table <- summary(stats_data_gather)
```

Then you can extract the summarized values from the 3rd column using:

```r
as.numeric(sub('.*:', '', sum.table[,3]))
```
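One caveat worth checking with this approach: `summary()` on a data frame returns a table of already-formatted text, so numbers parsed back out of it carry only the digits that were printed. A minimal sketch of an alternative, assuming the statistics for a single numeric vector are enough and a reasonably recent R (3.4.0 or later, where `summary()` on a vector no longer rounds its values), is to call `summary()` on the vector itself and drop its class to get plain doubles. The variable names below are illustrative.

```r
# Antimony values from the question's sample data
antimony <- c(0.0000578, 0.0000698, 0.0000215, 0.000025, 0.0000389,
              0.0000785, 0.0000954, 0.00005447, 0.00007843, 0.000025, 0.0000124)

# summary() on a numeric vector returns a named numeric object;
# unclass() drops the "summaryDefault" class and leaves plain doubles
stats <- unclass(summary(antimony))

# signif() keeps a chosen number of significant digits without zeroing small values
stats_3sig <- signif(stats, 3)

# format() gives scientific-notation strings if that is preferred for display
stats_sci <- format(stats, scientific = TRUE, digits = 3)
```

Unlike `round(x, 3)`, which counts digits after the decimal point, `signif()` counts significant digits, which is why values on the order of 1e-5 survive.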