简体   繁体   English

qwraps2 中的 summary_table 与 R 中的 group_by

[英]summary_table in qwraps2 with group_by in R

I am trying out the qwraps2 package and some of its functions.我正在试用 qwraps2 包及其一些功能。 In particular I am interested in the summary_table tool for output.我对用于输出的 summary_table 工具特别感兴趣。 I am using the iris data set for practice, but I noticed something strange when using group_by in the summary_table:我正在使用 iris 数据集进行练习,但是在 summary_table 中使用 group_by 时我注意到一些奇怪的事情:

library(datasets)
data("iris")
options(qwraps2_markup = "markdown")
our_summary1 <-
  list("Sepal Length" =
       list("min" = ~ min(iris$Sepal.Length),
            "max" = ~ max(iris$Sepal.Length),
            "mean (sd)" = ~ qwraps2::mean_sd(iris$Sepal.Length)),
       "Sepal Width" =
       list("min" = ~ min(iris$Sepal.Width),
            "median" = ~ median(iris$Sepal.Width),
            "max" = ~ max(iris$Sepal.Width),
            "mean (sd)" = ~ qwraps2::mean_sd(iris$Sepal.Width)),
       "Petal Length" =
       list("min" = ~ min(iris$Petal.Length),
            "max" = ~ max(iris$Petal.Length),
            "mean (sd)" = ~ qwraps2::mean_sd(iris$Sepal.Length)),
       "Petal Width" =
       list("min" = ~ min(iris$Petal.Width),
            "max" = ~ max(iris$Petal.Width),
            "mean (sd)" = ~ qwraps2::mean_sd(iris$Petal.Width)),
        "Species" =
       list("Setosa" = ~ qwraps2::n_perc0(iris$Species == "setosa"),
            "Versicolor"  = ~ qwraps2::n_perc0(iris$Species == "versicolor"),
            "Virginica"  = ~ qwraps2::n_perc0(iris$Species == "virginica"))
       )

bytype <- qwraps2::summary_table(dplyr::group_by(iris,Species),our_summary1)
bytype

The output i get is: output from the above code我得到的输出是:上面代码的输出

This doesnt make sense, it says that the statistics on different variables across different flower species are the same, which they are not.这是没有意义的,它说不同花种的不同变量的统计数据是相同的,但事实并非如此。 I cross checked this by doing:我通过执行以下操作来交叉检查:

aggregate(iris[1:4], list(iris$Species), mean)

which shows that for example the mean of the different variables varies across species.这表明例如不同变量的平均值因物种而异。

Why is dplyr::group_by not doing what it should?为什么dplyr::group_by没有做它应该做的事?

i posted the output as best I could, sorry and thank you for the comprehension.我尽我所能发布了输出,抱歉并感谢您的理解。

Try using .data$ instead of iris$ when declaring the variables.在声明变量时尝试使用 .data$ 而不是 iris$ 。 I had the same problem and this fixed it.我有同样的问题,这解决了它。

The reason the group_by call does not appear to do anything is because the data pronoun .data is not being used in the summary definition. group_by调用似乎没有做任何事情的原因是因为数据代词.data没有在摘要定义中使用。 As written, the summary table is constructed based on the whole iris data set, regardless of any grouping or subsetting.正如所写的,汇总表是基于整个iris数据集构建的,而不管任何分组或子集。 The .data pronoun is needed so that the the tidyverse tools behind summary_table use the correct scoping.需要.data代词,以便summary_table后面的tidyverse工具使用正确的范围。

library(datasets)
library(qwraps2)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

data("iris")
options(qwraps2_markup = "markdown")

our_summary1 <-
  list("Sepal Length" =
       list("min" = ~ min(.data$Sepal.Length),
            "max" = ~ max(.data$Sepal.Length),
            "mean (sd)" = ~ qwraps2::mean_sd(.data$Sepal.Length)),
       "Sepal Width" =
       list("min" = ~ min(.data$Sepal.Width),
            "median" = ~ median(.data$Sepal.Width),
            "max" = ~ max(.data$Sepal.Width),
            "mean (sd)" = ~ qwraps2::mean_sd(.data$Sepal.Width)),
       "Petal Length" =
       list("min" = ~ min(.data$Petal.Length),
            "max" = ~ max(.data$Petal.Length),
            "mean (sd)" = ~ qwraps2::mean_sd(.data$Sepal.Length)),
       "Petal Width" =
       list("min" = ~ min(.data$Petal.Width),
            "max" = ~ max(.data$Petal.Width),
            "mean (sd)" = ~ qwraps2::mean_sd(.data$Petal.Width)),
        "Species" =
       list("Setosa" = ~ qwraps2::n_perc0(.data$Species == "setosa"),
            "Versicolor"  = ~ qwraps2::n_perc0(.data$Species == "versicolor"),
            "Virginica"  = ~ qwraps2::n_perc0(.data$Species == "virginica"))
       )


bytype <- qwraps2::summary_table(dplyr::group_by(iris,Species),our_summary1)
bytype
#> 
#> 
#> |                        |Species: setosa (N = 50) |Species: versicolor (N = 50) |Species: virginica (N = 50) |
#> |:-----------------------|:------------------------|:----------------------------|:---------------------------|
#> |**Sepal Length**        |&nbsp;&nbsp;             |&nbsp;&nbsp;                 |&nbsp;&nbsp;                |
#> |&nbsp;&nbsp; min        |4.3                      |4.9                          |4.9                         |
#> |&nbsp;&nbsp; max        |5.8                      |7.0                          |7.9                         |
#> |&nbsp;&nbsp; mean (sd)  |5.01 &plusmn; 0.35       |5.94 &plusmn; 0.52           |6.59 &plusmn; 0.64          |
#> |**Sepal Width**         |&nbsp;&nbsp;             |&nbsp;&nbsp;                 |&nbsp;&nbsp;                |
#> |&nbsp;&nbsp; min        |2.3                      |2.0                          |2.2                         |
#> |&nbsp;&nbsp; median     |3.4                      |2.8                          |3.0                         |
#> |&nbsp;&nbsp; max        |4.4                      |3.4                          |3.8                         |
#> |&nbsp;&nbsp; mean (sd)  |3.43 &plusmn; 0.38       |2.77 &plusmn; 0.31           |2.97 &plusmn; 0.32          |
#> |**Petal Length**        |&nbsp;&nbsp;             |&nbsp;&nbsp;                 |&nbsp;&nbsp;                |
#> |&nbsp;&nbsp; min        |1.0                      |3.0                          |4.5                         |
#> |&nbsp;&nbsp; max        |1.9                      |5.1                          |6.9                         |
#> |&nbsp;&nbsp; mean (sd)  |5.01 &plusmn; 0.35       |5.94 &plusmn; 0.52           |6.59 &plusmn; 0.64          |
#> |**Petal Width**         |&nbsp;&nbsp;             |&nbsp;&nbsp;                 |&nbsp;&nbsp;                |
#> |&nbsp;&nbsp; min        |0.1                      |1.0                          |1.4                         |
#> |&nbsp;&nbsp; max        |0.6                      |1.8                          |2.5                         |
#> |&nbsp;&nbsp; mean (sd)  |0.25 &plusmn; 0.11       |1.33 &plusmn; 0.20           |2.03 &plusmn; 0.27          |
#> |**Species**             |&nbsp;&nbsp;             |&nbsp;&nbsp;                 |&nbsp;&nbsp;                |
#> |&nbsp;&nbsp; Setosa     |50 (100)                 |0 (0)                        |0 (0)                       |
#> |&nbsp;&nbsp; Versicolor |0 (0)                    |50 (100)                     |0 (0)                       |
#> |&nbsp;&nbsp; Virginica  |0 (0)                    |0 (0)                        |50 (100)                    |

Created on 2020-03-01 by the reprex package (v0.3.0)reprex 包(v0.3.0) 于 2020 年 3 月 1 日创建

在此处输入图片说明

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM