R data.table way to create summary statistics table with self-defined function

I am in the process of switching to data.table and so far have not found a data.table way to create a table of summary statistics based on a self-defined function. Until now I have used dplyr for this, with the code below. Is it possible to achieve something similar in a neat way using data.table?

library(dplyr)
library(mlbench)
data(BostonHousing)
df <- BostonHousing

fun_stats <- function(x) {
  min <- min(x, na.rm = TRUE)
  max <- max(x, na.rm = TRUE)
  mean <- mean(x, na.rm = TRUE)
  summary <- list(min = min, max = max, mean = mean)
}

stats <- df %>%
  select_if(is.numeric) %>%
  purrr::map(fun_stats) %>%
  bind_rows(., .id = "var") %>%
  mutate(across(where(is.numeric)))

library(data.table)
library(mlbench)
data(BostonHousing)
dt <- as.data.table(BostonHousing)

fun_stats <- function(x) {
  min <- min(x, na.rm = TRUE)
  max <- max(x, na.rm = TRUE)
  mean <- mean(x, na.rm = TRUE)
  summary <- list(min = min, max = max, mean = mean)
}

dt[, rbindlist(lapply(.SD, fun_stats), idcol = "var"), 
   .SDcols = is.numeric]
#>         var       min      max        mean
#>      <char>     <num>    <num>       <num>
#>  1:    crim   0.00632  88.9762   3.6135236
#>  2:      zn   0.00000 100.0000  11.3636364
#>  3:   indus   0.46000  27.7400  11.1367787
#>  4:     nox   0.38500   0.8710   0.5546951
#>  5:      rm   3.56100   8.7800   6.2846344
#>  6:     age   2.90000 100.0000  68.5749012
#>  7:     dis   1.12960  12.1265   3.7950427
#>  8:     rad   1.00000  24.0000   9.5494071
#>  9:     tax 187.00000 711.0000 408.2371542
#> 10: ptratio  12.60000  22.0000  18.4555336
#> 11:       b   0.32000 396.9000 356.6740316
#> 12:   lstat   1.73000  37.9700  12.6530632
#> 13:    medv   5.00000  50.0000  22.5328063

Created on 2022-06-24 by the reprex package (v2.0.1)
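
To see how the pieces fit together without needing mlbench, here is a minimal sketch of the same pattern on a small made-up table. The table `toy` and its columns `a`, `b`, and `g` are hypothetical and only for illustration. `fun_stats` returns the list directly instead of assigning it to a local variable, which should behave the same while also printing visibly when called on its own.

```r
library(data.table)

# Hypothetical toy data (not from the original post).
# Column `a` contains an NA to exercise na.rm = TRUE.
toy <- data.table(
  a = c(1, 2, 3, NA),
  b = c(10, 20, 30, 40),
  g = c("x", "x", "y", "y")  # non-numeric, so .SDcols = is.numeric skips it
)

# Same statistics as in the post, returned as a named list.
fun_stats <- function(x) {
  list(
    min  = min(x, na.rm = TRUE),
    max  = max(x, na.rm = TRUE),
    mean = mean(x, na.rm = TRUE)
  )
}

# lapply(.SD, fun_stats) yields one list(min, max, mean) per numeric column,
# named after the column. rbindlist() stacks those lists as rows, and
# idcol = "var" stores the list names in a column called "var".
stats <- toy[, rbindlist(lapply(.SD, fun_stats), idcol = "var"),
             .SDcols = is.numeric]

print(stats)
```

The intermediate result is a named list of lists, so `rbindlist()` does the reshaping that `bind_rows(.id = "var")` does in the dplyr version.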
