R data.table way to create summary statistics table with self-defined function
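I am in the process of switching to data.table, and so far I have not found a data.table way to create a summary statistics table based on a self-defined function. Until now I have used dplyr for this task, and the code for that is below. Is it possible to achieve something similar in a concise way with data.table?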
library(dplyr)
library(mlbench)
data(BostonHousing)
df <- BostonHousing
# Return min, max, and mean of a numeric vector as a named list
fun_stats <- function(x) {
  list(
    min  = min(x, na.rm = TRUE),
    max  = max(x, na.rm = TRUE),
    mean = mean(x, na.rm = TRUE)
  )
}

# One row per numeric column, with the column name stored in "var"
stats <- df %>%
  select(where(is.numeric)) %>%
  purrr::map(fun_stats) %>%
  bind_rows(.id = "var")
library(data.table)
library(mlbench)
data(BostonHousing)
dt <- as.data.table(BostonHousing)
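# Return min, max, and mean of a numeric vector as a named list
fun_stats <- function(x) {
  list(
    min  = min(x, na.rm = TRUE),
    max  = max(x, na.rm = TRUE),
    mean = mean(x, na.rm = TRUE)
  )
}

# Apply fun_stats to every numeric column, then stack the results into rows;
# idcol = "var" stores each column's name
dt[, rbindlist(lapply(.SD, fun_stats), idcol = "var"),
   .SDcols = is.numeric]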
#> var min max mean
#> <char> <num> <num> <num>
#> 1: crim 0.00632 88.9762 3.6135236
#> 2: zn 0.00000 100.0000 11.3636364
#> 3: indus 0.46000 27.7400 11.1367787
#> 4: nox 0.38500 0.8710 0.5546951
#> 5: rm 3.56100 8.7800 6.2846344
#> 6: age 2.90000 100.0000 68.5749012
#> 7: dis 1.12960 12.1265 3.7950427
#> 8: rad 1.00000 24.0000 9.5494071
#> 9: tax 187.00000 711.0000 408.2371542
#> 10: ptratio 12.60000 22.0000 18.4555336
#> 11: b 0.32000 396.9000 356.6740316
#> 12: lstat 1.73000 37.9700 12.6530632
#> 13: medv 5.00000 50.0000 22.5328063
Created on 2022-06-24 by the reprex package (v2.0.1)
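To see why the one-liner above works, it may help to split it into its two steps on a small table. This is only a sketch: `toy`, `per_col`, and `res` are hypothetical names that do not appear in the original post, and it assumes a data.table version that accepts a function in `.SDcols` (1.13.0 or later, as far as I know).

```r
library(data.table)

# Hypothetical toy table (not from the question): two numeric columns, one character
toy <- data.table(a = c(1, 2, 3), b = c(10, NA, 30), g = c("x", "y", "z"))

# Same helper as in the answer: named list of min, max, and mean
fun_stats <- function(x) {
  list(
    min  = min(x, na.rm = TRUE),
    max  = max(x, na.rm = TRUE),
    mean = mean(x, na.rm = TRUE)
  )
}

# Step 1: keep only numeric columns, then build one stats list per column;
# the result is a named list with elements "a" and "b"
per_col <- lapply(toy[, .SD, .SDcols = is.numeric], fun_stats)

# Step 2: stack the per-column lists into rows; idcol = "var" turns the
# list names into a column
res <- rbindlist(per_col, idcol = "var")
print(res)
```

Here `res` should contain one row for `a` and one for `b`, with the character column `g` dropped by `.SDcols = is.numeric`. Because `fun_stats` returns a named list of length-one values, `rbindlist` can treat each element as a single row.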