R: How to summarize a data.table in long format using the column names of the original wide table as rows?
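I have data in wide format that I want to summarize with the median and IQR in long format. The summary itself works fine, but I then need to clean up and convert the column names to generate rows from them.
How can this be done?
MWE with the desired output: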
require(data.table)
# Demo data
dt <- data.table("c" = c(1,2,3,1,1,2,3,2), "323.1" = c(1,2,3,4,1,3,4,8), "454.3" = c(1,3,2,4,5,6,7,8)) # Real data has plenty more columns in the style "number.number"
# Create summary
dt[, unlist(recursive=FALSE, lapply(
.(med = median, iqr = IQR),
function(f) lapply(.SD, f)
)), by=.(c)]
# Desired output (values corrected to match the demo data):
(dt_output <- data.table("c" = c(1,1,2,2,3,3), "var" = rep(c("323.1", "454.3"), 3), "med" = c(1,4,3,6,3.5,4.5), "iqr" = c(1.5,2,3,2.5,0.5,2.5)))
Output after summarizing:
   c med.323.1 med.454.3 iqr.323.1 iqr.454.3
1: 1       1.0       4.0       1.5       2.0
2: 2       3.0       6.0       3.0       2.5
3: 3       3.5       4.5       0.5       2.5
Desired output:
   c   var med iqr
1: 1 323.1 1.0 1.5
2: 1 454.3 4.0 2.0
3: 2 323.1 3.0 3.0
4: 2 454.3 6.0 2.5
5: 3 323.1 3.5 0.5
6: 3 454.3 4.5 2.5
Thanks!
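For the literal request, turning the column names of the already summarized wide table into rows, a minimal sketch could look like the following. It assumes every summary column is named `<stat>.<original name>`, so the name can be split at the first dot only; `wide`, `long`, `res`, `name`, and `stat` are illustrative names that do not appear in the question.

```r
library(data.table)

# Demo data from the question
dt <- data.table("c" = c(1,2,3,1,1,2,3,2),
                 "323.1" = c(1,2,3,4,1,3,4,8),
                 "454.3" = c(1,3,2,4,5,6,7,8))

# Wide summary, as in the question (list() written out instead of .())
wide <- dt[, unlist(recursive = FALSE, lapply(
  list(med = median, iqr = IQR),
  function(f) lapply(.SD, f)
)), by = .(c)]

# Stack all summary columns; keep the old column names as character
long <- melt(wide, id.vars = "c", variable.name = "name",
             variable.factor = FALSE)

# Split at the FIRST dot only, because the original names contain dots too
long[, stat := sub("\\..*$", "", name)]    # "med.323.1" -> "med"
long[, var := sub("^[^.]*\\.", "", name)]  # "med.323.1" -> "323.1"

# One row per (c, var), one column per statistic
res <- dcast(long, c + var ~ stat, value.var = "value")
setcolorder(res, c("c", "var", "med", "iqr"))
res
```

Splitting with `tstrsplit(name, ".", fixed = TRUE)` would break names such as `323.1` apart, which is why the sketch uses `sub()` anchored at the first dot.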
One option is to melt the data before doing the calculation:
melt(dt, id.vars="c")[, .(med = median(value), iqr = IQR(value)), .(c, variable)]
Output:
   c variable med iqr
1: 1    323.1 1.0 1.5
2: 2    323.1 3.0 3.0
3: 3    323.1 3.5 0.5
4: 1    454.3 4.0 2.0
5: 2    454.3 6.0 2.5
6: 3    454.3 4.5 2.5
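The melted result lists all rows of one variable before the next and calls the name column `variable`. To get the `c`, `var` layout asked for in the question, a small variation might be enough. This is only a sketch: it uses the `variable.name` and `variable.factor` arguments of `melt()` and replaces `by` with `keyby` so the result is sorted by the grouping columns; `res` is an illustrative name.

```r
library(data.table)

# Demo data from the question
dt <- data.table("c" = c(1,2,3,1,1,2,3,2),
                 "323.1" = c(1,2,3,4,1,3,4,8),
                 "454.3" = c(1,3,2,4,5,6,7,8))

# Melt first, naming the former column names "var" (kept as character),
# then summarize per (c, var); keyby sorts the result by c, then var
res <- melt(dt, id.vars = "c", variable.name = "var",
            variable.factor = FALSE)[
  , .(med = median(value), iqr = IQR(value)), keyby = .(c, var)]
res
```

Because `var` is a character column here, the ordering within each `c` is alphabetical; with many `number.number` columns that may differ from numeric order.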
You can use gather() to reshape your data and then summarise() to get the statistics, like this:
library(tidyverse)
dt %>% gather(., var, value, -c) %>%
group_by(c, var) %>% summarise(med = median(value),
iqr = IQR(value))
# A tibble: 6 x 4
# Groups:   c [3]
      c var     med   iqr
  <dbl> <chr> <dbl> <dbl>
1     1 323.1   1     1.5
2     1 454.3   4     2
3     2 323.1   3     3
4     2 454.3   6     2.5
5     3 323.1   3.5   0.5
6     3 454.3   4.5   2.5
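In current tidyr releases gather() is superseded by pivot_longer(), so the same idea could be sketched as below. This is not part of the original answer: the demo data is rebuilt as a tibble so the block runs on its own, `res` is an illustrative name, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)

# Demo data from the question, rebuilt as a tibble
dt <- tibble::tibble(c = c(1,2,3,1,1,2,3,2),
                     `323.1` = c(1,2,3,4,1,3,4,8),
                     `454.3` = c(1,3,2,4,5,6,7,8))

res <- dt %>%
  # Every column except c becomes a (var, value) pair
  pivot_longer(cols = -c, names_to = "var", values_to = "value") %>%
  group_by(c, var) %>%
  # Return an ungrouped tibble with one row per (c, var)
  summarise(med = median(value), iqr = IQR(value), .groups = "drop")
res
```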