Reshaping a data.table using by

Question

I have a data.table in which the data I want is laid out diagonally.

library(data.table)
month <- c(201406, 201406, 201406, 201406, 201406, 201406, 201406, 201406, 
201406, 201406, 201406, 201406)
code <- c("498A01", "498A01", "498A01", "498A01", "498A01", "498A01", "498A01", "498A01", 
"498A01", "498A01", "498A01", "498A01")
col.a <- c("service", "base charge", "", "", "", "", "", "", "", "", "", "")
col.b <- c("", "", "description", "per unit", "", "", "", "", "", "", "", "")
col.c <- c("", "", "", "", "rate", 6859, "", "", "", "", "", "")
col.d <- c("", "", "", "", "", "", "quantity", 1, "", "", "", "")
col.e <- c("", "", "", "", "", "", "", "", "total charge", 6859, "", "")
col.f <- c("", "", "", "", "", "", "", "", "", "", "", "")   
dt <- data.table(month, code, col.a, col.b, col.c, col.d, col.e, col.f)

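However, I need to organize the data in a more coherent way to simplify dt. Since I am still fairly new to data.table, I was wondering whether there is a straightforward way to do this.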

For col.a, I know that the following works for a single column:

dt <- dt[col.a != "", 1:8, by = .(code, month)]

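However, when I try this with multiple columns, it returns a data table with 0 obs. I suppose I could do this for every column and then do some kind of merge, but that seems inefficient and cumbersome. Is there a better way?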

My desired output is:

    month   code       col.a       col.b col.c    col.d        col.e col.f
1: 201406 498A01     service description  rate quantity total charge
2: 201406 498A01 base charge    per unit  6859        1         6859

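So, for each unique combination of code and month, I want to drop the empty cells and collapse the data so that it looks like the above. I need to keep col.f because it may not always be blank.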

Any suggestions would be greatly appreciated.

Answer 1

Are you looking for something like this?

dt[, lapply(.SD, function(x) x[x!=""][1:2]), by=.(month, code)]

Output:

    month   code       col.a       col.b col.c    col.d        col.e col.f
1: 201406 498A01     service description  rate quantity total charge  <NA>
2: 201406 498A01 base charge    per unit  6859        1         6859  <NA>
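Note that the 1:2 hard-codes two rows per group, and an all-blank column such as col.f comes back as NA rather than "". If the number of non-empty values can differ between groups, something along these lines might work. This is only a sketch: the helper names nonempty and pad_to are made up for illustration, the table is a cut-down copy of the one in the question, and it assumes every column in .SD is character.

```r
library(data.table)

# Cut-down copy of the question's table (fewer rows and columns, same layout).
dt <- data.table(
  month = rep(201406, 4),
  code  = rep("498A01", 4),
  col.a = c("service", "base charge", "", ""),
  col.b = c("", "", "description", "per unit"),
  col.f = c("", "", "", "")
)

# Hypothetical helper: drop blank (and NA) entries from a character vector.
nonempty <- function(x) x[!is.na(x) & x != ""]

# Hypothetical helper: pad a character vector with "" up to length n.
pad_to <- function(x, n) c(x, rep("", n - length(x)))

res <- dt[, {
  kept <- lapply(.SD, nonempty)   # non-blank values, column by column
  n <- max(lengths(kept))         # rows needed for this group
  lapply(kept, pad_to, n = n)     # equal-length columns, blanks kept as ""
}, by = .(month, code)]

print(res)
```

Here the row count is taken from the longest column in each group, so nothing is cut off at two rows, and col.f stays "" instead of turning into NA.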

Or in base R:

do.call(rbind, by(dt, paste(dt$month, dt$code), 
    function(y) do.call(cbind, lapply(y, function(x) x[x!=""][1:2]))))

Output:

     month    code     col.a         col.b         col.c  col.d      col.e          col.f
[1,] "201406" "498A01" "service"     "description" "rate" "quantity" "total charge" NA   
[2,] "201406" "498A01" "base charge" "per unit"    "6859" "1"        "6859"         NA
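Keep in mind that the base R version returns a character matrix, so month is now the string "201406" and the blank column is NA. If you need a data frame with a numeric month again, a conversion along these lines may do. It is a sketch only: the matrix m below is a stand-in typed in by hand from part of the output above, not produced by the by() call.

```r
# Stand-in for part of the matrix shown above (values copied from that output).
m <- cbind(
  month = c("201406", "201406"),
  code  = c("498A01", "498A01"),
  col.a = c("service", "base charge"),
  col.f = c(NA, NA)
)

# Turn the NA padding back into empty strings while it is still a matrix.
m[is.na(m)] <- ""

# Convert to a data.frame; every column arrives as character.
df <- as.data.frame(m, stringsAsFactors = FALSE)

# Restore the numeric type of month.
df$month <- as.numeric(df$month)

str(df)
```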

Solution 1
3 2018-08-03 00:36:33
