How to summarize date data by group in R

Question

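I want to summarize the sample data below into a new data frame with the following columns:

Population, Sample Size (N), Percent Completed (%)

Sample size is the count of all records for each population. I can get this with the table command or with tapply. Percent completed is the percentage of records that have an End_Date (assume that every record without an End_Date was not completed). This is where I am lost!

Sample data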

 sample <- structure(list(Population = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 
    2L, 2L, 3L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 
    1L, 2L, 2L, 3L, 3L, 3L, 3L, 1L, 1L, 3L, 3L, 3L, 3L), .Label = c("Glommen", 
    "Kaseberga", "Steninge"), class = "factor"), Start_Date = structure(c(16032, 
    16032, 16032, 16032, 16032, 16036, 16036, 16036, 16037, 16038, 
    16038, 16039, 16039, 16039, 16039, 16039, 16039, 16041, 16041, 
    16041, 16041, 16041, 16041, 16044, 16044, 16045, 16045, 16045, 
    16045, 16048, 16048, 16048, 16048, 16048, 16048), class = "Date"), 
        End_Date = structure(c(NA, 16037, NA, NA, 16036, 16043, 16040, 
        16041, 16042, 16042, 16042, 16043, 16043, 16043, 16043, 16043, 
        16043, 16045, 16045, 16045, 16045, 16045, NA, 16048, 16048, 
        16049, 16049, NA, NA, 16052, 16052, 16052, 16052, 16052, 
        16052), class = "Date")), .Names = c("Population", "Start_Date", 
    "End_Date"), row.names = c(NA, 35L), class = "data.frame")

Answer 1

You can do this with split / apply / combine:

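# Split the data frame into one piece per population
spl = split(sample, sample$Population)
# Build a one-row summary for each piece; PctComplete is a proportion (0 to 1)
new.rows = lapply(spl, function(x) data.frame(Population=x$Population[1],
                                              SampleSize=nrow(x),
                                              PctComplete=sum(!is.na(x$End_Date))/nrow(x)))
# Stack the one-row summaries back into a single data frame
combined = do.call(rbind, new.rows)
combined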

#           Population SampleSize PctComplete
# Glommen      Glommen         13   0.6923077
# Kaseberga  Kaseberga          7   1.0000000
# Steninge    Steninge         15   0.8666667
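
The question asks for a percentage, while PctComplete above is a proportion between 0 and 1. A minimal sketch of the conversion, assuming the values printed above are retyped by hand into a stand-in data frame; the rounding to two decimals is an illustrative choice, not part of the original answer:

```r
# Stand-in for the `combined` result above; values copied from the printed output
combined <- data.frame(
  Population  = c("Glommen", "Kaseberga", "Steninge"),
  SampleSize  = c(13, 7, 15),
  PctComplete = c(9 / 13, 1, 13 / 15)
)

# Convert the proportion to a percentage (rounding is an assumption for display)
combined$PctComplete <- round(combined$PctComplete * 100, 2)
combined
```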

A word of warning: sample is the name of a base function, so you should choose a different name for your data frame.
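
To illustrate the naming point, here is a small sketch that uses a different name for the data. The name `records` and its three rows are hypothetical, made up for this example. R would still find the function when `sample(...)` is called even if a data frame shared the name, but a distinct name keeps the code easier to read:

```r
# Hypothetical stand-in data under a name that does not clash with base::sample()
records <- data.frame(
  Population = c("Glommen", "Glommen", "Kaseberga"),
  End_Date   = as.Date(c(NA, "2013-11-28", "2013-12-04"))
)

# sample() here is unambiguously the base function: shuffle the row order
shuffled <- records[sample(nrow(records)), ]
```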

Answer 2

This is easy with the plyr package:

library(plyr)
ddply(sample, .(Population), summarize, 
      Sample_Size = length(End_Date),
      Percent_Completed = mean(!is.na(End_Date)) * 100)

#   Population Sample_Size Percent_Completed
# 1    Glommen          13          69.23077
# 2  Kaseberga           7         100.00000
# 3   Steninge          15          86.66667
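
Since the question mentions tapply, the same summary can also be sketched in base R without extra packages. This is only one possible approach, not part of either answer; the data frame `records` and its populations "A" and "B" are hypothetical stand-ins for the real data. It relies on the fact that the mean of a logical vector is the share of TRUE values:

```r
# Hypothetical stand-in data: 3 records for "A" (one unfinished), 2 for "B"
records <- data.frame(
  Population = c("A", "A", "A", "B", "B"),
  End_Date   = as.Date(c(NA, "2013-11-28", "2013-11-27",
                         "2013-12-04", "2013-12-05"))
)

# TRUE where a record has an End_Date, i.e. counts as completed
completed <- !is.na(records$End_Date)

# Count records and average the completion flag within each population
n   <- tapply(completed, records$Population, length)
pct <- tapply(completed, records$Population, mean) * 100

summary_df <- data.frame(
  Population        = names(n),
  Sample_Size       = as.vector(n),
  Percent_Completed = as.vector(pct)
)
summary_df
```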

Answer 1: score 2, accepted, 2013-12-19 20:50:26

Answer 2: score 2, 2013-12-19 20:52:11
