Apply a function to each column of a data frame

Question

I have the following (very large) data frame:

     id         epoch
1     0     1.141194e+12
2     1     1.142163e+12
3     2     1.142627e+12
4     2     1.142627e+12
5     3     1.142665e+12
6     3     1.142665e+12
7     4     1.142823e+12
8     5     1.143230e+12
9     6     1.143235e+12
10    6     1.143235e+12

For every unique ID, I now want to get the difference between its maximum and minimum time (epoch timestamp). Some IDs occur many more times than in the example above, in case that is relevant. I haven't worked much with R yet and tried the following:

# Define the helper before it is used
get_duration = function(id) {
  maxTime = max(df$epoch[which(df$id == id)])
  minTime = min(df$epoch[which(df$id == id)])
  return ((maxTime - minTime) / 1000)
}

# One row per distinct id; named unique_ids to avoid shadowing base::unique
unique_ids = data.frame(as.numeric(unique(df$id)))
differences = apply(unique_ids, 1, get_duration)

It works, but is incredibly slow. What would be a faster approach?

Answer 1

A couple of approaches. In base R:

tapply(df$epoch,df$id,function(x) (max(x)-min(x))/1000)
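To see what the tapply call returns, here is a minimal sketch on a small data frame. The values are made up for illustration (they are not the asker's data), and the names durations and durations_alt are only placeholders.

```r
# Toy data, made up for illustration; epoch is in milliseconds
df <- data.frame(
  id    = c(0, 0, 1, 1, 1, 2),
  epoch = c(1000, 4000, 2000, 2500, 9000, 5000)
)

# Group epoch by id, then compute max - min per group, converted to seconds
durations <- tapply(df$epoch, df$id, function(x) (max(x) - min(x)) / 1000)

# Equivalent formulation using range() and diff()
durations_alt <- tapply(df$epoch, df$id, function(x) diff(range(x)) / 1000)

print(durations)
```

The result is a one-dimensional array whose names are the id values, so a single duration can be looked up with durations["1"].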

With data.table:

require(data.table)
setDT(df)
df[,list(d=(max(epoch)-min(epoch))/1000),by=id]

Answer 2

This can be done easily in dplyr:

require(dplyr)
df %>% group_by(id) %>% summarize(diff=(max(epoch)-min(epoch))/1000)

Answer 3

Filter by id just once and reuse the result:

subset = df$epoch[which(df$id == id)]
maxTime = max(subset)
minTime = min(subset)
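As a sketch of how this fragment might fit into the asker's get_duration function, the example below filters once per id and reuses the result. The toy data and the names times, per_id and by_split are assumptions for illustration. Note that this still scans the whole id column once for every distinct id, so the last lines show a split-based variant that groups the data in one step.

```r
# Toy data, made up for illustration; epoch is in milliseconds
df <- data.frame(
  id    = c(0, 0, 1, 1, 1, 2),
  epoch = c(1000, 4000, 2000, 2500, 9000, 5000)
)

get_duration <- function(id) {
  times <- df$epoch[df$id == id]      # filter once, reuse for max and min
  (max(times) - min(times)) / 1000    # milliseconds to seconds
}

# Apply the helper to each distinct id
ids <- unique(df$id)
per_id <- vapply(ids, get_duration, numeric(1))
names(per_id) <- ids

# Alternative: split epoch by id once, then summarise each group
by_split <- vapply(split(df$epoch, df$id),
                   function(x) (max(x) - min(x)) / 1000,
                   numeric(1))
```

Both per_id and by_split are numeric vectors named by id and should hold the same values.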


Answer 1
3, accepted, 2015-12-15 21:50:39

Answer 2
1, 2015-12-15 21:54:06

Answer 3
-1, 2015-12-15 21:42:34
