
GNU R: Use sapply on sapply

I have a list of statuses. Each list element contains the status of one sensor for every minute of a day (1440 entries, each 0 or 1). The list contains all sensors.

For example, statuses[[3]] gives a vector with 1440 entries, containing the 0s and 1s for each minute.

The statuses of all sensors in, let's say, minute 800 are:

sapply(statuses,'[',800)

I'd like to get the number of active sensors (i.e. those showing 1) per minute. How do I do that? Somehow one has to put another sapply() around this...

The solution using a for loop would look like this:

status_ones <- rep(0,1440)
for (k in 1:1440){
  status_ones[k] <- sum(sapply(statuses,'[',k))
}

It seems to me there are several ways to accomplish what you want; this is what jumped out at me first: since every element of the list has the same length, you can treat the list as a data frame and use apply. I illustrate this approach below using simulated data that I believe matches your description of your data (here, five observations for each of three sensors):

set.seed(42)
statuses <- lapply(1:3, function(x) sample(0:1, 5, replace=TRUE))
statuses
# [[1]]
# [1] 1 1 0 1 1
# 
# [[2]]
# [1] 1 1 0 1 1
# 
# [[3]]
# [1] 0 1 1 0 0
status_ones <- apply(as.data.frame(statuses), 1, sum)
status_ones
# [1] 2 3 1 2 2
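
As a possible alternative (not part of the approach above), the same counts can probably be obtained without apply by binding the vectors into a matrix and calling rowSums. This is a minimal sketch that assumes all list elements have equal length; the data is a hard-coded copy of the simulated output shown above, and the name status_matrix is only illustrative:

```r
# Hard-coded copy of the simulated example above (three sensors, five minutes).
statuses <- list(c(1, 1, 0, 1, 1),
                 c(1, 1, 0, 1, 1),
                 c(0, 1, 1, 0, 0))

# Bind the sensors as columns: one row per minute, one column per sensor.
status_matrix <- do.call(cbind, statuses)

# Row sums count the active sensors in each minute.
status_ones <- rowSums(status_matrix)
status_ones
# [1] 2 3 1 2 2
```

Reduce(`+`, statuses) adds the vectors element by element and should give the same counts for this data.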

You can easily confirm by hand that this gives the result you want with this small example. Below you can see the speed benefit of this approach relative to the for loop approach or using sapply on sapply. I created a larger sample (1440 observations for each of three sensors) and used benchmark from the rbenchmark package to see the speed differences:

library(rbenchmark)
statuses <- lapply(1:3, function(x) sample(0:1, 1440, replace=TRUE))
benchmark(apply=apply(as.data.frame(statuses), 1, sum),
          sapply=sapply(1:1440, function(x) sum(sapply(statuses, '[', x))),
          loop=for ( i in 1:1440 ) { sum(sapply(statuses, '[', i)) },
          columns=c('test', 'elapsed', 'relative', 'user.self'),
          order='relative')
    test elapsed relative user.self
1  apply   0.883    1.000     0.660
2 sapply   6.115    6.925     5.616
3   loop   6.305    7.140     5.776
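
The benchmark above only measures time; the loop variant there discards its sums. As a rough sanity check, the sketch below stores the result of each approach and compares them. The seed and the via_* variable names are arbitrary choices made for this illustration:

```r
set.seed(1)  # arbitrary seed, chosen only for reproducibility
statuses <- lapply(1:3, function(x) sample(0:1, 1440, replace = TRUE))

# Approach 1: apply over the rows of a data frame.
via_apply <- apply(as.data.frame(statuses), 1, sum)

# Approach 2: sapply on sapply.
via_sapply <- sapply(1:1440, function(x) sum(sapply(statuses, '[', x)))

# Approach 3: for loop filling a preallocated vector.
via_loop <- rep(0, 1440)
for (k in 1:1440) {
  via_loop[k] <- sum(sapply(statuses, '[', k))
}

# Compare values only; names and storage type may differ between approaches.
stopifnot(all(via_apply == via_sapply), all(via_sapply == via_loop))
```

If stopifnot raises no error, all three approaches produced the same per-minute counts for this sample.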
