
R: Conditional calculation based on values in other rows and columns

My data has the following format:
- first column: indicates whether the machine is running
- second column: total time that the machine is running

Here is the dataset:

structure(c("", "running", "running", "running", "", "", "", 
"running", "running", "", "10", "15", "30", "2", "5", "17", "47", 
"12", "57", "87"), .Dim = c(10L, 2L), .Dimnames = list(NULL, 
    c("c", "v")))

I would like to add a third column that gives the total time the machine has been running, obtained by summing all the times since the machine started to run. The value should be 0 in rows where the machine is not running. Here is the desired output:

 [1,] ""        "10" "0"   
 [2,] "running" "15" "15"  
 [3,] "running" "30" "45"  
 [4,] "running" "2"  "47"  
 [5,] ""        "5"  "0"   
 [6,] ""        "17" "0"   
 [7,] ""        "47" "0"   
 [8,] "running" "12" "12"  
 [9,] "running" "57" "69"  
[10,] ""        "87" "0" 

I tried to write some R code to do this in an elegant way, but my programming skills are too limited at the moment. Does anybody know a solution to this problem? Thank you in advance!

First, we transform your data into a more appropriate data structure, one that can hold mixed data types (a character matrix stores the times as strings, whereas a data frame can keep them as numbers):

m <- structure(c("", "running", "running", "running", "", "", "", 
                 "running", "running", "", "10", "15", "30", "2", "5", "17", "47", 
                 "12", "57", "87"), .Dim = c(10L, 2L), .Dimnames = list(NULL, 
                                                                        c("c", "v")))
DF <- as.data.frame(m, stringsAsFactors = FALSE)
DF[] <- lapply(DF, type.convert, as.is = TRUE)
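To see what this conversion does, it can help to check the column types before and after. The following is a minimal sketch in base R that rebuilds the matrix from the question with `matrix()` instead of `structure()`; the contents are the same.

```r
# Rebuild the character matrix from the question (filled column by column).
m <- matrix(c("", "running", "running", "running", "", "", "",
              "running", "running", "",
              "10", "15", "30", "2", "5", "17", "47", "12", "57", "87"),
            ncol = 2, dimnames = list(NULL, c("c", "v")))

# A matrix holds a single type, so the times are stored as strings here.
is.character(m)

# Convert to a data frame, then let type.convert() pick a type per column.
DF <- as.data.frame(m, stringsAsFactors = FALSE)
DF[] <- lapply(DF, type.convert, as.is = TRUE)

# Column "c" stays character, column "v" becomes integer.
sapply(DF, class)
```

With `as.is = TRUE`, character columns are left as character rather than being turned into factors, which keeps the later comparison `c == ""` straightforward.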

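Then we can do this easily with the data.table package. `rleid(c)` assigns a new group id to each run of consecutive identical values in `c`, so the cumulative sum restarts whenever the machine starts or stops: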

library(data.table)
setDT(DF)
DF[, total := cumsum(v), by = rleid(c)]
DF[c == "", total := 0]
#          c  v total
# 1:         10     0
# 2: running 15    15
# 3: running 30    45
# 4: running  2    47
# 5:          5     0
# 6:         17     0
# 7:         47     0
# 8: running 12    12
# 9: running 57    69
#10:         87     0
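For readers without data.table, the run ids that `rleid()` produces can be approximated in base R with `rle()`. This is a sketch of the idea, not data.table's implementation; the vector names `c_col`, `v`, and `run_id` are chosen here for illustration.

```r
c_col <- c("", "running", "running", "running", "", "", "",
           "running", "running", "")
v <- c(10L, 15L, 30L, 2L, 5L, 17L, 47L, 12L, 57L, 87L)

# rle() finds runs of identical consecutive values; repeating the run
# index by each run's length gives one id per row.
r <- rle(c_col)
run_id <- rep(seq_along(r$lengths), r$lengths)
run_id
# 1 2 2 2 3 3 3 4 4 5

# Cumulative sum within each run, then reset the non-running rows to 0.
total <- ave(v, run_id, FUN = cumsum)
total[c_col == ""] <- 0
total
# 0 15 45 47 0 0 0 12 69 0
```

Because every run gets its own id, the second block of "running" rows starts again from its own first value (12) rather than continuing from 47.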

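Here is a simple solution using base R. `cumsum(DF$c == "")` increases by one at every non-running row, so each uninterrupted block of "running" rows shares a single group key: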

DF$total <- ave(DF$v, DF$c, cumsum(DF$c == ""), FUN = cumsum)
DF$total[DF$c == ""] <- 0

> DF
         c  v total
1          10     0
2  running 15    15
3  running 30    45
4  running  2    47
5           5     0
6          17     0
7          47     0
8  running 12    12
9  running 57    69
10         87     0
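Printing the grouping key on its own may make the `ave()` call easier to follow. The sketch below uses plain vectors (`c_col`, `v`, and `key` are names introduced here) rather than the data frame.

```r
c_col <- c("", "running", "running", "running", "", "", "",
           "running", "running", "")
v <- c(10L, 15L, 30L, 2L, 5L, 17L, 47L, 12L, 57L, 87L)

# The key goes up by one at each non-running row and stays constant
# across the running rows that follow it.
key <- cumsum(c_col == "")
key
# 1 1 1 1 2 3 4 4 4 5

# Grouping by both c_col and key keeps row 1 ("" with key 1) separate
# from rows 2-4 ("running" with key 1).
total <- ave(v, c_col, key, FUN = cumsum)
total[c_col == ""] <- 0
total
# 0 15 45 47 0 0 0 12 69 0
```

Grouping by the key alone would put a non-running row and the running block after it into the same group, so the block's sums would include that row's time; adding `c_col` as a second grouping variable avoids this.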

We could also use dplyr, with the same grouping idea:

library(dplyr)
DF %>%
  group_by(cumsum(c == ""), c) %>%
  mutate(total = replace(cumsum(v), c == "", 0))
