R - Cumulative sum of absolute differences based on other column and row
My data has the following format (see the dataset below):
structure(c("", "", "running", "running", "running", "running",
"running", "running", "running", "", "504", "678", "268", "475",
"675", "796", "745", "693", "665", "488"), .Dim = c(10L, 2L), .Dimnames = list(
NULL, c("e", "f")))
I would like to add a third column that gives the total distance the machine has covered while running, computed by summing all the absolute differences in f since the machine started to run. The desired output is shown below:
e f Output
[1,] "" "504" ""
[2,] "" "678" ""
[3,] "running" "268" "0"
[4,] "running" "475" "207"
[5,] "running" "675" "407"
[6,] "running" "796" "528"
[7,] "running" "745" "579"
[8,] "running" "693" "631"
[9,] "running" "665" "659"
[10,] "" "488" ""
I tried to write some R code to do this in an elegant way, but my programming skills are too limited for the moment. Does anybody know a solution to this problem? Thank you in advance!
Actually, I have to confess I did not fully understand your question. Do you mean something like this?
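# Rebuild the example data as a data frame (X1 = state, X2 = position)
a <- c("", "", "running", "running", "running", "running",
"running", "running", "running", "", "504", "678", "268", "475",
"675", "796", "745", "693", "665", "488")
a <- data.frame(matrix(a, ncol=2), stringsAsFactors=FALSE)
a[,2] <- as.numeric(a[,2])
# Keep only the rows where the machine is running
subs.a <- subset(a, X1=='running')
# Absolute step-to-step differences, then their running total
abs(diff(subs.a$X2))
cumsum(abs(diff(subs.a$X2)))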
This returns:
abs(diff(subs.a$X2))
[1] 207 200 121 51 52 28
cumsum(abs(diff(subs.a$X2)))
[1] 207 407 528 579 631 659
Inserting it into the data.frame (the leading 0 is for the first running row):
a[a$X1=='running', 'CumSum'] <- c(0,cumsum(abs(diff(subs.a$X2))))
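If the result should match the character matrix from the question exactly (empty strings rather than NA for the rows where the machine is not running), the same idea can be applied to the matrix directly. This is only a sketch; the names m, running, out and res are chosen here for illustration and do not appear in the question.

```r
# The matrix from the question
m <- structure(c("", "", "running", "running", "running", "running",
"running", "running", "running", "", "504", "678", "268", "475",
"675", "796", "745", "693", "665", "488"), .Dim = c(10L, 2L), .Dimnames = list(
NULL, c("e", "f")))

running <- m[, "e"] == "running"   # rows where the machine is running
f <- as.numeric(m[, "f"])          # positions as numbers

# Start with empty strings everywhere, then fill the running rows:
# 0 for the first running row, cumulative absolute differences after that
out <- rep("", nrow(m))
out[running] <- c(0, cumsum(abs(diff(f[running]))))

res <- cbind(m, Output = out)      # character matrix with a third column
res
```

Assigning the numbers into the character vector out converts them to strings, so res stays a character matrix like the input.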
This gives you the desired data frame:
df <- data.frame(structure(c("", "", "running", "running", "running", "running",
"running", "running", "running", "", "504", "678", "268", "475",
"675", "796", "745", "693", "665", "488"), .Dim = c(10L, 2L), .Dimnames = list(
NULL, c("e", "f"))), stringsAsFactors = FALSE)
runs <- df$e == "running"            # rows where the machine is running
df$f <- as.numeric(df$f)
abs_diff <- abs(diff(df$f[runs]))    # named abs_diff so it does not mask diff()
df$Output <- NA_real_                # non-running rows stay NA
df$Output[runs] <- 0                 # first running row starts at 0
df$Output[runs][-1] <- cumsum(abs_diff)
You get:
e f Output
1 504 NA
2 678 NA
3 running 268 0
4 running 475 207
5 running 675 407
6 running 796 528
7 running 745 579
8 running 693 631
9 running 665 659
10 488 NA
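The example data contains a single block of "running" rows. If the machine could start and stop several times, the total would presumably need to restart at 0 for each block. A possible base R sketch follows, assuming that interpretation; the data frame df2 and its values are made up for illustration, and run_id is a helper name introduced here.

```r
# Hypothetical data with two separate running blocks (values are invented)
df2 <- data.frame(
  e = c("", "running", "running", "running", "", "running", "running"),
  f = c(500, 100, 150, 120, 300, 200, 260),
  stringsAsFactors = FALSE
)

running <- df2$e == "running"

# Start a new block id every time the running status changes
run_id <- cumsum(c(TRUE, diff(running) != 0))

# Within each block: 0 for the first row, then cumulative absolute differences
df2$Output <- ave(df2$f, run_id,
                  FUN = function(x) c(0, cumsum(abs(diff(x)))))

# Rows where the machine is not running get NA
df2$Output[!running] <- NA
df2$Output
# [1] NA  0 50 80 NA  0 60
```

ave() applies the function separately within each block and returns the results in the original row order, so no explicit split and recombine step is needed.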