I'd like to sum a value in a given column for each unique combination of two other columns:
For example, I'd like to transform the following data frame from:
Week Day Value
1 1 1
1 2 3
1 3 4
2 1 2
2 2 2
2 3 3
to:
Week Day Value Sum
1 1 1 1
1 2 3 4
1 3 4 8
2 1 2 2
2 2 2 4
2 3 3 7
I think a for loop would do what I want, but I am completely lost at this point. Any and all help appreciated.
In base R, you can use ave():
x <- read.table(header = TRUE, text = "
Week Day Value
1 1 1
1 2 3
1 3 4
2 1 2
2 2 2
2 3 3
")
x$Sum <- ave(x$Value, x$Week, FUN=cumsum)
> x
Week Day Value Sum
1 1 1 1 1
2 1 2 3 4
3 1 3 4 8
4 2 1 2 2
5 2 2 2 4
6 2 3 3 7
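For reference, ave() accepts more than one grouping vector, so the same call pattern also covers the "unique combination of two columns" case from the question. A minimal sketch, assuming the same example data (rebuilt here so the block runs on its own); the column names CumSum and SumWeekDay are made up for illustration:

```r
# Rebuild the example data so this block is self-contained.
x <- read.table(header = TRUE, text = "
Week Day Value
1 1 1
1 2 3
1 3 4
2 1 2
2 2 2
2 3 3
")

# Running total within each Week (the output shown in the question).
x$CumSum <- ave(x$Value, x$Week, FUN = cumsum)

# Total for each unique Week/Day combination. Every combination occurs
# exactly once in this data, so the result simply equals Value.
x$SumWeekDay <- ave(x$Value, x$Week, x$Day, FUN = sum)
```

Because ave() returns a vector as long as its input, both results can be assigned straight back as new columns without reordering or merging.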
I suggest trying dplyr, which is quite a workhorse for data manipulation. From the desired output, it looks like you want a cumulative sum within each Week.
df = read.table(text="Week Day Value
1 1 1
1 2 3
1 3 4
2 1 2
2 2 2
2 3 3", header = TRUE)
library(dplyr)
df %>% group_by(Week) %>% mutate(Sum = cumsum(Value))
# you get
Source: local data frame [6 x 4]
Groups: Week
Week Day Value Sum
1 1 1 1 1
2 1 2 3 4
3 1 3 4 8
4 2 1 2 2
5 2 2 2 4
6 2 3 3 7
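Or you could try data.table, another tool that works well on larger data because it is fast and memory efficient.
library(data.table)
# setDT() converts df to a data.table by reference; := adds the column in place.
setDT(df)[, Sum := cumsum(Value), by = Week][]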
Week Day Value Sum
1: 1 1 1 1
2: 1 2 3 4
3: 1 3 4 8
4: 2 1 2 2
5: 2 2 2 4
6: 2 3 3 7
Actually, for loops are probably a bad way of looking at this; they're not very efficient on data frames. Instead I'd recommend data.table:
library(data.table)
# Turn the data frame into a data.table.
dt <- data.table(df)
# Sum Value for each unique combination of Week and Day.
dt <- dt[, list(value_sum = sum(Value)), by = c("Week", "Day")]
Your actual example seems to just sum for each unique week, in which case drop "Day" from "by".
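To make the difference concrete: grouping with sum() collapses each group to a single total, while the output requested in the question keeps every row and carries a running total. A minimal sketch in base R rather than data.table (a swapped-in technique, chosen only so the block has no package dependencies); the names totals and running are hypothetical:

```r
# Rebuild the example data so this block is self-contained.
df <- read.table(header = TRUE, text = "
Week Day Value
1 1 1
1 2 3
1 3 4
2 1 2
2 2 2
2 3 3
")

# One row per Week: the grouped total.
totals <- aggregate(Value ~ Week, data = df, FUN = sum)

# One value per original row: the running total within each Week.
running <- ave(df$Value, df$Week, FUN = cumsum)
```

Here totals has two rows (one per week), while running has six values, one for each row of df, which is the shape the question asks for.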