I'm having a beginner's issue aggregating data by category: I want to create a new column holding the sum of each category's values, repeated for every observation in that category.
I'd like the following data:
PIN Balance
221 5000
221 2000
221 1000
554 4000
554 4500
643 6000
643 4000
To look like:
PIN Balance Total
221 5000 8000
221 2000 8000
221 1000 8000
554 4000 8500
554 4500 8500
643 6000 10000
643 4000 10000
I've tried using aggregate: output <- aggregate(df$Balance ~ df$PIN, data = df, sum) but haven't been able to get the result back into my original dataset, because the number of observations no longer matches.
You can use dplyr to do what you want. We first group_by PIN and then use mutate to create a new column Total that is the sum of Balance within each group:
library(dplyr)
res <- df %>% group_by(PIN) %>% mutate(Total=sum(Balance))
Using your data as a data frame df:
df <- structure(list(PIN = c(221L, 221L, 221L, 554L, 554L, 643L, 643L
), Balance = c(5000L, 2000L, 1000L, 4000L, 4500L, 6000L, 4000L
)), .Names = c("PIN", "Balance"), class = "data.frame", row.names = c(NA,
-7L))
## PIN Balance
##1 221 5000
##2 221 2000
##3 221 1000
##4 554 4000
##5 554 4500
##6 643 6000
##7 643 4000
We get the expected result:
print(res)
##Source: local data frame [7 x 3]
##Groups: PIN [3]
##
## PIN Balance Total
## <int> <int> <int>
##1 221 5000 8000
##2 221 2000 8000
##3 221 1000 8000
##4 554 4000 8500
##5 554 4500 8500
##6 643 6000 10000
##7 643 4000 10000
Or we can use data.table:
library(data.table)
setDT(df)[, Total := sum(Balance), by = PIN][]
## PIN Balance Total
##1: 221 5000 8000
##2: 221 2000 8000
##3: 221 1000 8000
##4: 554 4000 8500
##5: 554 4500 8500
##6: 643 6000 10000
##7: 643 4000 10000
Consider a base R solution using a conditional sum with sapply():
df <- read.table(text="PIN Balance
221 5000
221 2000
221 1000
554 4000
554 4500
643 6000
643 4000", header=TRUE)
df$Total <- sapply(seq(nrow(df)), function(i){
sum(df[df$PIN == df$PIN[i], c("Balance")])
})
# PIN Balance Total
# 1 221 5000 8000
# 2 221 2000 8000
# 3 221 1000 8000
# 4 554 4000 8500
# 5 554 4500 8500
# 6 643 6000 10000
# 7 643 4000 10000
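Another base R option, offered as a sketch rather than a definitive answer, is ave(), which computes a per-group statistic and returns a vector as long as the input, so no per-row loop is needed. The same block also shows how the aggregate() attempt from the question could be completed by merging the per-PIN totals back onto the original rows. The names totals and merged are just illustrative:

```r
# Same sample data as in the question
df <- data.frame(
  PIN     = c(221L, 221L, 221L, 554L, 554L, 643L, 643L),
  Balance = c(5000L, 2000L, 1000L, 4000L, 4500L, 6000L, 4000L)
)

# ave() applies FUN within each PIN group and returns one value per row,
# so the result can be assigned straight to a new column
df$Total <- ave(df$Balance, df$PIN, FUN = sum)

# Alternative: aggregate to one row per PIN, then merge back by PIN
totals <- aggregate(Balance ~ PIN, data = df, FUN = sum)
names(totals)[names(totals) == "Balance"] <- "Total"
merged <- merge(df[c("PIN", "Balance")], totals, by = "PIN")
```

aggregate() returns one row per PIN, which is why its output cannot be assigned directly as a column; merge() repeats each total for every matching row. Note that merge() may reorder rows by the key, whereas ave() keeps the original row order.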