Calculations between two columns in a data frame in R
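I want to create a new column, expected, based on two other columns. The new column is a running total within each product: add the values in column const and subtract the values in column value.
My data: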
df<-data.frame(product = rep(c('A','B'),each=4), data = seq(as.Date("2020-01-01"), as.Date("2020-01-04"), by = "day"),
value = c(10, 15, 0, 5, 20, 5, 10, 0), const = c(100, 0, 10, 0, 100, 0, 0, 10),
expected = c(90, 75, 85, 80, 80, 75, 65, 75))
> df
product data value const expected
1 A 2020-01-01 10 100 90
2 A 2020-01-02 15 0 75
3 A 2020-01-03 0 10 85
4 A 2020-01-04 5 0 80
5 B 2020-01-01 20 100 80
6 B 2020-01-02 5 0 75
7 B 2020-01-03 10 0 65
8 B 2020-01-04 0 10 75
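To make the rule concrete, the arithmetic for product A can be walked through row by row. This is only a sketch of how the expected column appears to be built; the names `balance` and `running` are illustrative and do not come from the question.

```r
# Values for product A, copied from the data above
value <- c(10, 15, 0, 5)
const <- c(100, 0, 10, 0)

# Walk through the rows: add const, subtract value, carry the balance forward
running <- numeric(length(value))
balance <- 0
for (i in seq_along(value)) {
  balance <- balance + const[i] - value[i]
  running[i] <- balance
}
running
# 90 75 85 80
```

The result matches the first four rows of expected, which suggests the column is a cumulative sum of const - value that restarts for each product.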
Edited data:
library(dplyr)
# Five rows, so that product and data match the five values below
TD<-data.frame(product = rep("A",5), data = seq(as.Date("2020-01-01"), as.Date("2020-01-05"), by = "day"),
value = c(15, 1, 2, 1, 0), value2 = c(10, 0, 10, 0, 100))
TD <- TD %>% group_by(product) %>% mutate(expected1 = cumsum(value2) - cumsum(value))
TD
product data value value2 expected1
<fct> <date> <dbl> <dbl> <dbl>
1 A 2020-01-01 15 10 -5
2 A 2020-01-02 1 0 -6
3 A 2020-01-03 2 10 2
4 A 2020-01-04 1 0 1
5 A 2020-01-05 0 100 101
TD_expected
product data value value2 expected1
1 A 2020-01-01 15 10 -5
2 A 2020-01-02 1 0 -6
3 A 2020-01-03 2 10 8
4 A 2020-01-04 1 0 7
5 A 2020-01-05 0 100 107
Note: when value2 is greater than value, we assign value2 to expected.
You can use ave and cumsum.
df$expected <- ave(df$const - df$value, df$product, FUN=cumsum)
df
# product data value const expected
#1 A 2020-01-01 10 100 90
#2 A 2020-01-02 15 0 75
#3 A 2020-01-03 0 10 85
#4 A 2020-01-04 5 0 80
#5 B 2020-01-01 20 100 80
#6 B 2020-01-02 5 0 75
#7 B 2020-01-03 10 0 65
#8 B 2020-01-04 0 10 75
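For readers who have not used ave() before, a small sketch on toy data may help. The vectors `x` and `g` below are made up for illustration and are not part of the question.

```r
# Toy data (hypothetical, not from the question)
x <- c(1, 2, 3, 10, 20)
g <- c("a", "b", "a", "b", "a")

# ave() applies FUN within each level of g and puts the results back
# in the original row order, so the output lines up with x
res <- ave(x, g, FUN = cumsum)
res
# 1 2 4 12 24
```

Because the result keeps the original order and length, it can be assigned straight to a new column, which is what the answer above does with df$expected.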
You can take the cumsum of const and of value by group, and then subtract one from the other.
library(dplyr)
df %>% group_by(product) %>% mutate(expected1 = cumsum(const) - cumsum(value))
# product data value const expected expected1
# <fct> <date> <dbl> <dbl> <dbl> <dbl>
#1 A 2020-01-01 10 100 90 90
#2 A 2020-01-02 15 0 75 75
#3 A 2020-01-03 0 10 85 85
#4 A 2020-01-04 5 0 80 80
#5 B 2020-01-01 20 100 80 80
#6 B 2020-01-02 5 0 75 75
#7 B 2020-01-03 10 0 65 65
#8 B 2020-01-04 0 10 75 75
In base R, this can be done with:
df$expected1 <- with(df, ave(const, product, FUN = cumsum) -
ave(value, product, FUN = cumsum))
and with data.table:
library(data.table)
setDT(df)[, expected1 := cumsum(const) - cumsum(value), product]
Edit
For the updated data, we can create a new grouping variable and follow the same procedure.
TD %>%
group_by(product, group = cumsum(value2 > value)) %>%
mutate(expected1 = cumsum(value2) - cumsum(value)) %>%
ungroup() %>%
select(-group)
# product data value value2 expected1
# <fct> <date> <dbl> <dbl> <dbl>
#1 A 2020-01-01 15 10 -5
#2 A 2020-01-02 1 0 -6
#3 A 2020-01-03 2 10 8
#4 A 2020-01-04 1 0 7
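The grouping step can be inspected on its own. The sketch below uses base R instead of dplyr and assumes the five-row version of TD that is printed in the question; since every row belongs to product A, the product grouping is left out.

```r
# Five rows of TD as printed in the question (assumption: all product "A")
value  <- c(15, 1, 2, 1, 0)
value2 <- c(10, 0, 10, 0, 100)

# A new group starts on every row where value2 exceeds value
group <- cumsum(value2 > value)
group
# 0 0 1 1 2

# Running total of value2 - value, restarted within each group
expected1 <- ave(value2 - value, group, FUN = cumsum)
expected1
# -5 -6 8 7 100
```

Under this grouping the fifth row also opens a new group, so it comes out as 100 rather than the 107 listed in TD_expected. Whether that row should restart the total depends on how the rule in the question is read.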
We can also do this in the tidyverse with a single cumsum, similar to the ave option in @GKi's post:
library(dplyr)
df %>%
group_by(product) %>%
mutate(expected1 = cumsum(const - value))
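The single cumsum and the difference of two cumsums give the same numbers, because a cumulative sum of differences equals the difference of the cumulative sums. A quick check, sketched here with base R ave() rather than dplyr and with the illustrative names `single` and `double`:

```r
# Data from the question (date column omitted, it is not needed here)
df <- data.frame(
  product = rep(c("A", "B"), each = 4),
  value = c(10, 15, 0, 5, 20, 5, 10, 0),
  const = c(100, 0, 10, 0, 100, 0, 0, 10)
)

# One cumsum over the row-wise difference
single <- ave(df$const - df$value, df$product, FUN = cumsum)

# Two cumsums, subtracted afterwards
double <- ave(df$const, df$product, FUN = cumsum) -
  ave(df$value, df$product, FUN = cumsum)

all.equal(single, double)
# TRUE
```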
Here is a base R solution that applies ave() and cumsum() to obtain expected.
For df:
dfs <- split(df,df$product)
df <- Reduce(rbind,lapply(dfs, function(x) {
within(x, expected <- ave(const-value,
ave(const-value,
cumsum(const>value),FUN = cumsum)>0,FUN = cumsum))
}))
such that
> df
product data value const expected
1 A 2020-01-01 10 100 90
2 A 2020-01-02 15 0 75
3 A 2020-01-03 0 10 85
4 A 2020-01-04 5 0 80
5 B 2020-01-01 20 100 80
6 B 2020-01-02 5 0 75
7 B 2020-01-03 10 0 65
8 B 2020-01-04 0 10 75
For TD, you can use:
TDs <- split(TD,TD$product)
TD <- Reduce(rbind,lapply(TDs, function(x) {
within(x, expected <- ave(value2-value,
ave(value2-value,
cumsum(value2>value),FUN = cumsum)>0,FUN = cumsum))
}))
such that
> TD
product data value value2 expected
1 A 2020-01-01 15 10 -5
2 A 2020-01-02 1 0 -6
3 A 2020-01-03 2 10 8
4 A 2020-01-04 1 0 7
5 A 2020-01-05 0 100 107
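The nested ave() call is dense, so it may help to unpack it into steps. The sketch below does this for the single product in TD; the intermediate names `d`, `inner` and `positive` are illustrative and are not part of the answer.

```r
# Five rows of TD as printed in the question (all product "A")
value  <- c(15, 1, 2, 1, 0)
value2 <- c(10, 0, 10, 0, 100)
d <- value2 - value

# Step 1: running total of d, restarted wherever value2 > value
inner <- ave(d, cumsum(value2 > value), FUN = cumsum)
inner
# -5 -6 8 7 100

# Step 2: flag the rows whose restarted total is positive
positive <- inner > 0

# Step 3: cumulative sum of d, taken separately within each flag group
expected <- ave(d, positive, FUN = cumsum)
expected
# -5 -6 8 7 107
```

The two rows with a negative restarted total accumulate separately from the three rows with a positive one, which is how the last row reaches 107.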