Use a previously defined column in data.table column assignment

Suppose I have a data.table with information on income, hours worked, and the id of an individual.
I want to calculate the income per hour (iph) and then, for each individual, the change in income per hour over time (iphd).

In the final data.table I want to store both variables, iph and iphd.

library(data.table)

data <- data.table(
  income = c(100, 120, 140, 205, 200, 220),
  hours =  c( 10,  11,  12,  18,  17,  21),
  id =     c(  1,   1,   1,   2,   2,   2)
)

(data
  [, iph := income / hours]
  [, iphd := c(NA, diff(iph)), by = id])[]

Being used to base R's within function, I would like to access iph right after its definition in the same expression. Something like:

# Trial no. 1
data[,
     `:=`(
       iph := income / hours,
       iphd := c(NA, diff(iph))),
     by = id][]

# Trial no. 2
data[, `:=`({
  iph = income / hours
  iphd = c(NA, diff(iph))
}), by = id][]

# Trial no. 3
data[, .({
  iph = income / hours
  iphd = c(NA, diff(iph))
}), by = id][]

However, none of these attempts works.
Is there a way to do this other than the two-step approach shown above?
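A likely reason the attempts fail: in the functional form of `:=`, every right-hand side is evaluated before any column is assigned, so iph is not yet visible when iphd is computed. The sketch below illustrates this with a cleaned-up variant of Trial no. 1 (named arguments with `=` instead of a nested `:=`). It assumes a fresh table with no existing iph column and no iph variable in the calling environment; tryCatch() is used only to capture the error instead of stopping.

```r
library(data.table)

data <- data.table(
  income = c(100, 120, 140, 205, 200, 220),
  hours  = c( 10,  11,  12,  18,  17,  21),
  id     = c(  1,   1,   1,   2,   2,   2)
)

# Functional form of := with named arguments. Both right-hand sides are
# evaluated before assignment, so iph cannot be found while computing iphd.
result <- tryCatch(
  data[, `:=`(iph  = income / hours,
              iphd = c(NA, diff(iph))), by = id],
  error = function(e) e
)

failed <- inherits(result, "error")
print(failed)
if (failed) print(conditionMessage(result))
```

If an iph column had been left over from an earlier run, the same call would silently use that older column rather than fail, which is worth keeping in mind when experimenting interactively.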

Calculate both inside {...} and return the results in a list:

data[, c("iph", "iphd") := {
  iph <- income / hours
  iphd <- c(NA, diff(iph))
  list(iph,iphd)
}, by = id][]

#    income hours id      iph       iphd
# 1:    100    10  1 10.00000         NA
# 2:    120    11  1 10.90909  0.9090909
# 3:    140    12  1 11.66667  0.7575758
# 4:    205    18  2 11.38889         NA
# 5:    200    17  2 11.76471  0.3758170
# 6:    220    21  2 10.47619 -1.2885154
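
As a quick sanity check, the sketch below compares this single-call version against the two-step approach from the question. The helper make_data() is hypothetical, introduced here only to build a fresh copy of the example table for each variant.

```r
library(data.table)

# Hypothetical helper: returns a fresh copy of the example table.
make_data <- function() data.table(
  income = c(100, 120, 140, 205, 200, 220),
  hours  = c( 10,  11,  12,  18,  17,  21),
  id     = c(  1,   1,   1,   2,   2,   2)
)

# Two-step approach: add iph first, then iphd by id.
two_step <- make_data()
two_step[, iph := income / hours][, iphd := c(NA, diff(iph)), by = id]

# Single call: compute both inside braces and return them as a list.
one_step <- make_data()
one_step[, c("iph", "iphd") := {
  iph <- income / hours
  list(iph, c(NA, diff(iph)))
}, by = id]

# Compare the two resulting tables.
same <- isTRUE(all.equal(two_step, one_step))
print(same)
```

Inside the braces, iph is an ordinary local variable for the current group, which is why the second expression can refer to it.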

Without curly braces (this repeats the income / hours computation):

data[, c("iph", "iphd") := list(income / hours, 
                                c(NA, diff(income / hours))), by = id][]
