
Adding new column with diff() function when there is one less row in R

If I have a sample data frame like mtcars and want the difference between consecutive values of mtcars$qsec, I can use diff(mtcars$qsec). Is there a simple way to add diff(mtcars$qsec) as a new column of the original mtcars data frame? I'm finding it difficult because diff(mtcars$qsec) has one fewer element than mtcars has rows.

> head(mtcars,3)

               mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4     21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag 21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
Datsun 710    22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1

Here are two approaches. Both put an NA in the first row of diff_qsec and put diff(qsec) in the remaining rows:

library(dplyr)
# dplyr::lag() shifts the values down by one position; it masks stats::lag(), which is meant for time series
mtcars %>% mutate(diff_qsec = qsec - lag(qsec))

transform(mtcars, diff_qsec = c(NA, diff(qsec)))

Also, on the general issue of padding see: How can I pad a vector with NA from the front?
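To make the length mismatch and the padding concrete, here is a minimal base R sketch. The helper name pad_diff and the copy df are hypothetical names introduced only for illustration, and the helper assumes lag is smaller than the length of the vector:

```r
x <- mtcars$qsec

# diff() returns one fewer element than its input (31 values for 32 rows),
# so it cannot be assigned to a column directly.
length(diff(x))
nrow(mtcars)

# pad_diff is a hypothetical helper, not part of base R or dplyr.
# It pads the front with `lag` NAs so the result lines up with the rows.
# Assumes lag < length(x).
pad_diff <- function(x, lag = 1L) {
  c(rep(NA, lag), diff(x, lag = lag))
}

# Work on a copy so the built-in dataset is left untouched.
df <- mtcars
df$diff_qsec <- pad_diff(df$qsec)
head(df[, c("qsec", "diff_qsec")], 3)
```

With lag = 2 the same helper would pad two NAs, which is the general pattern: pad as many leading rows as diff() drops.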

You could use the base function within() like so:

mtcars <- within(mtcars, difference <- c(NA,diff(qsec)))

This creates a column called "difference" with the first element NA and the rest calculated by diff(qsec).

You could create more columns at the same time by wrapping commands in {}, such as:

mtcars <- within(mtcars, {difference <- c(NA,diff(qsec))
                         multiple <- qsec*2})

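Note that in the single-expression form you must use <- for the assignment and not =, because = there would be parsed as a named argument to within() rather than as an assignment. Inside {} either operator works, but <- keeps the two forms consistent.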
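As a quick sanity check on the within() approach, the sketch below confirms that the new column lines up with the original rows. The name cars is a hypothetical copy used so that mtcars itself is not overwritten, and columns are looked up by name rather than by position:

```r
# Add both columns to a copy of mtcars in one call.
cars <- within(mtcars, {
  difference <- c(NA, diff(qsec))
  multiple   <- qsec * 2
})

# The row count is unchanged, the first difference is the NA padding,
# and the remaining values match diff() on the original column.
nrow(cars) == nrow(mtcars)
is.na(cars$difference[1])
all.equal(cars$difference[-1], diff(mtcars$qsec))
```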


 