I have an R data.frame with the following data:
# A tibble: 21 x 57
# Groups: section [21]
section `1965` `1966` `1967` `1968` `1969`
<fct> <int> <int> <int> <int> <int>
1 A 3 63 114 173 257
2 B 2 88 114 147 169
3 C 26 708 892 1101 1339
4 D 1 16 16 20 77
In the complete data.frame the columns run from 1965 to 2020 and each row is a section, A to U.
I would like to add new columns to the right with the difference between successive columns: 1966-1965 for each section (row), then 1967-1966 for each row, 1968-1967, and so on until 2020-2019 as the last new column.
I have tried a few implementations of mutate_all() but without success.
Any suggestion is highly appreciated!
Cheers
We can transpose the data, get the diff, and transpose back before binding the result to the original:
cbind(df, t(diff(t(df[-1]))))
# section 1965 1966 1967 1968 1969 1966 1967 1968 1969
#1 A 3 63 114 173 257 60 51 59 84
#2 B 2 88 114 147 169 86 26 33 22
#3 C 26 708 892 1101 1339 682 184 209 238
#4 D 1 16 16 20 77 15 0 4 57
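The double transpose is needed because diff() on a matrix takes differences between successive rows, not columns. A minimal sketch of that behaviour, using a small made-up matrix `x` (two of the rows above, three years) rather than the full data:

```r
# Hypothetical 2 x 3 matrix: rows are sections, columns are years.
x <- matrix(c(3, 63, 114,
              2, 88, 114),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("A", "B"), c("1965", "1966", "1967")))

# diff() on a matrix works down the rows: this is row B minus row A.
row_diffs <- diff(x)

# Transposing first makes the years the rows, so diff() now subtracts
# successive years; transposing back restores one row per section.
col_diffs <- t(diff(t(x)))
col_diffs
```

The columns of `col_diffs` are labelled with the later year of each pair, which is why the output above repeats the names 1966 to 1969.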
df <- structure(list(section = c("A", "B", "C", "D"), `1965` = c(3L,
2L, 26L, 1L), `1966` = c(63L, 88L, 708L, 16L), `1967` = c(114L,
114L, 892L, 16L), `1968` = c(173L, 147L, 1101L, 20L), `1969` = c(257L,
169L, 1339L, 77L)), class = "data.frame", row.names = c("1",
"2", "3", "4"))
You can use apply to diff all the rows, then stick the result on to the right with cbind:
result <- cbind(df, t(apply(df[-1], 1, diff)))
result
#> section 1965 1966 1967 1968 1969 1966 1967 1968 1969
#> 1 A 3 63 114 173 257 60 51 59 84
#> 2 B 2 88 114 147 169 86 26 33 22
#> 3 C 26 708 892 1101 1339 682 184 209 238
#> 4 D 1 16 16 20 77 15 0 4 57
Of course, you'll want to change the names as appropriate afterwards:
names(result)[7:10] <- paste(1965:1968, 1966:1969, sep = "_")
tibble::as_tibble(result)  # as_tibble() comes from the tibble package
#> # A tibble: 4 x 10
#> section `1965` `1966` `1967` `1968` `1969` `1965_1966` `1966_1967` `1967_1968`
#> <chr> <int> <int> <int> <int> <int> <int> <int> <int>
#> 1 A 3 63 114 173 257 60 51 59
#> 2 B 2 88 114 147 169 86 26 33
#> 3 C 26 708 892 1101 1339 682 184 209
#> 4 D 1 16 16 20 77 15 0 4
#> # ... with 1 more variable: `1968_1969` <int>
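For the full 1965 to 2020 data, hard-coding the positions 7:10 and the year ranges would be error-prone. One way to avoid that, sketched here on the same small example, is to build the labels from the existing column names before binding. The helper names `years` and `diffs` are just for illustration, and the sketch assumes the first column is `section` and every other column is a year.

```r
# Rebuild the small example data from this thread.
df <- data.frame(
  section = c("A", "B", "C", "D"),
  `1965` = c(3L, 2L, 26L, 1L),
  `1966` = c(63L, 88L, 708L, 16L),
  `1967` = c(114L, 114L, 892L, 16L),
  `1968` = c(173L, 147L, 1101L, 20L),
  `1969` = c(257L, 169L, 1339L, 77L),
  check.names = FALSE  # keep the year names as they are
)

years <- names(df)[-1]               # every column except section
diffs <- t(apply(df[-1], 1, diff))   # one row of differences per section

# Label each difference "<earlier>_<later>", e.g. "1965_1966".
colnames(diffs) <- paste(head(years, -1), tail(years, -1), sep = "_")

result <- cbind(df, diffs)
names(result)
```

Because the labels come from `names(df)`, the same lines should work unchanged when the year columns extend to 2020.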
You can use c_across() in dplyr and unnest_wider() in tidyr.
library(dplyr)
library(tidyr)
df %>%
rowwise() %>%
mutate(x = list(diff(c_across(`1965`:`1969`)))) %>%
unnest_wider(x)
# # A tibble: 4 x 10
# section `1965` `1966` `1967` `1968` `1969` ...1 ...2 ...3 ...4
# <chr> <int> <int> <int> <int> <int> <int> <int> <int> <int>
# 1 A 3 63 114 173 257 60 51 59 84
# 2 B 2 88 114 147 169 86 26 33 22
# 3 C 26 708 892 1101 1339 682 184 209 238
# 4 D 1 16 16 20 77 15 0 4 57
Here is another base R option that uses a matrix product:
m <- -diag(ncol(df) - 1)
m[cbind(2:ncol(m), 1:(ncol(m) - 1))] <- 1
dfout <- cbind(df, as.matrix(df[-1]) %*% m[, -ncol(m)])
which gives
> dfout
section `1965` `1966` `1967` `1968` `1969` 1 2 3 4
1 A 3 63 114 173 257 60 51 59 84
2 B 2 88 114 147 169 86 26 33 22
3 C 26 708 892 1101 1339 682 184 209 238
4 D 1 16 16 20 77 15 0 4 57
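To see why the matrix product works: `m` has -1 on the diagonal and +1 just below it, so after dropping the last column, column j picks out (year j+1) minus (year j). The sketch below rebuilds that matrix for five year columns and checks it against a plain diff on two of the rows above. The names `n`, `D`, `x`, `by_product` and `by_diff` are illustrative only.

```r
n <- 5                            # number of year columns in the example
m <- -diag(n)                     # -1 on the diagonal
m[cbind(2:n, 1:(n - 1))] <- 1     # +1 just below the diagonal
D <- m[, -n]                      # drop the last column: n x (n - 1)

# Two rows of the example data (sections A and B).
x <- matrix(c(3, 63, 114, 173, 257,
              2, 88, 114, 147, 169),
            nrow = 2, byrow = TRUE)

by_product <- x %*% D               # difference via matrix product
by_diff    <- t(apply(x, 1, diff))  # difference via diff() on each row

all.equal(by_product, by_diff)
```

Both routes should agree; the product form does the whole computation in one matrix multiplication, at the cost of building an n x (n - 1) helper matrix.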