Most efficient way to calculate differences across all rows

Question

I have an R data.frame with the following data:

# A tibble: 21 x 57
# Groups:   section [21]
   section `1965` `1966` `1967` `1968` `1969`
   <fct>  <int>  <int>  <int>  <int>  <int>
 1 A          3     63    114    173    257
 2 B          2     88    114    147    169
 3 C         26    708    892   1101   1339
 4 D          1     16     16     20     77

In the complete data.frame the columns range from 1965 to 2020, and each row is a section, A to U.

I would like to add new columns on the right holding the differences between successive columns: 1966 minus 1965 for each section (row), then 1967 minus 1966, 1968 minus 1967, and so on, with 2020 minus 2019 as the last new column.

I have tried a few implementations of mutate_all(), but without success.

Any suggestion is highly appreciated!

Cheers

Answer 1

We can transpose the data, take the diff, and transpose the result back:

cbind(df, t(diff(t(df[-1]))))
#  section 1965 1966 1967 1968 1969 1966 1967 1968 1969
#1       A    3   63  114  173  257   60   51   59   84
#2       B    2   88  114  147  169   86   26   33   22
#3       C   26  708  892 1101 1339  682  184  209  238
#4       D    1   16   16   20   77   15    0    4   57
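
To see why the two transposes are needed: on a matrix, diff() takes differences between successive rows, not columns. A minimal sketch on a made-up 2 x 3 matrix (the name m and the reduced data are only for illustration):

```r
# Two sections, three years, taken from the example data above
m <- matrix(c(3L, 63L, 114L,
              2L, 88L, 114L),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("A", "B"), c("1965", "1966", "1967")))

# diff() on a matrix works down the rows: this is B minus A, not what we want
row_diff <- diff(m)

# Transposing first turns years into rows, so diff() subtracts successive
# years; transposing back restores one row per section
col_diff <- t(diff(t(m)))
col_diff
```

The result keeps the year of the later column as its column name, which is why the output above shows a second set of 1966 to 1969 columns.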

Data

df <- structure(list(section = c("A", "B", "C", "D"), `1965` = c(3L, 
2L, 26L, 1L), `1966` = c(63L, 88L, 708L, 16L), `1967` = c(114L, 
114L, 892L, 16L), `1968` = c(173L, 147L, 1101L, 20L), `1969` = c(257L, 
169L, 1339L, 77L)), class = "data.frame", row.names = c("1", 
"2", "3", "4"))

Answer 2

You can use apply to diff each row, then attach the result on the right with cbind:

result <- cbind(df, t(apply(df[-1], 1, diff)))
result
#>   section 1965 1966 1967 1968 1969 1966 1967 1968 1969
#> 1       A    3   63  114  173  257   60   51   59   84
#> 2       B    2   88  114  147  169   86   26   33   22
#> 3       C   26  708  892 1101 1339  682  184  209  238
#> 4       D    1   16   16   20   77   15    0    4   57

Of course, you'll want to change the names as appropriate afterwards:

names(result)[7:10] <- paste(1965:1968, 1966:1969, sep = "_")

tibble::as_tibble(result)  # requires the tibble package
#> # A tibble: 4 x 10
#>   section `1965` `1966` `1967` `1968` `1969` `1965_1966` `1966_1967` `1967_1968`
#>   <chr>    <int>  <int>  <int>  <int>  <int>       <int>       <int>       <int>
#> 1 A            3     63    114    173    257          60          51          59
#> 2 B            2     88    114    147    169          86          26          33
#> 3 C           26    708    892   1101   1339         682         184         209
#> 4 D            1     16     16     20     77          15           0           4
#> # ... with 1 more variable: `1968_1969` <int>
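
For the full 1965 to 2020 data, hardcoding 7:10 and the year ranges gets awkward. A sketch that derives the new names from the existing column names instead, assuming the first column is the label and every other column is a year in ascending order (years and d are names introduced here, and the two-row df is a stand-in for the real data):

```r
# Small stand-in for the real data
df <- data.frame(section = c("A", "B"),
                 `1965` = c(3L, 2L),
                 `1966` = c(63L, 88L),
                 `1967` = c(114L, 114L),
                 check.names = FALSE)

years <- names(df)[-1]

# Row-wise differences; apply() returns one column per input row, hence t()
d <- t(apply(df[-1], 1, diff))

# Name each difference after the pair of years it spans, e.g. "1965_1966"
colnames(d) <- paste(head(years, -1), tail(years, -1), sep = "_")

result <- cbind(df, d)
names(result)
```

This way the same lines work whether there are five year columns or fifty-six.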

Answer 3

You can use c_across() in dplyr and unnest_wider() in tidyr.

library(dplyr)
library(tidyr)

df %>%
  rowwise() %>%
  mutate(x = list(diff(c_across(`1965`:`1969`)))) %>%
  unnest_wider(x)

# # A tibble: 4 x 10
#   section `1965` `1966` `1967` `1968` `1969`  ...1  ...2  ...3  ...4
#   <chr>    <int>  <int>  <int>  <int>  <int> <int> <int> <int> <int>
# 1 A            3     63    114    173    257    60    51    59    84
# 2 B            2     88    114    147    169    86    26    33    22
# 3 C           26    708    892   1101   1339   682   184   209   238
# 4 D            1     16     16     20     77    15     0     4    57

Answer 4

Here is another base R option, which uses a matrix product:

m <- -diag(ncol(df)-1)
m[cbind(2:ncol(m),1:(ncol(m)-1))]<-1
dfout <- cbind(df,as.matrix(df[-1])%*%m[,-ncol(m)])

which gives

> dfout
  section `1965` `1966` `1967` `1968` `1969`   1   2   3   4
1       A      3     63    114    173    257  60  51  59  84
2       B      2     88    114    147    169  86  26  33  22
3       C     26    708    892   1101   1339 682 184 209 238
4       D      1     16     16     20     77  15   0   4  57
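
To make the matrix trick less opaque: m has -1 on the diagonal and 1 just below it, so each column of the product is "next year minus this year". A small sketch on a made-up 2 x 3 matrix (x, m_trim, by_product and by_shift are illustrative names), checking the product against a plain shifted subtraction:

```r
# Two sections, three years, taken from the example data above
x <- matrix(c(3, 63, 114,
              2, 88, 114),
            nrow = 2, byrow = TRUE)
n <- ncol(x)

# -1 on the diagonal, +1 on the sub-diagonal
m <- -diag(n)
m[cbind(2:n, 1:(n - 1))] <- 1

# The last column has no "next year" to pair with, so drop it
m_trim <- m[, -n]

by_product <- x %*% m_trim          # differences via matrix product
by_shift   <- x[, -1] - x[, -n]     # same differences via shifted subtraction

isTRUE(all.equal(by_product, by_shift))
```

The shifted subtraction in by_shift is not part of the answer above; it is included only as a cross-check of what the product computes.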

4 answers

Answer 1: 3 votes, 2020-07-15 17:28:18
Answer 2: 1 vote, accepted, 2020-07-15 16:11:41
Answer 3: 1 vote, 2020-07-15 16:34:25
Answer 4: 1 vote, 2020-07-15 19:24:45
