R: computing a new variable in R

Question

I have

            id_1 id_2  name  count total
          1  001  111    a     15  
          2  001  111    b      3   
          3  001  111   sum    28   28
          4  002  111    a      7  
          5  002  111    b     33
          6  002  111   sum    48   48

I would like rows that share the same id_1 and id_2 to share the same total, for example:

            id_1 id_2  name   count total
          1  001  111    a     15   28
          2  001  111    b      3   28
          3  001  111   sum    28   28
          4  002  111    a      7   48
          5  002  111    b     33   48
          6  002  111   sum    48   48

Answer 1

We can use fill from tidyr.

library(tidyr)

dat2 <- dat %>% fill(total, .direction = "up")
dat2
#   id_1 id_2 name count total
# 1    1  111    a    15    28
# 2    1  111    b     3    28
# 3    1  111  sum    28    28
# 4    2  111    a     7    48
# 5    2  111    b    33    48
# 6    2  111  sum    48    48

Data

dat <- read.table(text = "            id_1 id_2  name  count total
          1  001  111    a     15   NA
          2  001  111    b      3   NA
          3  001  111   sum    28   28
          4  002  111    a      7   NA
          5  002  111    b     33   NA
          6  002  111   sum    48   48",
                  header = TRUE, stringsAsFactors = FALSE)
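
The fill call in this answer runs upward over the whole table, which works here because each group's sum row comes last. As a minimal sketch, assuming the sum row could sit anywhere inside its group, one could group by the two id columns first and fill in both directions. The data frame `dat_shuffled` is a made-up reordering of the sample data, and `.direction = "downup"` assumes tidyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)

# Hypothetical variant of the sample data: the sum row is not last in its group
dat_shuffled <- data.frame(
  id_1  = c(1, 1, 1, 2, 2, 2),
  id_2  = c(111, 111, 111, 111, 111, 111),
  name  = c("sum", "a", "b", "a", "sum", "b"),
  count = c(28, 15, 3, 7, 48, 33),
  total = c(28, NA, NA, NA, 48, NA),
  stringsAsFactors = FALSE
)

filled <- dat_shuffled %>%
  group_by(id_1, id_2) %>%
  fill(total, .direction = "downup") %>%  # fill down, then up, within each group
  ungroup()
```

Grouping keeps a total from leaking into a neighbouring group, which an ungrouped fill cannot guarantee once the row order changes.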

Answer 2

Consider base R's ave to compute the group max (with na.rm to handle the NA values):

df$total <- ave(df$total, df$id_1, df$id_2, FUN=function(i) max(i, na.rm=TRUE))

df
#   id_1 id_2 name count total
# 1    1  111    a    15    28
# 2    1  111    b     3    28
# 3    1  111  sum    28    28
# 4    2  111    a     7    48
# 5    2  111    b    33    48
# 6    2  111  sum    48    48
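
One caveat with the group max: when a group has no recorded total at all, `max(x, na.rm = TRUE)` returns `-Inf` with a warning. The following is a minimal sketch of a guard for that case, not part of the original answer. The helper `max_or_na` and the data frame `df_demo` are hypothetical names introduced only for illustration.

```r
# Hypothetical helper: group maximum that stays NA when a group has no total
max_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else max(x, na.rm = TRUE)
}

# Made-up data: the second group (id_1 = 2) has no recorded total
df_demo <- data.frame(
  id_1  = c(1, 1, 2, 2),
  id_2  = c(111, 111, 111, 111),
  total = c(NA, 28, NA, NA)
)

# ave() applies the function per id_1/id_2 group and recycles the result
df_demo$total <- ave(df_demo$total, df_demo$id_1, df_demo$id_2, FUN = max_or_na)
```

With this guard, a group without any total keeps NA instead of receiving `-Inf`.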

Answer 3

Using zoo and data.table:

df <- read.table(text = "id_1 id_2  name  count total
            001  111    a     15  NA
                    001  111    b      3   NA
                    001  111   sum    28   28
                    002  111    a      7  NA
                    002  111    b     33   NA
                    002  111   sum    48   48",
                  header = TRUE, stringsAsFactors = FALSE)  # create data
library(zoo)         # provides na.locf
library(data.table)
# Convert df to a data.table, then carry total forward and backward within each id group
setDT(df)[, total := na.locf(na.locf(total, na.rm = FALSE), na.rm = FALSE, fromLast = TRUE),
          by = c("id_1", "id_2")]

Output:

    id_1 id_2 name count total
1:    1  111    a    15    28
2:    1  111    b     3    28
3:    1  111  sum    28    28
4:    2  111    a     7    48
5:    2  111    b    33    48
6:    2  111  sum    48    48
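
The same forward-then-backward fill can probably be written without zoo. This is a hedged sketch using `nafill` from data.table, which assumes data.table 1.12.4 or later and a numeric (integer or double) total column. The table `dt` is rebuilt here by hand so that the block runs on its own.

```r
library(data.table)

# Sample data rebuilt as a data.table; total is integer with NA gaps
dt <- data.table(
  id_1  = c(1L, 1L, 1L, 2L, 2L, 2L),
  id_2  = 111L,
  name  = c("a", "b", "sum", "a", "b", "sum"),
  count = c(15L, 3L, 28L, 7L, 33L, 48L),
  total = c(NA, NA, 28L, NA, NA, 48L)
)

# "locf" carries the last observation forward, "nocb" carries the next one backward
dt[, total := nafill(nafill(total, type = "locf"), type = "nocb"),
   by = .(id_1, id_2)]
```

Filling in both directions within each group means the position of the sum row does not matter.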

Answer 4

A simple approach using plain dplyr:

library(dplyr)

dat %>% group_by(id_1, id_2) %>% mutate(total=count[name == "sum"])

Or:

dat %>% group_by(id_1, id_2) %>% mutate(total=na.omit(total)[1])

   id_1  id_2 name  count total
  <int> <int> <chr> <int> <int>
1     1   111 a        15    28
2     1   111 b         3    28
3     1   111 sum      28    28
4     2   111 a         7    48
5     2   111 b        33    48
6     2   111 sum      48    48
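
Both dplyr variants assume that each id_1/id_2 group carries exactly one known total (one sum row). A minimal base R sketch of a sanity check to run before filling could look like this. The names `dat_check` and `n_known` are hypothetical, and the data frame repeats the sample data so the block is self-contained.

```r
# Sample data with the totals still missing on the non-sum rows
dat_check <- data.frame(
  id_1  = c(1, 1, 1, 2, 2, 2),
  id_2  = c(111, 111, 111, 111, 111, 111),
  name  = c("a", "b", "sum", "a", "b", "sum"),
  total = c(NA, NA, 28, NA, NA, 48),
  stringsAsFactors = FALSE
)

# Count the distinct non-missing totals in each id_1/id_2 group
groups  <- interaction(dat_check$id_1, dat_check$id_2, drop = TRUE)
n_known <- tapply(dat_check$total, groups,
                  function(x) length(unique(na.omit(x))))

# Stops with an error if any group has zero or several distinct totals
stopifnot(all(n_known == 1))
```

If the check fails, `na.omit(total)[1]` would silently pick the first value, so it is worth inspecting the offending groups first.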


Answer 1
3 2018-06-27 18:30:44

Answer 2
1 2018-06-27 18:39:35

Answer 3
1 2018-06-27 19:32:49

Answer 4
1 2018-06-27 19:37:03
