
How do I take a subset of a column and make it into a new column with 0s instead of NAs for missing values in R?

Say I have a column of values and another column of factors:

    vals factors
 1:   58       B
 2:   42       B
 3:   64       A
 4:   64       A
 5:   26       B
 6:   64       A
 7:   20       A
 8:   20       A
 9:   22       A
10:   29       B
11:   60       B
12:   41       A
13:   79       A
14:   82       A
15:   11       A
16:   97       A
17:    1       B
18:   29       B
19:   90       B
20:    2       A

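I want to create a new column for each factor, with the vals value kept in the rows belonging to that factor and 0 in the rows of the other factors. Done by hand, it would look like this: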

    vals factors     A
 1:   58       B     0
 2:   42       B     0
 3:   64       A    64
 4:   64       A    64
 5:   26       B     0
 6:   64       A    64
 7:   20       A    20
 8:   20       A    20
 9:   22       A    22
10:   29       B     0

And the same for every factor. How would I go about doing this? I tried the simplest approach, which was:

dt$A <-  dt[factors == "A",]$vals

But that does not work:

Error in set(x, j = name, value = value) : 
  Supplied 12 items to be assigned to 20 items of column 'A'. If you wish to 'recycle' the RHS please use rep() to make this intent clear to readers of your code.
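
The error message points at the cause: the filter factors == "A" keeps only 12 of the 20 rows, so the right-hand side is shorter than the column being assigned. Below is a minimal sketch of a length-preserving alternative. It assumes a plain data.frame named df, rebuilt here from the table above (the object in the question is a data.table called dt). Multiplying vals by the logical test keeps all 20 positions and zeroes out the non-matching rows.

```r
# Sample data copied from the table in the question (assumed to be a plain data.frame).
df <- data.frame(
  vals = c(58, 42, 64, 64, 26, 64, 20, 20, 22, 29,
           60, 41, 79, 82, 11, 97, 1, 29, 90, 2),
  factors = c("B", "B", "A", "A", "B", "A", "A", "A", "A", "B",
              "B", "A", "A", "A", "A", "A", "B", "B", "B", "A")
)

# Subsetting drops rows, which is why the original assignment fails.
n_a <- sum(df$factors == "A")   # number of rows labelled "A"

# A logical vector is coerced to 1/0 in arithmetic, so this keeps the
# full length: vals where factors is "A", 0 everywhere else.
df$A <- df$vals * (df$factors == "A")
```

The same expression should also work on the right-hand side of a data.table assignment, since it returns one value per row.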

In base R you could do:

cbind(df, as.data.frame.matrix(xtabs(vals~.,cbind(x=1:nrow(df), df))))

    vals factors  A  B
1:    58       B  0 58
2:    42       B  0 42
3:    64       A 64  0
4:    64       A 64  0
5:    26       B  0 26
6:    64       A 64  0
7:    20       A 20  0
8:    20       A 20  0
9:    22       A 22  0
10:   29       B  0 29
11:   60       B  0 60
12:   41       A 41  0
13:   79       A 79  0
14:   82       A 82  0
15:   11       A 11  0
16:   97       A 97  0
17:    1       B  0  1
18:   29       B  0 29
19:   90       B  0 90
20:    2       A  2  0
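
To see why the xtabs call produces one column per factor, it may help to split it into steps. The sketch below assumes the same sample data as in the question, stored in a plain data.frame named df; the intermediate names with_id, tab, and wide are illustrative only. Adding a unique row id x means every (x, factors) cell of the cross-tabulation sums at most one vals entry, and empty cells come out as 0.

```r
# Sample data copied from the table in the question (assumed to be a plain data.frame).
df <- data.frame(
  vals = c(58, 42, 64, 64, 26, 64, 20, 20, 22, 29,
           60, 41, 79, 82, 11, 97, 1, 29, 90, 2),
  factors = c("B", "B", "A", "A", "B", "A", "A", "A", "A", "B",
              "B", "A", "A", "A", "A", "A", "B", "B", "B", "A")
)

# Step 1: add a unique row id so that no two rows are summed together.
with_id <- cbind(x = seq_len(nrow(df)), df)

# Step 2: cross-tabulate vals by row id and factor level.
# This is the explicit form of the vals ~ . formula used in the answer.
tab <- xtabs(vals ~ x + factors, data = with_id)

# Step 3: turn the 20 x 2 table into a data.frame with columns A and B
# and bind it next to the original columns.
wide <- as.data.frame.matrix(tab)
result <- cbind(df, wide)
```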

Another way in base R is:

cbind(df, model.matrix(~0+factors, df)* df$vals)

    vals factors factorsA factorsB
1:    58       B        0       58
2:    42       B        0       42
3:    64       A       64        0
4:    64       A       64        0
5:    26       B        0       26
6:    64       A       64        0
7:    20       A       20        0
8:    20       A       20        0
9:    22       A       22        0
10:   29       B        0       29
11:   60       B        0       60
12:   41       A       41        0
13:   79       A       79        0
14:   82       A       82        0
15:   11       A       11        0
16:   97       A       97        0
17:    1       B        0        1
18:   29       B        0       29
19:   90       B        0       90
20:    2       A        2        0
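
The model.matrix approach names its columns factorsA and factorsB, whereas the question asks for columns named A and B. One possible way to get there, sketched under the assumption that df is a plain data.frame holding the sample data, is to strip the prefix from the column names before binding. The names m and out are illustrative.

```r
# Sample data copied from the table in the question (assumed to be a plain data.frame).
df <- data.frame(
  vals = c(58, 42, 64, 64, 26, 64, 20, 20, 22, 29,
           60, 41, 79, 82, 11, 97, 1, 29, 90, 2),
  factors = c("B", "B", "A", "A", "B", "A", "A", "A", "A", "B",
              "B", "A", "A", "A", "A", "A", "B", "B", "B", "A")
)

# ~ 0 + factors drops the intercept, giving one 0/1 indicator column per level.
# Multiplying by vals recycles it down each column, so row i is scaled by vals[i].
m <- model.matrix(~ 0 + factors, df) * df$vals

# Remove the "factors" prefix that model.matrix adds to each column name.
colnames(m) <- sub("^factors", "", colnames(m))

out <- cbind(df, m)
```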

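With dplyr, you can mutate new variables using an ifelse statement: if the value of factors equals the name of the new variable, take the vals value; otherwise, take zero.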

library(dplyr)

df <- df %>% mutate(A = ifelse(factors == "A", vals, 0),
                    B = ifelse(factors == "B", vals, 0))


   vals factors  A  B
1    58       B  0 58
2    42       B  0 42
3    64       A 64  0
4    64       A 64  0
5    26       B  0 26
6    64       A 64  0
7    20       A 20  0
8    20       A 20  0
9    22       A 22  0
10   29       B  0 29
11   60       B  0 60
12   41       A 41  0
13   79       A 79  0
14   82       A 82  0
15   11       A 11  0
16   97       A 97  0
17    1       B  0  1
18   29       B  0 29
19   90       B  0 90
20    2       A  2  0
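
The mutate call above names each level by hand. Since the question asks for the same treatment of every factor, here is a hedged sketch that loops over whatever levels are present, using only base R so that it runs without extra packages. It assumes df is a plain data.frame holding the sample data; the loop variable lev is illustrative.

```r
# Sample data copied from the table in the question (assumed to be a plain data.frame).
df <- data.frame(
  vals = c(58, 42, 64, 64, 26, 64, 20, 20, 22, 29,
           60, 41, 79, 82, 11, 97, 1, 29, 90, 2),
  factors = c("B", "B", "A", "A", "B", "A", "A", "A", "A", "B",
              "B", "A", "A", "A", "A", "A", "B", "B", "B", "A")
)

# Create one new column per distinct level, named after that level.
# Columns are added in order of first appearance (here B, then A).
for (lev in unique(df$factors)) {
  df[[lev]] <- ifelse(df$factors == lev, df$vals, 0)
}
```

This avoids hard-coding "A" and "B", at the cost of using level names directly as column names, which could clash with existing columns if a level happened to be called vals or factors.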

