How do I take a subset of a column and make it into a new column with 0s instead of NAs for missing values in R?

Question

Let's say I have a column of values and another column of factors:

    vals factors
 1:   58       B
 2:   42       B
 3:   64       A
 4:   64       A
 5:   26       B
 6:   64       A
 7:   20       A
 8:   20       A
 9:   22       A
10:   29       B
11:   60       B
12:   41       A
13:   79       A
14:   82       A
15:   11       A
16:   97       A
17:    1       B
18:   29       B
19:   90       B
20:    2       A

I want to create a new column for each factor with the vals value still in the corresponding row and a 0 for rows that that are for the other factors . So, to do this manually, it would look like this:

    vals factors     A
 1:   58       B     0
 2:   42       B     0
 3:   64       A    64
 4:   64       A    64
 5:   26       B     0
 6:   64       A    64
 7:   20       A    20
 8:   20       A    20
 9:   22       A    22
10:   29       B     0

and the same for all factors. How would I go about doing this? I tried the simplest way, which is to do:

dt$A <-  dt[factors == "A",]$vals

But that very predictably doesn't work:

Error in set(x, j = name, value = value) : 
  Supplied 12 items to be assigned to 20 items of column 'A'. If you wish to 'recycle' the RHS please use rep() to make this intent clear to readers of your code.

Answer 1

in base R you would do:

cbind(df, as.data.frame.matrix(xtabs(vals~.,cbind(x=1:nrow(df), df))))

    vals factors  A  B
1:    58       B  0 58
2:    42       B  0 42
3:    64       A 64  0
4:    64       A 64  0
5:    26       B  0 26
6:    64       A 64  0
7:    20       A 20  0
8:    20       A 20  0
9:    22       A 22  0
10:   29       B  0 29
11:   60       B  0 60
12:   41       A 41  0
13:   79       A 79  0
14:   82       A 82  0
15:   11       A 11  0
16:   97       A 97  0
17:    1       B  0  1
18:   29       B  0 29
19:   90       B  0 90
20:    2       A  2  0

Another way in base R will be:

cbind(df, model.matrix(~0+factors, df)* df$vals)

    vals factors factorsA factorsB
1:    58       B        0       58
2:    42       B        0       42
3:    64       A       64        0
4:    64       A       64        0
5:    26       B        0       26
6:    64       A       64        0
7:    20       A       20        0
8:    20       A       20        0
9:    22       A       22        0
10:   29       B        0       29
11:   60       B        0       60
12:   41       A       41        0
13:   79       A       79        0
14:   82       A       82        0
15:   11       A       11        0
16:   97       A       97        0
17:    1       B        0        1
18:   29       B        0       29
19:   90       B        0       90
20:    2       A        2        0

Answer 2

Using dplyr you could mutate the new variables with an ifelse statement where if the name of the factor is equal to the name of the variable, then show the vals value, else, get a zero.

library(dplyr)

df <- df %>% mutate(A = ifelse(factors == "A", vals, 0),
                    B = ifelse(factors == "B", vals, 0))


   vals factors  A  B
1    58       B  0 58
2    42       B  0 42
3    64       A 64  0
4    64       A 64  0
5    26       B  0 26
6    64       A 64  0
7    20       A 20  0
8    20       A 20  0
9    22       A 22  0
10   29       B  0 29
11   60       B  0 60
12   41       A 41  0
13   79       A 79  0
14   82       A 82  0
15   11       A 11  0
16   97       A 97  0
17    1       B  0  1
18   29       B  0 29
19   90       B  0 90
20    2       A  2  0

How do I take a subset of a column and make it into a new column with 0s instead of NAs for missing values in R?

Question

2 answers

solution1
2 2020-11-12 18:37:47

solution2
1 ACCPTED 2020-11-12 18:40:29

How do I take a subset of a column and make it into a new column with 0s instead of NAs for missing values in R?

Question

2 answers

solution1 2 2020-11-12 18:37:47

solution2 1 ACCPTED 2020-11-12 18:40:29

solution1
2 2020-11-12 18:37:47

solution2
1 ACCPTED 2020-11-12 18:40:29