How to create a table in R populated with 1s and 0s to show presence of values from another table?

Question

I'm working with data regarding people and what class of medicine they were prescribed. It looks something like this (the actual data is read in via txt file):

library(data.table)
test <- matrix(c(1,"a",1,"a",1,"b",2,"a",2,"c"),ncol=2,byrow=TRUE)
colnames(test) <- c("id","med")
test <- as.data.table(test)
test <- unique(test[, 1:2])
test

The table has about 5 million rows, 45k unique patients, and 49 unique medicines. Some patients have multiples of the same medicines, which I remove. Not all patients have every medicine. I want to make each of the 49 unique medicines into separate columns, and have each unique patient be a row, and populate the table with 1s and 0s to show if the patient has the medicine or not.

I was trying to use spread or dcast, but there's no value column. I tried to work around this by adding a column of 1s:

test$true <- rep(1, nrow(test))

And then using tidyr

library(tidyr)
test_wide <- spread(test, med, true, fill = 0)

My original data produced the error below, but I'm not sure why the example data doesn't reproduce it:

Error: `var` must evaluate to a single number or a column name, not a list

Please let me know what I can do to make this a better reproducible example. Sorry, I'm really new to this.

Answer 1

Another solution using dplyr

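library(dplyr)
# Assumes test holds only the de-duplicated id and med columns;
# table() then cross-tabulates id against med, giving 1s and 0s.
test %>% group_by(id) %>% table()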
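For comparison, here is a minimal base R sketch of the same cross-tabulation idea that also copes with duplicate rows and returns an ordinary data frame. The toy data and the names counts and presence are illustrative assumptions, not taken from the real data set.

```r
# Toy data mirroring the question: one row per prescription, duplicates allowed.
test <- data.frame(
  id  = c(1, 1, 1, 2, 2),
  med = c("a", "a", "b", "a", "c")
)

# Cross-tabulate patients against medicines; cells hold prescription counts.
counts <- table(id = test$id, med = test$med)

# Collapse counts to presence/absence, so a count above 1 still becomes 1.
presence <- as.data.frame.matrix((counts > 0) * 1L)

# Patient ids end up as row names; move them into an ordinary column.
presence <- data.frame(id = rownames(presence), presence,
                       row.names = NULL, check.names = FALSE)
presence
```

Because the counts are thresholded with > 0, the duplicates do not need to be removed beforehand.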

Answer 2

It looks like you are trying to do one-hot encoding here. For this, please refer to the "onehot" package.

Code for reference:

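library(onehot)
test <- matrix(c(1,"a",1,"a",1,"b",2,"a",2,"c"),ncol=2,byrow=TRUE)
colnames(test) <- c("id","med")
# stringsAsFactors = TRUE so med becomes a factor (the default is FALSE since R 4.0)
test <- as.data.frame(test, stringsAsFactors = TRUE)

str(test)
# convert via character so the original ids are kept rather than factor level codes
test$id <- as.numeric(as.character(test$id))
str(test)
encoder <- onehot(test)
finaldata <- predict(encoder,test)
finaldata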

Make sure that all the columns you want encoded are of type factor. Also, I have taken the liberty of changing the data.table to a data.frame.
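Since the question starts from a data.table, a dcast-based sketch may also be worth trying. It assumes the data.table package is installed and uses the toy data from the question; the name wide is illustrative. No extra value column is needed, because fun.aggregate = length counts the rows in each id/med cell.

```r
library(data.table)

# Toy data mirroring the question, with duplicate prescriptions removed.
test <- data.table(
  id  = c(1, 1, 1, 2, 2),
  med = c("a", "a", "b", "a", "c")
)
test <- unique(test)

# One row per id, one column per med. After unique(), each cell count is
# either 1 (present) or 0 (absent); fill = 0L covers combinations that
# never occur in the data.
wide <- dcast(test, id ~ med, value.var = "med",
              fun.aggregate = length, fill = 0L)
wide
```

On the real data this would give one row per patient and one column per medicine, though it has only been checked here against the five-row example.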

2 answers

solution1
0 2018-07-19 18:18:46

solution2
-1 2018-07-19 17:54:21