I want to take a factor column in a data frame, create a new column for each of its levels, and populate each new column with the corresponding values from the original data frame.
Here is a sample. In this case, I want to create a new column for each level of the the.name factor column, like so:
Original dataframe:
symbol the.name cn
SYM1 ABC 1
SYM2 ABC 2
SYM1 DEF 3
SYM2 DEF 4
Resulting dataframe:
symbol ABC DEF
SYM1 1 3
SYM2 2 4
How can this be done?
EDIT: I have tried to achieve this using an sapply loop with split by the column and then rbinding the results. However, I have not gotten it to work, and I chose not to add it to this question as it would only generate noise. I'm pretty sure that method is not correct and can be considerably improved.
This is a job for dcast from the package reshape2:
> dcast(df, symbol~the.name, value.var="cn")
symbol ABC DEF
1 SYM1 1 3
2 SYM2 2 4
This is a reshaping task (from long to wide data). The package reshape2 has some great utilities to do this.
txt="symbol the.name cn
SYM1 ABC 1
SYM2 ABC 2
SYM1 DEF 3
SYM2 DEF 4"
tmp <- read.table(text=txt, header=TRUE)
library(reshape2)
dcast(tmp, symbol ~ the.name, value.var = "cn") ## as easy as that; value.var names the column that fills the cells
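If installing reshape2 is not an option, base R's reshape() should be able to do the same long-to-wide conversion. The following is only a sketch on the sample data from the question; the name wide and the final renaming step are illustrative choices, not part of the answer above.

```r
# Sample data from the question
tmp <- read.table(text = "symbol the.name cn
SYM1 ABC 1
SYM2 ABC 2
SYM1 DEF 3
SYM2 DEF 4", header = TRUE)

# One row per symbol, one column per level of the.name
wide <- reshape(tmp, idvar = "symbol", timevar = "the.name", direction = "wide")

# reshape() prefixes the new columns with the value column name ("cn.ABC", "cn.DEF");
# strip the prefix to match the desired output
names(wide) <- sub("^cn\\.", "", names(wide))
wide
```

The argument names are less intuitive than the dcast formula, but nothing outside base R is needed.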
Alternatively, the newish tidyr package does this with the "spread" function. Using @ilir's data:
> tidyr::spread(tmp, key = the.name, value = cn)
symbol ABC DEF
1 SYM1 1 3
2 SYM2 2 4
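In more recent tidyr releases (1.0.0 and later), spread is superseded by pivot_wider, so the same reshape could be sketched as below. This assumes a tidyr version that provides pivot_wider; the sample data is repeated so the snippet runs on its own, and the name wide is just illustrative.

```r
library(tidyr)

# Sample data from the question
tmp <- read.table(text = "symbol the.name cn
SYM1 ABC 1
SYM2 ABC 2
SYM1 DEF 3
SYM2 DEF 4", header = TRUE)

# names_from: the column whose levels become the new column names
# values_from: the column whose values fill those new columns
wide <- pivot_wider(tmp, names_from = the.name, values_from = cn)
wide
```

Note that pivot_wider returns a tibble rather than a plain data.frame; wrap the result in as.data.frame() if that matters downstream.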