Reshaping Data in R - Creating new columns based on values in an existing column
I'm working on a problem in R where I have a data frame in which one column contains a series of variable names:
*Name* *id_key* *detail* *var_names* *values*
Jose 123 red foo abc
Jose 123 blue foo abc
Jose 123 green foo abc
Mel 456 red bar 555
Mel 456 green bar 555
Dom 789 yellow choo fjfj55bar
I would like to achieve the following:
*Name* *id_key* *detail* *foo* *bar* *choo*
Jose 123 red abc NA NA
Jose 123 blue abc NA NA
Jose 123 green abc NA NA
Mel 456 red NA 555 NA
Mel 456 green NA 555 NA
Dom 789 yellow NA NA fjfj55bar
I tried using dcast from the reshape2 package with the following command, but it did not produce the expected result:
toy_data_unmelt <- dcast(toy_data, formula = name~var_names, value.var = "values")
Any help would be greatly appreciated!
reshape2 has been superseded by tidyr. (reshape2 still works, but I would switch to keep your code up to date.) Here is a tidyr solution:
library(tidyr)
library(readr)  # read_table() is provided by readr, not tidyr
toy_data <- read_table("*Name* *id_key* *detail* *var_names* *values*
Jose 123 red foo abc
Jose 123 blue foo abc
Jose 123 green foo abc
Mel 456 red bar 555
Mel 456 green bar 555
Dom 789 yellow choo fjfj55bar")
toy_data_wide <- spread(toy_data, `*var_names*`, `*values*`)
Or, using the pipe operator:
toy_data_wide <- toy_data %>%
  spread(`*var_names*`, `*values*`)
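If staying with reshape2 is preferred, the original dcast call can probably be repaired as well. A likely cause of the unexpected result is that the formula lists only the name column on the left-hand side, so the several detail rows per person collapse into one cell (and the sample data spells the column `Name`, not `name`). The sketch below assumes the sample data from the question, rebuilt here as a plain data frame, and puts every column that should be kept on the left of the `~`:

```r
library(reshape2)

# Sample data from the question, rebuilt by hand for this illustration
toy_data <- data.frame(
  Name      = c("Jose", "Jose", "Jose", "Mel", "Mel", "Dom"),
  id_key    = c(123, 123, 123, 456, 456, 789),
  detail    = c("red", "blue", "green", "red", "green", "yellow"),
  var_names = c("foo", "foo", "foo", "bar", "bar", "choo"),
  values    = c("abc", "abc", "abc", "555", "555", "fjfj55bar"),
  stringsAsFactors = FALSE
)

# Every identifying column goes on the left of ~ so that each row
# keeps its own cell; var_names supplies the new column names
toy_data_unmelt <- dcast(
  toy_data,
  Name + id_key + detail ~ var_names,
  value.var = "values"
)

print(toy_data_unmelt)
```

Because each Name/id_key/detail combination is unique here, dcast has nothing to aggregate and the cells hold the original strings.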
You will want to use the spread function from the tidyr package:
library(tidyr)
toy_data = data.frame(Name = c("Jose", "Jose", "Jose", "Mel", "Mel", "Dom"),
                      id_key = c(123, 123, 123, 456, 456, 789),
                      detail = c("red", "blue", "green", "red", "green", "yellow"),
                      var_names = c("foo", "foo", "foo", "bar", "bar", "choo"),
                      values = c("abc", "abc", "abc", "555", "555", "fjfj55bar"))
toy_data %>% spread(var_names, values, fill = NA)
Output:
# Name id_key detail bar choo foo
#1 Dom 789 yellow <NA> fjfj55bar <NA>
#2 Jose 123 blue <NA> <NA> abc
#3 Jose 123 green <NA> <NA> abc
#4 Jose 123 red <NA> <NA> abc
#5 Mel 456 green 555 <NA> <NA>
#6 Mel 456 red 555 <NA> <NA>
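Since tidyr 1.0.0, spread is itself marked as superseded in favour of pivot_wider, so newer code may prefer that function. The following is a minimal sketch of the same reshaping, assuming tidyr 1.0.0 or later and the same toy_data as above; it is an alternative to the spread call, not a change to it:

```r
library(tidyr)

# Same sample data as in the answer above
toy_data <- data.frame(
  Name      = c("Jose", "Jose", "Jose", "Mel", "Mel", "Dom"),
  id_key    = c(123, 123, 123, 456, 456, 789),
  detail    = c("red", "blue", "green", "red", "green", "yellow"),
  var_names = c("foo", "foo", "foo", "bar", "bar", "choo"),
  values    = c("abc", "abc", "abc", "555", "555", "fjfj55bar"),
  stringsAsFactors = FALSE
)

# names_from picks the column whose values become the new column names;
# values_from picks the column that fills those new columns.
# Combinations that do not occur are filled with NA by default.
toy_data_wide <- pivot_wider(
  toy_data,
  names_from  = var_names,
  values_from = values
)

print(toy_data_wide)
```

All remaining columns (Name, id_key, detail) are used as row identifiers, which is why no rows are merged.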