Reshaping Data in R - Creating new columns based on values in an existing column
I am working on a problem in R where I have a data frame in which one column contains a series of variable names:
*Name* *id_key* *detail* *var_names* *values*
Jose 123 red foo abc
Jose 123 blue foo abc
Jose 123 green foo abc
Mel 456 red bar 555
Mel 456 green bar 555
Dom 789 yellow choo fjfj55bar
I would like to achieve the following:
*Name* *id_key* *detail* *foo* *bar* *choo*
Jose 123 red abc NA NA
Jose 123 blue abc NA NA
Jose 123 green abc NA NA
Mel 456 red NA 555 NA
Mel 456 green NA 555 NA
Dom 789 yellow NA NA fjfj55bar
I tried using dcast from the reshape2 package with the following command, but it did not produce the expected result:
toy_data_unmelt <- dcast(toy_data, formula = name~var_names, value.var = "values")
Any help would be greatly appreciated!
reshape2 has been superseded by tidyr. (reshape2 is still available, but I would make the switch to keep your code up to date.) Here is a tidyr solution:
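library(tidyr)
library(readr)  # read_table() is provided by readr, not tidyr
# Read the whitespace-separated sample data; the header keeps its asterisks,
# so the column names are `*Name*`, `*id_key*`, and so on.
toy_data <- read_table("*Name* *id_key* *detail* *var_names* *values*
Jose 123 red foo abc
Jose 123 blue foo abc
Jose 123 green foo abc
Mel 456 red bar 555
Mel 456 green bar 555
Dom 789 yellow choo fjfj55bar")
# Spread the variable names into columns, filled with the matching values
toy_data_wide <- spread(toy_data, `*var_names*`, `*values*`)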
Or, using the pipe operator:
toy_data_wide <- toy_data %>%
  spread(`*var_names*`, `*values*`)
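Since the point of this answer is to keep the code up to date: in tidyr 1.0.0 and later, spread() is itself marked as superseded in favour of pivot_wider(). A minimal sketch of the same reshape with pivot_wider() follows. It assumes tidyr >= 1.0.0 and builds the sample data with data.frame() using plain column names (without the asterisks), which is an assumption made here for readability rather than part of the code above.

```r
library(tidyr)

# Sample data from the question, with plain column names (assumption)
toy_data <- data.frame(
  Name      = c("Jose", "Jose", "Jose", "Mel", "Mel", "Dom"),
  id_key    = c(123, 123, 123, 456, 456, 789),
  detail    = c("red", "blue", "green", "red", "green", "yellow"),
  var_names = c("foo", "foo", "foo", "bar", "bar", "choo"),
  values    = c("abc", "abc", "abc", "555", "555", "fjfj55bar"),
  stringsAsFactors = FALSE
)

# names_from: column whose values become the new column names
# values_from: column whose values fill those new columns
# Combinations that do not occur in the data are left as NA
toy_data_wide <- pivot_wider(
  toy_data,
  names_from  = var_names,
  values_from = values
)

print(toy_data_wide)
```

All columns not named in names_from or values_from (here Name, id_key and detail) are treated as identifiers, so each Name/id_key/detail combination keeps its own row.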
You will want to use the spread function from the tidyr package:
library(tidyr)
toy_data = data.frame(Name = c("Jose", "Jose", "Jose", "Mel", "Mel", "Dom"),
                      id_key = c(123, 123, 123, 456, 456, 789),
                      detail = c("red", "blue", "green", "red", "green", "yellow"),
                      var_names = c("foo", "foo", "foo", "bar", "bar", "choo"),
                      values = c("abc", "abc", "abc", "555", "555", "fjfj55bar"))
toy_data %>% spread(var_names, values, fill = NA)
Output:
# Name id_key detail bar choo foo
#1 Dom 789 yellow <NA> fjfj55bar <NA>
#2 Jose 123 blue <NA> <NA> abc
#3 Jose 123 green <NA> <NA> abc
#4 Jose 123 red <NA> <NA> abc
#5 Mel 456 green 555 <NA> <NA>
#6 Mel 456 red 555 <NA> <NA>
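For completeness, the dcast attempt from the question can probably be repaired too. A likely reason it misbehaved is that its formula lists only one identifier column (and spells it name rather than Name), so the other columns are not carried through as row identifiers. The sketch below assumes reshape2 is installed and that the columns are named as in the data.frame() call above. It puts every column that should stay unchanged on the left-hand side of the formula.

```r
library(reshape2)

# Sample data from the question, with plain column names (assumption)
toy_data <- data.frame(
  Name      = c("Jose", "Jose", "Jose", "Mel", "Mel", "Dom"),
  id_key    = c(123, 123, 123, 456, 456, 789),
  detail    = c("red", "blue", "green", "red", "green", "yellow"),
  var_names = c("foo", "foo", "foo", "bar", "bar", "choo"),
  values    = c("abc", "abc", "abc", "555", "555", "fjfj55bar"),
  stringsAsFactors = FALSE
)

# Left of `~`: identifier columns that define one output row each
# Right of `~`: the column whose values become new column names
# value.var: the column that supplies the cell contents
toy_data_unmelt <- dcast(
  toy_data,
  Name + id_key + detail ~ var_names,
  value.var = "values"
)

print(toy_data_unmelt)
```

Because each Name/id_key/detail/var_names combination occurs only once in this data, no aggregation function is needed.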