
Reshaping Data in R - Creating new columns based on values in an existing column

I'm working on a problem in R where I have a data frame in which one column contains a series of variable names:

*Name*   *id_key*   *detail*    *var_names*  *values*
 Jose      123        red         foo          abc
 Jose      123        blue        foo          abc
 Jose      123        green       foo          abc
 Mel       456        red         bar          555
 Mel       456        green       bar          555
 Dom       789        yellow      choo         fjfj55bar

I would like to achieve the following:

*Name*   *id_key*   *detail*   *foo*    *bar*   *choo*
 Jose      123        red       abc      NA      NA
 Jose      123        blue      abc      NA      NA 
 Jose      123        green     abc      NA      NA
 Mel       456        red       NA       555     NA
 Mel       456        green     NA       555     NA
 Dom       789        yellow    NA       NA      fjfj55bar

I tried using dcast from the reshape2 package with the following command, but it did not produce the expected result:

toy_data_unmelt <- dcast(toy_data, formula = name~var_names, value.var = "values")

Any help would be greatly appreciated!

(reshape2 has been superseded by tidyr. reshape2 still works, but I would switch to keep your code up to date.) Here is a tidyr solution:

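library(readr)  # read_table() comes from readr, not tidyr
library(tidyr)

# The header line keeps its literal asterisks, so the column names need backticks below.
toy_data <- read_table("*Name*   *id_key*   *detail*    *var_names*  *values*
  Jose      123        red         foo          abc
  Jose      123        blue        foo          abc
  Jose      123        green       foo          abc
  Mel       456        red         bar          555
  Mel       456        green       bar          555
  Dom       789        yellow      choo         fjfj55bar")
toy_data_wide <- spread(toy_data, `*var_names*`, `*values*`)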

Or, using the pipe operator:

toy_data_wide <- toy_data %>%
  spread(`*var_names*`, `*values*`)
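In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). The following is a minimal sketch of the same reshape with pivot_wider(). It assumes the data frame has plain column names without asterisks; the toy_data construction here is illustrative and not taken from the answer above.

```r
library(tidyr)

# Illustrative copy of the question's data with plain column names (an assumption).
toy_data <- data.frame(
  Name      = c("Jose", "Jose", "Jose", "Mel", "Mel", "Dom"),
  id_key    = c(123, 123, 123, 456, 456, 789),
  detail    = c("red", "blue", "green", "red", "green", "yellow"),
  var_names = c("foo", "foo", "foo", "bar", "bar", "choo"),
  values    = c("abc", "abc", "abc", "555", "555", "fjfj55bar"),
  stringsAsFactors = FALSE
)

# Each distinct value of var_names becomes a column filled from values;
# combinations that do not occur are left as NA.
toy_data_wide <- pivot_wider(
  toy_data,
  names_from  = var_names,
  values_from = values
)

print(toy_data_wide)
```

All columns not named in names_from or values_from (here Name, id_key and detail) are treated as identifiers, so every original row keeps its own row in the wide result.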

You will want to use the spread function from the tidyr package:

library(tidyr)

toy_data = data.frame(Name = c("Jose", "Jose", "Jose", "Mel", "Mel", "Dom"), 
                      id_key = c(123, 123, 123, 456, 456, 789),
                      detail = c("red", "blue", "green", "red", "green", "yellow"), 
                      var_names = c("foo", "foo", "foo", "bar", "bar", "choo"),
                      values = c("abc", "abc", "abc", "555", "555", "fjfj55bar"))

toy_data %>% spread(var_names, values, fill = NA)

Output:

#  Name id_key detail  bar      choo  foo
#1  Dom    789 yellow <NA> fjfj55bar <NA>
#2 Jose    123   blue <NA>      <NA>  abc
#3 Jose    123  green <NA>      <NA>  abc
#4 Jose    123    red <NA>      <NA>  abc
#5  Mel    456  green  555      <NA> <NA>
#6  Mel    456    red  555      <NA> <NA>
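For completeness, the original dcast attempt can probably be repaired as well. The call in the question refers to `name` while the column is `Name`, and its formula lists only one identifier column, so several rows fall into the same cell and reshape2 falls back to an aggregation (counting) instead of keeping the values. A hedged sketch, assuming reshape2 is installed and that all three of Name, id_key and detail should identify a row:

```r
library(reshape2)

# Illustrative copy of the question's data (plain column names assumed).
toy_data <- data.frame(
  Name      = c("Jose", "Jose", "Jose", "Mel", "Mel", "Dom"),
  id_key    = c(123, 123, 123, 456, 456, 789),
  detail    = c("red", "blue", "green", "red", "green", "yellow"),
  var_names = c("foo", "foo", "foo", "bar", "bar", "choo"),
  values    = c("abc", "abc", "abc", "555", "555", "fjfj55bar"),
  stringsAsFactors = FALSE
)

# Put every identifier column on the left-hand side of the formula so that
# each (Name, id_key, detail) combination maps to exactly one output row.
toy_data_unmelt <- dcast(
  toy_data,
  Name + id_key + detail ~ var_names,
  value.var = "values"
)

print(toy_data_unmelt)
```

With the identifiers spelled out, each cell receives at most one value, so no aggregation function is needed and the missing combinations come back as NA.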

