Reshape a wide dataframe to long format

Question

I have a df in the following format:

name	other_info	revenues_2015	ebitda_2015	ebitda_2016	revenues_2015	other_2017
A	Info1	1	2	3	4	5
B	Info2	6	7	8	9	10
C	Info3	11	12	13	14	15

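I would like to change it to long format, structured as follows:

Name | Info | Year | Metric name | Value

Could you tell me how to do this in R? Since the real dataframe has more than 300 columns, is there a way to create the year column automatically?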

Data:


structure(list(name = structure(1:3, .Label = c("A", "B", "C"
), class = "factor"), other_info = structure(1:3, .Label = c("Info1", 
"Info2", "Info3"), class = "factor"), revenues_2015 = structure(c(1L, 
3L, 2L), .Label = c("1", "11", "6"), class = "factor"), ebitda_2015 = structure(c(2L, 
3L, 1L), .Label = c("12", "2", "7"), class = "factor"), ebitda_2016 = structure(c(2L, 
3L, 1L), .Label = c("13", "3", "8"), class = "factor"), revenues_2015 = structure(c(2L, 
3L, 1L), .Label = c("14", "4", "9"), class = "factor"), other_2017 = structure(c(3L, 
1L, 2L), .Label = c("10", "15", "5"), class = "factor")), class = "data.frame", row.names = c(NA, 
-3L))

Answer 1

Does this work for you?

library(dplyr)
library(tidyr)

structure(list(name = structure(1:3, .Label = c("A", "B", "C"
), class = "factor"), other_info = structure(1:3, .Label = c("Info1", 
"Info2", "Info3"), class = "factor"), revenues_2015 = structure(c(1L, 
3L, 2L), .Label = c("1", "11", "6"), class = "factor"), ebitda_2015 = structure(c(2L, 
3L, 1L), .Label = c("12", "2", "7"), class = "factor"), ebitda_2016 = structure(c(2L, 
3L, 1L), .Label = c("13", "3", "8"), class = "factor"), revenues_2015 = structure(c(2L, 
3L, 1L), .Label = c("14", "4", "9"), class = "factor"), other_2017 = structure(c(3L, 
1L, 2L), .Label = c("10", "15", "5"), class = "factor")), class = "data.frame", row.names = c(NA, 
-3L)) %>% 
  pivot_longer(revenues_2015:other_2017, names_pattern = "(.+)_(\\d{4})", names_to = c("metric", "year"))
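Since the real dataframe has 300+ columns, here is a minimal sketch of the same idea that selects the measure columns by pattern instead of by range and stores the year as an integer. The toy data frame `wide` and the result name `long` are made up for illustration, and the sketch assumes unique column names, numeric values, and a reasonably recent tidyr that supports `names_transform`.

```r
library(tidyr)

# Hypothetical toy data: unique column names, numeric values
wide <- data.frame(
  name = c("A", "B", "C"),
  other_info = c("Info1", "Info2", "Info3"),
  revenues_2015 = c(1, 6, 11),
  ebitda_2015 = c(2, 7, 12),
  ebitda_2016 = c(3, 8, 13),
  other_2017 = c(5, 10, 15)
)

long <- pivot_longer(
  wide,
  cols = matches("_\\d{4}$"),                 # every column ending in _YYYY
  names_to = c("metric", "year"),
  names_pattern = "(.+)_(\\d{4})$",           # split "metric_YYYY" into two parts
  names_transform = list(year = as.integer),  # store the year as an integer
  values_to = "value"
)
```

With this approach the column list never has to be typed out, so it should scale to wide tables as long as the `metric_YYYY` naming holds.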

Answer 2

You have two options: you can use reshape() from the stats package (a base R function, so you do not need to call library() for it), or the melt() function from the reshape2 package.

Using the reshape() function:

 data = structure(list(name = structure(1:3, .Label = c("A", "B", "C"
), class = "factor"), other_info = structure(1:3, .Label = c("Info1", 
"Info2", "Info3"), class = "factor"), revenues_2015 = structure(c(1L, 
3L, 2L), .Label = c("1", "11", "6"), class = "factor"), ebitda_2015 = structure(c(2L, 
3L, 1L), .Label = c("12", "2", "7"), class = "factor"), ebitda_2016 = structure(c(2L, 
3L, 1L), .Label = c("13", "3", "8"), class = "factor"), revenues_2015 = structure(c(2L, 
3L, 1L), .Label = c("14", "4", "9"), class = "factor"), other_2017 = structure(c(3L, 
1L, 2L), .Label = c("10", "15", "5"), class = "factor")), class = "data.frame", row.names = c(NA, 
-3L))

LF_data = reshape(data = data, idvar = c("name", "other_info"),
    varying = c("revenues_2015", "ebitda_2015", "ebitda_2016", "revenues_2015", "other_2017"),
    v.names = c("Value"),
    times = c("revenues_2015", "ebitda_2015", "ebitda_2016", "revenues_2015", "other_2017"),
    direction = "long")
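For a table with hundreds of columns, typing the `varying` and `times` vectors by hand is not practical. Here is a rough base R sketch that builds them with grep() and then derives the metric and year from the column names. The toy data frame `wide`, the helper vector `measure_cols`, and the output column names are my own choices for illustration, and the sketch assumes the column names are unique.

```r
# Hypothetical toy data with unique column names
wide <- data.frame(
  name = c("A", "B", "C"),
  other_info = c("Info1", "Info2", "Info3"),
  revenues_2015 = c(1, 6, 11),
  ebitda_2015 = c(2, 7, 12),
  ebitda_2016 = c(3, 8, 13),
  other_2017 = c(5, 10, 15),
  stringsAsFactors = FALSE
)

# Find every column whose name ends in _YYYY
measure_cols <- grep("_[0-9]{4}$", names(wide), value = TRUE)

long <- reshape(
  wide,
  idvar = c("name", "other_info"),
  varying = measure_cols,
  v.names = "value",
  timevar = "variable",   # holds the original column name
  times = measure_cols,
  direction = "long"
)
rownames(long) <- NULL

# Split "metric_YYYY" into its two parts
long$metric <- sub("_[0-9]{4}$", "", long$variable)
long$year <- as.integer(sub("^.*_([0-9]{4})$", "\\1", long$variable))
```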

Using the melt() function from the reshape2 package:

First, you need a dataframe created with stringsAsFactors = FALSE:

       data=data.frame(structure(list(name = structure(1:3, .Label = c("A", "B", "C"
        ), class = "factor"), other_info = structure(1:3, .Label = c("Info1", 
        "Info2", "Info3"), class = "factor"), revenues_2015 = structure(c(1L, 
        3L, 2L), .Label = c("1", "11", "6"), class = "factor"), ebitda_2015 = structure(c(2L, 
        3L, 1L), .Label = c("12", "2", "7"), class = "factor"), ebitda_2016 = structure(c(2L, 
        3L, 1L), .Label = c("13", "3", "8"), class = "factor"), revenues_2015 = structure(c(2L, 
        3L, 1L), .Label = c("14", "4", "9"), class = "factor"), other_2017 = structure(c(3L, 
        1L, 2L), .Label = c("10", "15", "5"), class = "factor")), class = "data.frame", row.names = c(NA, 
        -3L)),stringsAsFactors=FALSE)

Then:

# data.frame() above renames the duplicated column to revenues_2015.1
LF_data=reshape2::melt(data,id.vars=c("name","other_info"), measure.vars=c("revenues_2015","ebitda_2015","ebitda_2016","revenues_2015.1","other_2017"))

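melt() does not let you have combinations of "name", "other_info" and "variable" unless they are unique. In your example, the second revenues_2015 column ends up as revenues_2015.1.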
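The renaming described above can be reproduced in base R with make.names(), which is what data.frame() applies when its default check.names = TRUE is in effect. This small sketch also shows one way to recover the year when a name carries such a numeric suffix; the variable names `cols`, `fixed` and `years` are made up for illustration.

```r
# Column names from the question, including the duplicated revenues_2015
cols <- c("name", "other_info", "revenues_2015", "ebitda_2015",
          "ebitda_2016", "revenues_2015", "other_2017")

anyDuplicated(cols) > 0   # TRUE: there is at least one repeated name

# De-duplicate the same way data.frame(check.names = TRUE) does
fixed <- make.names(cols, unique = TRUE)
fixed[6]                  # "revenues_2015.1"

# Extract the year while tolerating an optional ".N" suffix
years <- as.integer(sub("^.*_([0-9]{4})(\\.[0-9]+)?$", "\\1", fixed[3:7]))
years                     # 2015 2015 2016 2015 2017
```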

Answer 3

A bit late: this is similar to mad-statter's solution, but slightly different in that it uses mutate:

library(tidyr)
library(dplyr)

df <- structure(list(name = structure(1:3, .Label = c("A", "B", "C"
), class = "factor"), other_info = structure(1:3, .Label = c("Info1", 
"Info2", "Info3"), class = "factor"), revenues_2015 = structure(c(1L, 
3L, 2L), .Label = c("1", "11", "6"), class = "factor"), ebitda_2015 = structure(c(2L, 
3L, 1L), .Label = c("12", "2", "7"), class = "factor"), ebitda_2016 = structure(c(2L, 
3L, 1L), .Label = c("13", "3", "8"), class = "factor"), revenues_2015 = structure(c(2L, 
3L, 1L), .Label = c("14", "4", "9"), class = "factor"), other_2017 = structure(c(3L, 
1L, 2L), .Label = c("10", "15", "5"), class = "factor")), class = "data.frame", row.names = c(NA, -3L)) %>% 
  pivot_longer(revenues_2015:other_2017, names_to = c("Metric name", "Year"),
               names_sep ="_", values_to = "Value") %>% 
  dplyr::mutate(Year = stringr::str_remove(Year, "\\D")) %>% 
  rename(Name=name, Info = other_info)
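One thing to watch for: in the sample data every measure column is a factor, so the values may still need converting to numbers after reshaping. A minimal base R sketch of the usual pitfall, using a made-up factor `f` with the same labels as the first revenues_2015 column:

```r
# A factor holding the text "1", "6", "11" (levels sort as "1", "11", "6")
f <- factor(c("1", "6", "11"))

# Converting directly returns the internal level codes, not the values
codes <- as.integer(f)

# Going through character first returns the actual numbers
values <- as.integer(as.character(f))

codes   # 1 3 2
values  # 1 6 11
```

The same as.character() step can be applied to the value column of the long table if it comes out as a factor.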

3 answers

Answer 1
1 accepted 2021-01-27 19:13:34

Answer 2
1 2021-01-27 19:19:19

Answer 3
0 2021-01-27 19:45:38
