Reshape data from wide to long with multiple rows

I have a dataset dfs that I would like to reshape:

dfs
#      country.name                     indicator.name         x1990         x1991         x1992
# 507       andorra GDP at market prices (current US$)  1.028989e+09  1.106891e+09  1.209993e+09
# 510       andorra              GDP growth (annual %)  3.781393e+00  2.546001e+00  9.292154e-01
# 1347      albania GDP at market prices (current US$)  2.101625e+09  1.139167e+09  7.094526e+08
# 1350      albania              GDP growth (annual %) -9.575640e+00 -2.958900e+01 -7.200000e+00
# 3587      austria GDP at market prices (current US$)  1.660624e+11  1.733755e+11  1.946082e+11

I would like the indicator names to become columns and the years to go into a single time column:

#   country time   gdp_market  gdp_growth
# 1 andorra 1990   1028989394   3.7813935
# 2 andorra 1991   1106891025   2.5460006
# 3 andorra 1992   1209992650   0.9292154
# 4 albania 1990   2101624963  -9.5756402
# 5 albania 1991   1139166646 -29.5889977
# 6 albania 1992    709452584  -7.2000000
# 7 austria 1990 166062376740          NA
# 8 austria 1991 173375508073          NA
# 9 austria 1992 194608183696          NA

I can melt the data into long format, but I can't separate the values into two indicator columns:

library(reshape2)
melt.dfs <- melt(dfs, id=1:2)

I could do a split and cbind, but I'd prefer to do it with reshape. Thanks.

dfs = structure(list(country.name = c("andorra", "andorra", "albania", 
"albania", "austria"), indicator.name = c("GDP at market prices (current US$)", 
"GDP growth (annual %)", "GDP at market prices (current US$)", 
"GDP growth (annual %)", "GDP at market prices (current US$)"
), x1990 = c(1028989393.70295, 3.78139347786568, 2101624962.5, 
-9.57564018741695, 166062376739.683), x1991 = c(1106891024.78653, 
2.54600064090229, 1139166645.83333, -29.5889976817695, 173375508073.07
), x1992 = c(1209992649.56688, 0.929215382801402, 709452583.880319, 
-7.19999998650893, 194608183696.469)), .Names = c("country.name", 
"indicator.name", "x1990", "x1991", "x1992"), row.names = c(507L, 
510L, 1347L, 1350L, 3587L), class = "data.frame")

We can use gather and spread from tidyr:

library(dplyr)
library(tidyr)
gather(dfs, time, Val, x1990:x1992) %>% 
       spread(indicator.name, Val)
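
gather and spread still work, but tidyr 1.0.0 and later supersede them with pivot_longer and pivot_wider. The following is a minimal sketch of the same reshape with those functions, assuming tidyr >= 1.0.0 is installed. The names long_dfs and wide_dfs and the intermediate column names time and value are illustrative choices, not part of the original answer.

```r
library(tidyr)

# Same data as dfs in the question, rebuilt here so the snippet runs on its own
dfs <- data.frame(
  country.name = c("andorra", "andorra", "albania", "albania", "austria"),
  indicator.name = c("GDP at market prices (current US$)", "GDP growth (annual %)",
                     "GDP at market prices (current US$)", "GDP growth (annual %)",
                     "GDP at market prices (current US$)"),
  x1990 = c(1028989393.70295, 3.78139347786568, 2101624962.5,
            -9.57564018741695, 166062376739.683),
  x1991 = c(1106891024.78653, 2.54600064090229, 1139166645.83333,
            -29.5889976817695, 173375508073.07),
  x1992 = c(1209992649.56688, 0.929215382801402, 709452583.880319,
            -7.19999998650893, 194608183696.469),
  stringsAsFactors = FALSE
)

# Stack the year columns into time/value pairs, dropping the "x" prefix
long_dfs <- pivot_longer(dfs, cols = x1990:x1992,
                         names_to = "time", names_prefix = "x",
                         values_to = "value")

# Spread the indicators back out, one column per indicator name
wide_dfs <- pivot_wider(long_dfs, names_from = indicator.name,
                        values_from = value)

# The year is still a character string at this point; make it an integer
wide_dfs$time <- as.integer(wide_dfs$time)
```

Austria has no growth row in dfs, so its growth column is filled with NA, which matches the desired output in the question.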

EDIT: Based on comments from @docendo discimus


Or using recast from reshape2:

library(reshape2)
recast(dfs, ... ~ indicator.name, measure.var = 3:5, value.var = "value")
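
recast is a convenience wrapper that melts and then casts in one call. Written out as two steps, starting from the melt call in the question, the same idea may be easier to follow. This is only a sketch: the result name wide and the short column names (country, gdp_market, gdp_growth) are taken from the desired output in the question rather than produced by reshape2 itself.

```r
library(reshape2)

# Same data as dfs in the question, rebuilt here so the snippet runs on its own
dfs <- data.frame(
  country.name = c("andorra", "andorra", "albania", "albania", "austria"),
  indicator.name = c("GDP at market prices (current US$)", "GDP growth (annual %)",
                     "GDP at market prices (current US$)", "GDP growth (annual %)",
                     "GDP at market prices (current US$)"),
  x1990 = c(1028989393.70295, 3.78139347786568, 2101624962.5,
            -9.57564018741695, 166062376739.683),
  x1991 = c(1106891024.78653, 2.54600064090229, 1139166645.83333,
            -29.5889976817695, 173375508073.07),
  x1992 = c(1209992649.56688, 0.929215382801402, 709452583.880319,
            -7.19999998650893, 194608183696.469),
  stringsAsFactors = FALSE
)

# Step 1: melt the year columns; country and indicator stay as id variables
melt.dfs <- melt(dfs, id.vars = 1:2, variable.name = "time")

# Step 2: cast back to one row per country and year, one column per indicator
wide <- dcast(melt.dfs, country.name + time ~ indicator.name,
              value.var = "value")

# Optional tidy-up: short column names and a numeric year
names(wide)[names(wide) == "country.name"] <- "country"
names(wide)[names(wide) == "GDP at market prices (current US$)"] <- "gdp_market"
names(wide)[names(wide) == "GDP growth (annual %)"] <- "gdp_growth"
wide$time <- as.integer(sub("^x", "", as.character(wide$time)))
```

Note that dcast sorts the rows by the id variables, so the countries come out in alphabetical order rather than in the order they appear in dfs.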
