Reshaping a large dataset in R
I am trying to reshape a large dataset and cannot get the result laid out in the order I want.
The data looks like this:
GeoFIPS GeoName IndustryID Description X2001 X2002 X2003 X2004 X2005
10180 Abilene, TX 21 Mining 96002 92407 127138 150449 202926
10180 Abilene, TX 22 Utilities 33588 34116 33105 33265 32452
...
The data frame is very long and covers all MSAs in the US for specific industry sectors.
I would like it to look like this:
GeoFIPS GeoName Year Mining Utilities (etc)
10180 Abilene, TX 2001 96002 33588
10180 Abilene, TX 2002 92407 34116
....
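I am new to R and would really appreciate your help. I have looked at wide-to-long and long-to-wide reshaping, but this seems to be a more complex case, since the year columns have to become rows while the industry descriptions become columns. Thanks!
Edit: data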
df1 <- structure(list(GeoFIPS = c(10180L, 10180L), GeoName =
c("Abilene, TX",
"Abilene, TX"), IndustryID = 21:22, Description = c("Mining",
"Utilities"), X2001 = c(96002L, 33588L), X2002 = c(92407L, 34116L
), X2003 = c(127138L, 33105L), X2004 = c(150449L, 33265L), X2005 =
c(202926L,
32452L)), .Names = c("GeoFIPS", "GeoName", "IndustryID", "Description",
"X2001", "X2002", "X2003", "X2004", "X2005"), class = "data.frame",
row.names = c(NA, -2L))
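You can use melt/dcast from reshape2: first melt the year columns into long format, then cast the Description values back out into columns.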
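library(reshape2)
# Wide to long: the X2001..X2005 columns become 'variable'/'value' pairs
df2 <- melt(df1, id.vars=c('GeoFIPS', 'GeoName',
'IndustryID', 'Description'))
# Strip the leading 'X' to get Year, then drop IndustryID (3) and variable (5)
df2 <- transform(df2, Year=sub('^X', '', variable))[-c(3,5)]
# Long to wide: one column per Description, one row per GeoFIPS/GeoName/Year
dcast(df2, ...~Description, value.var='value')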
# GeoFIPS GeoName Year Mining Utilities
#1 10180 Abilene, TX 2001 96002 33588
#2 10180 Abilene, TX 2002 92407 34116
#3 10180 Abilene, TX 2003 127138 33105
#4 10180 Abilene, TX 2004 150449 33265
#5 10180 Abilene, TX 2005 202926 32452
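As a possible alternative (not part of the answer above), the same two-step reshape can be sketched with tidyr, assuming version 1.0.0 or later is installed so that pivot_longer and pivot_wider are available. The names long and wide are only illustrative, and df1 is rebuilt here from the posted sample so the snippet runs on its own.

```r
library(tidyr)

# Sample data from the question, rebuilt so the example is self-contained
df1 <- data.frame(
  GeoFIPS = c(10180L, 10180L),
  GeoName = c("Abilene, TX", "Abilene, TX"),
  IndustryID = 21:22,
  Description = c("Mining", "Utilities"),
  X2001 = c(96002L, 33588L),
  X2002 = c(92407L, 34116L),
  X2003 = c(127138L, 33105L),
  X2004 = c(150449L, 33265L),
  X2005 = c(202926L, 32452L),
  stringsAsFactors = FALSE
)

# Step 1: wide to long. Every X<year> column becomes a (Year, value) pair,
# and names_prefix strips the leading "X" from the column names.
long <- pivot_longer(df1, cols = starts_with("X"),
                     names_to = "Year", names_prefix = "X",
                     values_to = "value")

# Step 2: long to wide. One column per Description; id_cols leaves out
# IndustryID so each GeoFIPS/GeoName/Year combination ends up on one row.
wide <- pivot_wider(long, id_cols = c(GeoFIPS, GeoName, Year),
                    names_from = Description, values_from = value)

# Year comes out as character after stripping the prefix; convert if needed
wide$Year <- as.integer(wide$Year)
print(wide)
```

Listing the identifier columns explicitly in id_cols plays the same role as dropping IndustryID before dcast: if IndustryID stayed in as an identifier, Mining and Utilities would land on separate rows padded with NA.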