I am trying to reshape a large dataset, but I cannot get the results into the layout I want.
Here is what the data looks like:
GeoFIPS  GeoName      IndustryID  Description  X2001  X2002  X2003   X2004   X2005
10180    Abilene, TX  21          Mining       96002  92407  127138  150449  202926
10180    Abilene, TX  22          Utilities    33588  34116  33105   33265   32452
...
The data frame is very long and includes all MSAs in the US for selected industry sectors.
This is how I want it to look:
GeoFIPS  GeoName      Year  Mining  Utilities  (etc.)
10180    Abilene, TX  2001  96002   33588
10180    Abilene, TX  2002  92407   34116
...
I am pretty new to R and would really appreciate your help. I have looked at wide-to-long and long-to-wide reshaping, but this seems to need both at once: the year columns have to go long while the Description values go wide. Thank you!
Edit: Data
df1 <- structure(list(
  GeoFIPS = c(10180L, 10180L),
  GeoName = c("Abilene, TX", "Abilene, TX"),
  IndustryID = 21:22,
  Description = c("Mining", "Utilities"),
  X2001 = c(96002L, 33588L),
  X2002 = c(92407L, 34116L),
  X2003 = c(127138L, 33105L),
  X2004 = c(150449L, 33265L),
  X2005 = c(202926L, 32452L)),
  .Names = c("GeoFIPS", "GeoName", "IndustryID", "Description",
             "X2001", "X2002", "X2003", "X2004", "X2005"),
  class = "data.frame", row.names = c(NA, -2L))
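You could use melt/dcast from reshape2: melt the year columns into long format, then cast Description back out into one column per industry.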
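library(reshape2)
# Melt the year columns (X2001, X2002, ...) into long format
df2 <- melt(df1, id.vars = c('GeoFIPS', 'GeoName', 'IndustryID', 'Description'))
# Strip the leading "X" to get Year, then drop IndustryID (column 3) and variable (column 5)
df2 <- transform(df2, Year = sub('^X', '', variable))[-c(3, 5)]
# Cast Description into one column per industry
dcast(df2, ... ~ Description, value.var = 'value')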
#   GeoFIPS     GeoName Year Mining Utilities
# 1   10180 Abilene, TX 2001  96002     33588
# 2   10180 Abilene, TX 2002  92407     34116
# 3   10180 Abilene, TX 2003 127138     33105
# 4   10180 Abilene, TX 2004 150449     33265
# 5   10180 Abilene, TX 2005 202926     32452
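If you prefer tidyr, the same two-step idea (go long on the year columns, then go wide on Description) can be sketched with pivot_longer() and pivot_wider(). This is an alternative to the reshape2 code above, not a replacement for it, and it assumes tidyr 1.0.0 or later is installed. The intermediate names `long` and `wide` are just illustrative. IndustryID is dropped first because it differs per industry and would otherwise keep Mining and Utilities on separate rows.

```r
library(tidyr)

# Example data from the question
df1 <- data.frame(
  GeoFIPS = c(10180L, 10180L),
  GeoName = c("Abilene, TX", "Abilene, TX"),
  IndustryID = 21:22,
  Description = c("Mining", "Utilities"),
  X2001 = c(96002L, 33588L),
  X2002 = c(92407L, 34116L),
  X2003 = c(127138L, 33105L),
  X2004 = c(150449L, 33265L),
  X2005 = c(202926L, 32452L),
  stringsAsFactors = FALSE
)

# Step 1: drop IndustryID, then stack the X2001..X2005 columns into
# a Year/value pair, removing the "X" prefix from the year names
long <- pivot_longer(
  df1[, names(df1) != "IndustryID"],
  cols = starts_with("X"),
  names_to = "Year",
  names_prefix = "X",
  values_to = "value"
)

# Step 2: spread Description into one column per industry
wide <- pivot_wider(long, names_from = Description, values_from = value)

print(wide)
```

Note that Year comes out as a character column here; wrap it in as.integer() if you need it numeric.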