I'd like to reshape a data frame from long to wide (dummy data given below). My real data frame has many (40+) numeric variables, but I'd like to select only three of the latter when reshaping.
Options I have found so far include:
(1) reshape one variable using dcast():
library(reshape2)
dcast(d, group1 + group2 ~ location, value.var = "mass")
(2) reshape all the variables using reshape():
reshape(d, idvar = c("group1", "group2"), timevar = "location", sep = ".",
direction = "wide")
(3) create a vector of variables I'd like to exclude, called variables.to.drop, and pass this to reshape():
variables.to.drop <- c("diameter", "volume")
reshape(d, idvar = c("group1", "group2"), timevar = "location", sep = ".",
direction = "wide", drop = variables.to.drop)
but I haven't found a function that takes a vector or list of variables to reshape. Basically, a version of dcast() that allows a list or vector to be passed to the argument value.var would fit the bill. Is there a function like this that I haven't come across?
(I realize I can subset the data before doing the reshape operation, but it would be much cleaner to simply specify the variables in the reshaping function, since using subset() would involve having to specify all ID variables to be included, whereas IDs are recognized automatically in most reshaping functions. I could also just use the drop argument in reshape(), as above, but I want to select 3 variables out of 40+, so this is cumbersome.)
d <- structure(list(group1 = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("ripe", "unripe"), class = "factor"),
group2 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 5L, 5L, 5L,
5L, 5L, 5L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L,
4L, 4L, 4L, 4L, 4L, 4L), .Label = c("apple", "grapefruit",
"orange", "peach", "pear"), class = "factor"), type = structure(c(2L,
2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L,
1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L), .Label = c("large",
"small"), class = "factor"), location = structure(c(1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), .Label = c("P1",
"P2", "P3"), class = "factor"), diameter = c(17.2, 19.1,
18.5, 23.3, 22.9, 19.4, 11.1, 11.8, 6.8, 3.2, 7.9, 5.6, 8.4,
9.2, 9.7, 17.1, 19.4, 18.9, 11.8, 10.6, 10.1, 18.8, 17.9,
13.2, 8.5, 8.9, 7.2, 10.1, 8.7, 6.6), mass = c(11.1370341130532,
16.2229940481484, 16.0927473288029, 16.2337944167666, 18.6091538355686,
16.4031060528941, 10.0949575635605, 12.3255050601438, 16.6608375823125,
15.1425114134327, 16.9359129178338, 15.4497483558953, 12.8273358359002,
19.2343348427676, 12.9231584025547, 18.3729562815279, 12.8622328466736,
12.6682078000158, 11.8672278965823, 12.3222591052763, 13.1661245482974,
13.0269337072968, 11.590460028965, 10.3999591805041, 12.1879954100586,
18.1059855245985, 15.2569754677825, 19.1465816600248, 18.3134504687041,
10.4577026329935), volume = c(39.1218296485022, 35.3037334373221,
36.0934440605342, 40.1461374014616, 33.6219241656363, 45.1934127090499,
34.0249607525766, 35.1761963730678, 49.8430083505809, 46.1470468062907,
41.0666718147695, 42.9281218815595, 36.2364861415699, 42.4363839626312,
36.5954035148025, 40.0399494590238, 43.5418905457482, 39.6998247830197,
34.8785765469074, 45.3091957513243, 31.4755976013839, 36.193732037209,
44.3454348668456, 40.0909182429314, 33.0599791789427, 40.0786697631702,
39.879218460992, 45.0240039406344, 33.4929964784533, 46.9678482087329
)), .Names = c("group1", "group2", "type", "location", "diameter",
"mass", "volume"), row.names = c(NA, -30L), class = "data.frame")
You have resisted adding the details needed to allow a specific answer (names of the columns in question and an example or explicit structure of the desired output), but perhaps this is what you are asking for. If you want to drop a specific set of columns from a data frame prior to reshaping, it is reasonably straightforward:
colNamesToBeDropped <- c("colnam1","colnam2","colnam3","colnam4","colnam5")
colsToBeKept <- ! names(dfrm) %in% colNamesToBeDropped
reshape( dfrm[ , colsToBeKept] , .... rest of parameters ... )
#Using a dataset that is on everyone's machine
> state.x77 <- as.data.frame(state.x77)
> str(state.x77)
'data.frame': 50 obs. of 8 variables:
$ Population: num 3615 365 2212 2110 21198 ...
$ Income : num 3624 6315 4530 3378 5114 ...
$ Illiteracy: num 2.1 1.5 1.8 1.9 1.1 0.7 1.1 0.9 1.3 2 ...
$ Life Exp : num 69 69.3 70.5 70.7 71.7 ...
$ Murder : num 15.1 11.3 7.8 10.1 10.3 6.8 3.1 6.2 10.7 13.9 ...
$ HS Grad : num 41.3 66.7 58.1 39.9 62.6 63.9 56 54.6 52.6 40.6 ...
$ Frost : num 20 152 15 65 20 166 139 103 11 60 ...
$ Area : num 50708 566432 113417 51945 156361 ...
> colNamesToBeDropped <- c("Income", "Murder")
> colsToBeKept <- ! names(state.x77) %in% colNamesToBeDropped
> str( state.x77[ , colsToBeKept] )
'data.frame': 50 obs. of 6 variables:
$ Population: num 3615 365 2212 2110 21198 ...
$ Illiteracy: num 2.1 1.5 1.8 1.9 1.1 0.7 1.1 0.9 1.3 2 ...
$ Life Exp : num 69 69.3 70.5 70.7 71.7 ...
$ HS Grad : num 41.3 66.7 58.1 39.9 62.6 63.9 56 54.6 52.6 40.6 ...
$ Frost : num 20 152 15 65 20 166 139 103 11 60 ...
$ Area : num 50708 566432 113417 51945 156361 ...
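The same idea can be turned around so that you name the few columns to keep rather than the many to drop. The following is only a sketch under stated assumptions: the small data frame d is a hypothetical stand-in for the one in the question, and id.vars, time.var and variables.to.keep are illustrative names, not arguments of reshape(). The drop vector is computed with setdiff() and passed to the drop argument of reshape().

```r
# Hypothetical long-format data standing in for the question's `d`
d <- expand.grid(group1 = c("ripe", "unripe"),
                 group2 = c("apple", "pear"),
                 location = c("P1", "P2", "P3"),
                 stringsAsFactors = FALSE)
d$diameter <- seq_len(nrow(d))
d$mass     <- seq_len(nrow(d)) * 2
d$volume   <- seq_len(nrow(d)) * 3

# Illustrative helper vectors (assumed names, not part of reshape())
id.vars           <- c("group1", "group2")
time.var          <- "location"
variables.to.keep <- c("mass", "volume")

# Everything that is not an ID, the time variable, or a kept variable is dropped
variables.to.drop <- setdiff(names(d), c(id.vars, time.var, variables.to.keep))

wide <- reshape(d, idvar = id.vars, timevar = time.var, sep = ".",
                direction = "wide", drop = variables.to.drop)
```

With 40+ measurement columns, only variables.to.keep needs to be written out by hand; the ID and time variables are the same ones already passed to reshape().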