R function to reshape data frame while selecting multiple variables?
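I want to reshape a data frame from long to wide (dummy data are given below). My real data frame has many (40+) numeric variables, but I only want to select three of them when reshaping.
The options I have found so far are:
(1) Use dcast() to reshape a single variable: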
library(reshape2)
dcast(d, group1 + group2 ~ location, value.var = "mass")
(2) Use reshape() to reshape all variables:
reshape(d, idvar = c("group1", "group2"), timevar = "location", sep = ".",
direction = "wide")
(3) Create a vector of the variables I want to exclude, called variables.to.drop, and pass it to reshape():
variables.to.drop <- c("diameter", "volume")
reshape(d, idvar = c("group1", "group2"), timevar = "location", sep = ".",
direction = "wide", drop = variables.to.drop)
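But I have not found a function that takes a vector or list of the variables to reshape. Basically, a version of dcast() that allows a list or vector to be passed to the value.var argument would fit the bill. Is there such a function that I have not come across?
(I realise I could subset the data before reshaping, but it would be cleaner to simply specify the variables in the reshape function, because using subset() means having to specify all of the ID variables to include, whereas ID variables can be identified automatically by most reshaping functions. As mentioned above, I could also use the drop argument of reshape(), but since I want to select 3 variables out of 40+, that is cumbersome.)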
d <- structure(list(group1 = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("ripe", "unripe"), class = "factor"),
group2 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 5L, 5L, 5L,
5L, 5L, 5L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L,
4L, 4L, 4L, 4L, 4L, 4L), .Label = c("apple", "grapefruit",
"orange", "peach", "pear"), class = "factor"), type = structure(c(2L,
2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L,
1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L), .Label = c("large",
"small"), class = "factor"), location = structure(c(1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), .Label = c("P1",
"P2", "P3"), class = "factor"), diameter = c(17.2, 19.1,
18.5, 23.3, 22.9, 19.4, 11.1, 11.8, 6.8, 3.2, 7.9, 5.6, 8.4,
9.2, 9.7, 17.1, 19.4, 18.9, 11.8, 10.6, 10.1, 18.8, 17.9,
13.2, 8.5, 8.9, 7.2, 10.1, 8.7, 6.6), mass = c(11.1370341130532,
16.2229940481484, 16.0927473288029, 16.2337944167666, 18.6091538355686,
16.4031060528941, 10.0949575635605, 12.3255050601438, 16.6608375823125,
15.1425114134327, 16.9359129178338, 15.4497483558953, 12.8273358359002,
19.2343348427676, 12.9231584025547, 18.3729562815279, 12.8622328466736,
12.6682078000158, 11.8672278965823, 12.3222591052763, 13.1661245482974,
13.0269337072968, 11.590460028965, 10.3999591805041, 12.1879954100586,
18.1059855245985, 15.2569754677825, 19.1465816600248, 18.3134504687041,
10.4577026329935), volume = c(39.1218296485022, 35.3037334373221,
36.0934440605342, 40.1461374014616, 33.6219241656363, 45.1934127090499,
34.0249607525766, 35.1761963730678, 49.8430083505809, 46.1470468062907,
41.0666718147695, 42.9281218815595, 36.2364861415699, 42.4363839626312,
36.5954035148025, 40.0399494590238, 43.5418905457482, 39.6998247830197,
34.8785765469074, 45.3091957513243, 31.4755976013839, 36.193732037209,
44.3454348668456, 40.0909182429314, 33.0599791789427, 40.0786697631702,
39.879218460992, 45.0240039406344, 33.4929964784533, 46.9678482087329
)), .Names = c("group1", "group2", "type", "location", "diameter",
"mass", "volume"), row.names = c(NA, -30L), class = "data.frame")
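You have declined to add the details needed for a specific answer (the names of the columns involved and an example, or an explicit structure of the desired output), but perhaps this is what you are trying to accomplish. If you want to drop a specific set of columns from a data frame before reshaping, that is fairly simple: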
colNamesToBeDropped <- c("colnam1","colnam2","colnam3","colnam4","colnam5")
colsToBeKept <- ! names(dfrm) %in% colNamesToBeDropped
reshape( dfrm[ , colsToBeKept] , .... rest of parameters ... )
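To make the pattern above concrete, here is a minimal sketch on a small made-up data frame. The data frame `fruit` and its values are hypothetical. Only the column names follow the question, and the reshape() arguments are the ones the question already uses, so treat this as an illustration and not the exact call for the real data.

```r
# Hypothetical toy data in long format: one row per fruit group and location.
fruit <- data.frame(
  group1   = rep(c("ripe", "unripe"), each = 2),
  group2   = "apple",
  location = rep(c("P1", "P2"), times = 2),
  diameter = c(17.2, 19.1, 23.3, 22.9),
  mass     = c(11.1, 16.2, 16.1, 18.6),
  volume   = c(39.1, 35.3, 36.1, 40.1)
)

# Flag the columns to drop, then negate to get a logical index of columns to keep.
colNamesToBeDropped <- c("diameter", "volume")
colsToBeKept <- !names(fruit) %in% colNamesToBeDropped

# Reshape only the retained columns from long to wide.
wide <- reshape(fruit[, colsToBeKept],
                idvar = c("group1", "group2"), timevar = "location",
                direction = "wide", sep = ".")
print(wide)
```

The id and time columns survive because they are not in the drop list, so only `mass` is spread across the locations.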
#Using a dataset that is on everyone's machine
> state.x77 <- as.data.frame(state.x77)
> str(state.x77)
'data.frame': 50 obs. of 8 variables:
$ Population: num 3615 365 2212 2110 21198 ...
$ Income : num 3624 6315 4530 3378 5114 ...
$ Illiteracy: num 2.1 1.5 1.8 1.9 1.1 0.7 1.1 0.9 1.3 2 ...
$ Life Exp : num 69 69.3 70.5 70.7 71.7 ...
$ Murder : num 15.1 11.3 7.8 10.1 10.3 6.8 3.1 6.2 10.7 13.9 ...
$ HS Grad : num 41.3 66.7 58.1 39.9 62.6 63.9 56 54.6 52.6 40.6 ...
$ Frost : num 20 152 15 65 20 166 139 103 11 60 ...
$ Area : num 50708 566432 113417 51945 156361 ...
> colNamesToBeDropped <- c("Income", "Murder")
> colsToBeKept <- ! names(state.x77) %in% colNamesToBeDropped
> str( state.x77[ , colsToBeKept] )
'data.frame': 50 obs. of 6 variables:
$ Population: num 3615 365 2212 2110 21198 ...
$ Illiteracy: num 2.1 1.5 1.8 1.9 1.1 0.7 1.1 0.9 1.3 2 ...
$ Life Exp : num 69 69.3 70.5 70.7 71.7 ...
$ HS Grad : num 41.3 66.7 58.1 39.9 62.6 63.9 56 54.6 52.6 40.6 ...
$ Frost : num 20 152 15 65 20 166 139 103 11 60 ...
$ Area : num 50708 566432 113417 51945 156361 ...
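If it is easier to name the few columns to keep than the many to drop, the drop list can be derived with setdiff() and handed to the drop argument of reshape(). The sketch below uses the built-in airquality data set, treating Month as the id and Day as the time variable. That choice of roles and the variable names `idVar`, `timeVar`, `colsToKeep` and `colsToDrop` are assumptions for illustration only.

```r
# airquality ships with R: one row per Month and Day, four measurement columns.
idVar      <- "Month"
timeVar    <- "Day"
colsToKeep <- c("Ozone", "Temp")   # the few measurement columns wanted

# Everything that is not the id, the time variable, or a wanted column is dropped.
colsToDrop <- setdiff(names(airquality), c(idVar, timeVar, colsToKeep))

# reshape() removes the dropped columns before spreading the rest to wide format.
wide <- reshape(airquality, idvar = idVar, timevar = timeVar,
                direction = "wide", sep = ".", drop = colsToDrop)

dim(wide)
head(names(wide))
```

With 40+ columns, only `colsToKeep` has to be written out by hand. The drop vector is computed from it.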