Wide to long format in R list: 3D to 2D array with 3rd dimension as ID
I have imported a *.mat data set of ECG data, and it turns out to be an array nested in a list with (1:19, 1:2000, 1:45) dimensions.
I'd like to convert this array into a data.table in long format where each of the 1:45 is an 'id'. I like the look of reshape2 and tidyr, but I don't see an easy way of doing it when a 'list' is involved.
Any thoughts?
ADDED: e.g. as in the following picture:
EDIT: Added dput from ECGa
dput(ECGa[1:4,1:4,1:4])
structure(c(0.266687798848186, 0.243782451327742, 0.256932437720159,
0.298861598151174, 0.198233672667731, 0.0917952258522064, 0.0911852809187542,
0.0896079263551856, 0.236398290801764, 0.0864552727199747, 0.0745517747485495,
0.141094205953345, 0.134887167694073, 0.0747942533151883, 0.0955856952160322,
0.0351423350784724, 0.0280172116375036, 0.0137183766752048, 0.00632054977574689,
0.0140727955279187, 0.0690137281047283, 0.078048395374513, 0.103558903741209,
0.0440585188615387, 0.156352265056089, 0.112594108595364, 0.162727838219577,
0.171253189308951, 0.10110879614821, 0.0815894300030362, 0.11782535820017,
0.0422632188213653, 0.0555849641766514, 0.0677027788598739, 0.0459698146330784,
0.0388415858274208, 0.0843241755529416, 0.0607574029475139, 0.0572549162201976,
0.0507991887467287, 0.0505785290171543, 0.064132492222132, 0.0527843866043094,
0.0354988312446934, 0.104654374350645, 0.0881949907935882, 0.0429712078085868,
0.0576943626267035, 0.0382280461459995, 0.124883693856915, 0.0481763535955804,
0.0397818749456581, 0.0782161984603273, 0.155594086108477, 0.121039425233015,
0.0563997196467123, 0.0513952066155024, 0.209997229543773, 0.0745673273804948,
0.0647872565452434, 0.0801540099609934, 0.147046389860838, 0.162708859129276,
0.0766361733056703), .Dim = c(4L, 4L, 4L), .Dimnames = list(NULL,
c("P7", "P4", "Cz", "Pz"), NULL))
I've tried ECGa <- as.data.frame(ECGa), which gives the right dimensions but renames all the columns (e.g. the first become P7.1, P7.2 ... P7.45). I want to make a new column called ID that gives a value of 1 for the first patient and 2 for the second, up to 45 for the forty-fifth.
NEW ADDITION: I've found that using abind does part of the job I want. But imagine I had 1000 arrays, can I automate it? e.g.
abind(ECGa[,,1],ECGa[,,2],ECGa[,,3],ECGa[,,4],ECGa[,,5],along=1)
> dim(abind(ECGa[,,1],ECGa[,,2],ECGa[,,3],ECGa[,,4],ECGa[,,5],along=1))
[1] 10000 19
Something like
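dims <- dim(dd)  # assumes dd is samples x channels x subjects, as in the dput
# put channels last so that each channel fills one column of the matrix
dd2 <- matrix(aperm(dd, c(1, 3, 2)), nrow = dims[1] * dims[3], ncol = dims[2])
dd3 <- data.frame(ID = rep(seq_len(dims[3]), each = dims[1]),
                  dd2)
colnames(dd3) <- c("ID", dimnames(dd)[[2]])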
should work, I think.
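As a quick sanity check of this kind of reshape, here is a minimal base-R sketch on a made-up array. The name `toy` and its 3 x 2 x 4 size are assumptions for illustration only, laid out as samples x channels x subjects like the dput in the question. It flattens the array with aperm() and compares the result with stacking the slices one by one.

```r
# Toy array: 3 samples x 2 channels x 4 subjects (made-up values)
toy <- array(seq_len(3 * 2 * 4), dim = c(3, 2, 4),
             dimnames = list(NULL, c("P7", "P4"), NULL))
dims <- dim(toy)

# Move channels to the last dimension, then flatten so that each
# channel becomes one column and subjects are stacked row-wise
flat <- matrix(aperm(toy, c(1, 3, 2)),
               nrow = dims[1] * dims[3], ncol = dims[2])
long <- data.frame(ID = rep(seq_len(dims[3]), each = dims[1]), flat)
colnames(long) <- c("ID", dimnames(toy)[[2]])

# Reference result: stack the subject slices one after another
ref <- do.call(rbind, lapply(seq_len(dims[3]), function(i) toy[, , i]))

# Values agree (attributes such as dimnames are ignored)
stopifnot(isTRUE(all.equal(as.matrix(long[, -1]), ref,
                           check.attributes = FALSE)))
```

If the real array is laid out differently (for example channels first), the permutation passed to aperm() and the dimension used for the column names would need to change accordingly.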
I think you can do without abind, perhaps as simple as:
Reduce(rbind, sapply(1:dim(df)[3], function(i) {
x <- data.frame(df[,,i])
x$id <- i
x
}, simplify = FALSE))
# P7 P4 Cz Pz id
# 1 0.26668780 0.19823367 0.23639829 0.13488717 1
# 2 0.24378245 0.09179523 0.08645527 0.07479425 1
# 3 0.25693244 0.09118528 0.07455177 0.09558570 1
# 4 0.29886160 0.08960793 0.14109421 0.03514234 1
# 5 0.02801721 0.06901373 0.15635227 0.10110880 2
# 6 0.01371838 0.07804840 0.11259411 0.08158943 2
# 7 0.00632055 0.10355890 0.16272784 0.11782536 2
# 8 0.01407280 0.04405852 0.17125319 0.04226322 2
# 9 0.05558496 0.08432418 0.05057853 0.10465437 3
# 10 0.06770278 0.06075740 0.06413249 0.08819499 3
# 11 0.04596981 0.05725492 0.05278439 0.04297121 3
# 12 0.03884159 0.05079919 0.03549883 0.05769436 3
# 13 0.03822805 0.07821620 0.05139521 0.08015401 4
# 14 0.12488369 0.15559409 0.20999723 0.14704639 4
# 15 0.04817635 0.12103943 0.07456733 0.16270886 4
# 16 0.03978187 0.05639972 0.06478726 0.07663617 4
If by chance your third dimension actually has names (faked with your data using dimnames(df)[[3]] <- paste("id", 1:dim(df)[3], sep = "")), then you can do:
head(
Reduce(rbind, sapply(dimnames(df)[[3]], function(nm) {
x <- data.frame(df[,,nm])
x$id <- nm
x
}, simplify = FALSE))
)
# P7 P4 Cz Pz id
# 1 0.26668780 0.19823367 0.23639829 0.13488717 id1
# 2 0.24378245 0.09179523 0.08645527 0.07479425 id1
# 3 0.25693244 0.09118528 0.07455177 0.09558570 id1
# 4 0.29886160 0.08960793 0.14109421 0.03514234 id1
# 5 0.02801721 0.06901373 0.15635227 0.10110880 id2
# 6 0.01371838 0.07804840 0.11259411 0.08158943 id2
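For many slices (the question mentions 1000 arrays), a common base-R pattern is to build the per-slice data frames with lapply() and bind them once with do.call(rbind, ...), rather than growing the result one rbind at a time. The following is only a sketch: the helper name slices_to_long is made up for illustration, and it assumes the IDs live on the third dimension and that each slice has more than one row and column.

```r
# Hypothetical helper: stack the slices of a 3D array along the third
# dimension and record which slice each row came from
slices_to_long <- function(arr) {
  ids <- dimnames(arr)[[3]]
  if (is.null(ids)) ids <- seq_len(dim(arr)[3])  # fall back to 1..n
  pieces <- lapply(seq_along(ids), function(i) {
    x <- data.frame(arr[, , i])
    x$id <- ids[i]
    x
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL  # plain 1..n row numbering
  out
}

# Made-up example: 2 rows x 2 channels x 3 subjects
arr <- array(1:12, dim = c(2, 2, 3),
             dimnames = list(NULL, c("P7", "P4"), NULL))
res <- slices_to_long(arr)
```

The same function covers both cases above: with unnamed slices the id column holds 1, 2, 3, ..., and with a named third dimension it holds those names.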