Wide to long format in R list: 3D to 2D array with 3rd dimension as ID

I have imported a *.mat data set of ECG data, and it turns out to be an array nested in a list, with dimensions (1:19, 1:2000, 1:45).

I'd like to convert this array into a data.table in long format, where each index 1:45 of the third dimension becomes an 'id'. I like the look of reshape2 and tidyr, but I don't see an easy way of doing it when a 'list' is involved. Any thoughts?

ADDED: E.g. as in the following picture: [enter image description here]

EDIT: Added dput from ECGa

    dput(ECGa[1:4,1:4,1:4])
structure(c(0.266687798848186, 0.243782451327742, 0.256932437720159, 
0.298861598151174, 0.198233672667731, 0.0917952258522064, 0.0911852809187542, 
0.0896079263551856, 0.236398290801764, 0.0864552727199747, 0.0745517747485495, 
0.141094205953345, 0.134887167694073, 0.0747942533151883, 0.0955856952160322, 
0.0351423350784724, 0.0280172116375036, 0.0137183766752048, 0.00632054977574689, 
0.0140727955279187, 0.0690137281047283, 0.078048395374513, 0.103558903741209, 
0.0440585188615387, 0.156352265056089, 0.112594108595364, 0.162727838219577, 
0.171253189308951, 0.10110879614821, 0.0815894300030362, 0.11782535820017, 
0.0422632188213653, 0.0555849641766514, 0.0677027788598739, 0.0459698146330784, 
0.0388415858274208, 0.0843241755529416, 0.0607574029475139, 0.0572549162201976, 
0.0507991887467287, 0.0505785290171543, 0.064132492222132, 0.0527843866043094, 
0.0354988312446934, 0.104654374350645, 0.0881949907935882, 0.0429712078085868, 
0.0576943626267035, 0.0382280461459995, 0.124883693856915, 0.0481763535955804, 
0.0397818749456581, 0.0782161984603273, 0.155594086108477, 0.121039425233015, 
0.0563997196467123, 0.0513952066155024, 0.209997229543773, 0.0745673273804948, 
0.0647872565452434, 0.0801540099609934, 0.147046389860838, 0.162708859129276, 
0.0766361733056703), .Dim = c(4L, 4L, 4L), .Dimnames = list(NULL, 
    c("P7", "P4", "Cz", "Pz"), NULL))

I've tried ECGa <- as.data.frame(ECGa), which gives the right dimensions but renames all the columns (e.g. the first becomes P7.1, P7.2 ... P7.45). I want to make a new column called ID that has a value of 1 for the first patient, 2 for the second, and so on up to 45 for the forty-fifth.

NEW ADDITION: I've found that using abind does part of the job I want. But imagine I had 1000 arrays; can I automate it? E.g.

 abind(ECGa[,,1],ECGa[,,2],ECGa[,,3],ECGa[,,4],ECGa[,,5],along=1)
> dim(abind(ECGa[,,1],ECGa[,,2],ECGa[,,3],ECGa[,,4],ECGa[,,5],along=1))
[1] 10000    19

Something like

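# dd is assumed to be samples x channels x subjects, as in the dput output
dims <- dim(dd)
# move the subject dimension next to the sample dimension before flattening
dd2 <- matrix(aperm(dd, c(1, 3, 2)), nrow = prod(dims[c(1, 3)]), ncol = dims[2])
dd3 <- data.frame(ID = rep(seq_len(dims[3]), each = dims[1]),
                  dd2)
colnames(dd3) <- c("ID", dimnames(dd)[[2]])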

should work, I think.
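
A minimal sketch of the same reshaping idea on a toy array, assuming the layout shown in the dput output (samples in the first dimension, channels in the second, subjects in the third). The names arr, flat, long_df and stacked are made up for illustration. aperm() is used to bring the subject dimension next to the sample dimension before flattening, and the result is checked against stacking the slices one by one:

```r
# Toy array: 3 samples x 2 channels x 4 subjects (hypothetical data)
arr <- array(seq_len(3 * 2 * 4), dim = c(3, 2, 4),
             dimnames = list(NULL, c("P7", "P4"), NULL))
d <- dim(arr)

# Reorder to samples x subjects x channels, then flatten the first two
# dimensions into rows so that each channel becomes one column
flat <- matrix(aperm(arr, c(1, 3, 2)), nrow = d[1] * d[3], ncol = d[2],
               dimnames = list(NULL, dimnames(arr)[[2]]))
long_df <- data.frame(ID = rep(seq_len(d[3]), each = d[1]), flat)

# Sanity check: same values as binding the slices row-wise, one by one
stacked <- do.call(rbind, lapply(seq_len(d[3]), function(i) arr[, , i]))
stopifnot(all(flat == stacked))
```

Without the aperm() step, matrix() would simply refill the values in storage order, which mixes channels and subjects within a column.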

I think you can do without abind; perhaps it is as simple as:

Reduce(rbind, sapply(1:dim(df)[3], function(i) {
  x <- data.frame(df[,,i])
  x$id <- i
  x
}, simplify = FALSE))
#            P7         P4         Cz         Pz id
# 1  0.26668780 0.19823367 0.23639829 0.13488717  1
# 2  0.24378245 0.09179523 0.08645527 0.07479425  1
# 3  0.25693244 0.09118528 0.07455177 0.09558570  1
# 4  0.29886160 0.08960793 0.14109421 0.03514234  1
# 5  0.02801721 0.06901373 0.15635227 0.10110880  2
# 6  0.01371838 0.07804840 0.11259411 0.08158943  2
# 7  0.00632055 0.10355890 0.16272784 0.11782536  2
# 8  0.01407280 0.04405852 0.17125319 0.04226322  2
# 9  0.05558496 0.08432418 0.05057853 0.10465437  3
# 10 0.06770278 0.06075740 0.06413249 0.08819499  3
# 11 0.04596981 0.05725492 0.05278439 0.04297121  3
# 12 0.03884159 0.05079919 0.03549883 0.05769436  3
# 13 0.03822805 0.07821620 0.05139521 0.08015401  4
# 14 0.12488369 0.15559409 0.20999723 0.14704639  4
# 15 0.04817635 0.12103943 0.07456733 0.16270886  4
# 16 0.03978187 0.05639972 0.06478726 0.07663617  4
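
For a large number of slices, a variant worth considering is to build the list with lapply() and bind it once with do.call(rbind, ...), since Reduce(rbind, ...) calls rbind repeatedly on a growing data frame. The following is only a sketch: stack_slices is a hypothetical helper name, the toy array is made up, and it assumes each slice has more than one row and more than one column so that arr[, , i] stays a matrix:

```r
# Hypothetical helper: stack the slices of a 3D array along its third
# dimension and record the slice index in an "id" column
stack_slices <- function(arr) {
  slices <- lapply(seq_len(dim(arr)[3]), function(i) {
    x <- data.frame(arr[, , i])  # one slice: rows x channels
    x$id <- i
    x
  })
  do.call(rbind, slices)         # single bind over all slices
}

# Toy array: 3 samples x 2 channels x 4 subjects (hypothetical data)
arr <- array(seq_len(3 * 2 * 4), dim = c(3, 2, 4),
             dimnames = list(NULL, c("P7", "P4"), NULL))
out <- stack_slices(arr)
```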

If by chance your third dimension actually has names (faked with your data using dimnames(df)[[3]] <- paste("id", 1:dim(df)[3], sep = "") ), then you can do:

head(
  Reduce(rbind, sapply(dimnames(df)[[3]], function(nm) {
    x <- data.frame(df[,,nm])
    x$id <- nm
    x
  }, simplify = FALSE))
)
#            P7         P4         Cz         Pz  id
# 1  0.26668780 0.19823367 0.23639829 0.13488717 id1
# 2  0.24378245 0.09179523 0.08645527 0.07479425 id1
# 3  0.25693244 0.09118528 0.07455177 0.09558570 id1
# 4  0.29886160 0.08960793 0.14109421 0.03514234 id1
# 5  0.02801721 0.06901373 0.15635227 0.10110880 id2
# 6  0.01371838 0.07804840 0.11259411 0.08158943 id2
