
ff: returning multiple arrays with a single ffapply function call

I am dealing with a large dataset of 3D imaging data that I have loaded into R using ff().

require(ff)

nSubj <- 125
vol_dim <- c(139,137,87)
ff_qmap <- ff(0, dim=c(vol_dim,nSubj))

Simple calls like getting an average array/"volume" back work fine:

mean_qmap_vol <- ffapply(X=ff_qmap,MARGIN=c(1,2,3),AFUN=mean,RETURN=TRUE)

However, in some instances I would like to return more than one array/"volume" back in a single ffapply call, for instance when performing some basic regression, e.g. against age:

pval_vol <- ffapply(AFUN = f <- function(x) {
  df$voxel <- x
  fe1 <- lm(formula = voxel ~ age, df)
  summary_fe1 <- summary(fe1)
  fe1_estimate <- summary_fe1$coefficients[2, 1]  # slope for age
  fe1_pval <- summary_fe1$coefficients[2, 4]      # p-value for age
  return(fe1_pval)
}, X = ff_qmap, MARGIN = c(1, 2, 3), RETURN = TRUE)

This works for returning a single volume, i.e. fe1_pval.

Is there a way to return both the fe1_estimate and fe1_pval (and perhaps more estimates) in one ffapply call?

> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.2 LTS   
...
other attached packages:
[1] ff_2.2-13        bit_1.1-12       lme4_1.1-17      Matrix_1.2-8     ggplot2_2.2.1    fslr_2.12        neurobase_1.13.2
[8] oro.nifti_0.9.1 

I tried a number of solutions including returning a combined vector with c() and alternatively a list. However, I could not find a solution involving the ffapply routine that would work. Some key references I looked at are here:

I found a stop-gap solution taking a classic for loop approach and cycling through the 3D dataset. Because the size of my array is not prohibitively large in this case, it works. I would ultimately prefer a solution using ffapply(), so that it extends to higher-resolution and larger datasets and has the potential for parallelization. Open to suggestions!

The coef() stats function turned out to be a great way of extracting all the model coefficients in a standard way.
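As a small illustration of what coef() returns for a model summary, here is a sketch on made-up data. The names toy_df, fit, coefs and age_stats, and the simulated values, are assumptions for the example only and are not part of the imaging dataset.

```r
# Made-up data standing in for one voxel's values across subjects.
set.seed(1)
toy_df <- data.frame(age = c(21, 34, 45, 52, 60, 68, 73, 80))
toy_df$voxel <- 0.5 + 0.01 * toy_df$age + rnorm(nrow(toy_df), sd = 0.05)

fit <- lm(voxel ~ age, data = toy_df)

# coef() on the summary gives a matrix with one row per model term and the
# columns "Estimate", "Std. Error", "t value" and "Pr(>|t|)".
coefs <- coef(summary(fit))

# Indexing by name instead of [2, 1] and [2, 4] keeps working if terms are added.
age_stats <- coefs["age", c("Estimate", "Pr(>|t|)")]
print(age_stats)
```

Indexing by row and column name is a design choice here: it avoids silently picking the wrong coefficient when the formula gains further covariates.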

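# One list slot per voxel; each slot will hold that voxel's coefficient table.
testlist <- vector(mode="list", length=prod(vol_dim))
i <- 1
for (x in seq_len(vol_dim[1])) {
  for (y in seq_len(vol_dim[2])) {
    for (z in seq_len(vol_dim[3])) {
      # ff_logjac is a 4D ff array indexed as [x, y, z, subject], like ff_qmap above
      df$voxel <- ff_logjac[x,y,z,]
      fe1 <- lm(formula = voxel ~ age, df)
      # keep the full table: estimate, std. error, t value and p-value per term
      testlist[[i]] <- coef(summary(fe1))
      i <- i + 1
    }
  }
}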
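One possible variation on the loop above, sketched here under assumptions, is to read one z-slice at a time and write each statistic into its own preallocated output volume, so that several volumes come out of a single pass. The sketch uses a small ordinary array (vols) so that it runs without ff; all names (dims_demo, age_demo, vols, est_vol, pval_vol, fit_voxel) are hypothetical. With real data the input would be the 4D ff array, and the outputs could presumably be created with ff(NA_real_, dim = vol_dim), which has not been tested here.

```r
# Small stand-in data: a 4 x 3 x 2 volume for each of 10 subjects.
set.seed(42)
dims_demo <- c(4, 3, 2)
n_demo <- 10
age_demo <- seq(20, 65, length.out = n_demo)
vols <- array(rnorm(prod(dims_demo) * n_demo), dim = c(dims_demo, n_demo))

# One preallocated output volume per statistic.
est_vol  <- array(NA_real_, dim = dims_demo)
pval_vol <- array(NA_real_, dim = dims_demo)

# Fit one voxel and return both statistics for the age term as a named vector.
fit_voxel <- function(v) {
  coef(summary(lm(v ~ age_demo)))["age_demo", c("Estimate", "Pr(>|t|)")]
}

for (z in seq_len(dims_demo[3])) {
  slab <- vols[, , z, ]                    # one slice: x by y by subject
  res  <- apply(slab, c(1, 2), fit_voxel)  # 2 by x by y
  est_vol[, , z]  <- res[1, , ]
  pval_vol[, , z] <- res[2, , ]
}
```

Only one slice of the input is held in memory at a time, which is the property that matters for larger datasets; the slices are also independent of each other, so they are natural units for parallel processing.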

This is how the list() of lm coefficients is accessed afterwards:

> length(testlist)
[1] 1656741
> vol_dim[1]*vol_dim[2]*vol_dim[3]
[1] 1656741
> testlist[[1]]
                Estimate  Std. Error    t value  Pr(>|t|)
(Intercept)  0.061286603 0.168853045  0.3629582 0.7191810
age         -0.002272307 0.003510186 -0.6473466 0.5223308
> testlist[[1656741]]
                Estimate  Std. Error   t value   Pr(>|t|)
(Intercept) -0.444810783 0.192135240 -2.315092 0.02763245
age          0.007246639 0.003994186  1.814297 0.07964480
> testlist[[1]][1,1]
[1] 0.0612866
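To turn such a list back into one 3D volume per statistic, something along these lines might work. This is a sketch on a tiny made-up list (demo_list) filled in the same loop order as testlist; list_to_volume and the other names are hypothetical helpers, not part of ff. The point to watch is ordering: the loop runs z fastest, whereas R fills arrays with the first index fastest, so the axes need to be reversed.

```r
# Build a stand-in list in the same x-outer, z-inner order as the loop above.
# Each "Estimate" encodes its own (x, y, z) position so the result can be checked.
dims_demo <- c(2, 3, 4)
demo_list <- vector("list", prod(dims_demo))
i <- 1
for (x in seq_len(dims_demo[1])) {
  for (y in seq_len(dims_demo[2])) {
    for (z in seq_len(dims_demo[3])) {
      m <- matrix(0, nrow = 2, ncol = 4,
                  dimnames = list(c("(Intercept)", "age"),
                                  c("Estimate", "Std. Error", "t value", "Pr(>|t|)")))
      m["age", "Estimate"] <- 100 * x + 10 * y + z
      demo_list[[i]] <- m
      i <- i + 1
    }
  }
}

# Pull one cell out of every coefficient table and reshape it into a volume.
list_to_volume <- function(lst, dims, row, col) {
  vals <- vapply(lst, function(m) m[row, col], numeric(1))
  # vals runs z fastest: fill a z-by-y-by-x array, then reverse the axes.
  aperm(array(vals, dim = rev(dims)), c(3, 2, 1))
}

est_vol <- list_to_volume(demo_list, dims_demo, "age", "Estimate")
```

The same call with "Pr(>|t|)" as the column would give the p-value volume.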
