How to calculate correlations between certain columns in a list of dataframes?

I need to generate 20 different samples (n=100) from the standard normal distribution (X0, X1, X2, ..., X19) and calculate the correlation between X0 and all the other samples X1...X19. I know how to do this for one "whole sample" (X0...X19), but I need to do it for several samples of X0...X19 at once. I tried generating a list of dataframes (each dataframe containing one sample of X0...X19) and iterating through it, but it failed for some reason.

My data looks like this:

dataframes <- replicate(10, as.data.frame(replicate(20, rnorm(100))))

head(dataframes)

#   [,1]        [,2]        [,3]        [,4]        [,5]        [,6]       
#V1 Numeric,100 Numeric,100 Numeric,100 Numeric,100 Numeric,100 Numeric,100
#V2 Numeric,100 Numeric,100 Numeric,100 Numeric,100 Numeric,100 Numeric,100
#V3 Numeric,100 Numeric,100 Numeric,100 Numeric,100 Numeric,100 Numeric,100
#V4 Numeric,100 Numeric,100 Numeric,100 Numeric,100 Numeric,100 Numeric,100
#V5 Numeric,100 Numeric,100 Numeric,100 Numeric,100 Numeric,100 Numeric,100
#V6 Numeric,100 Numeric,100 Numeric,100 Numeric,100 Numeric,100 Numeric,100
#   [,7]        [,8]        [,9]        [,10]      
#V1 Numeric,100 Numeric,100 Numeric,100 Numeric,100
#V2 Numeric,100 Numeric,100 Numeric,100 Numeric,100
#V3 Numeric,100 Numeric,100 Numeric,100 Numeric,100
#V4 Numeric,100 Numeric,100 Numeric,100 Numeric,100
#V5 Numeric,100 Numeric,100 Numeric,100 Numeric,100
#V6 Numeric,100 Numeric,100 Numeric,100 Numeric,100

I tried calculating the correlations like this:

lapply(frames,
       function(x){
                   cor(x[,1]$V1, x[-c(1:1)])
                   return(x)
                   }
                    ) 

But this resulted in an error:

Error in x[, 1]: incorrect number of dimensions

I'm not very familiar with lapply or loops in general, so I could really use some help.

Your reproducible example is not reproducible. One problem is that your data is neither a data.frame nor a list, because replicate() simplifies its result to a matrix by default:

class(dataframes) 
[1] "matrix" "array"
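
To make the cause of the error concrete, here is a small sketch of what lapply actually receives. The name m and the seed are only illustrative and not part of the original post. As far as I can tell, the simplified result is a 20 x 10 matrix whose cells are numeric vectors, so each x handed to the function is a plain vector and x[, 1] has no second dimension to index:

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable

# Same call as in the question: the default simplification yields a matrix
m <- replicate(10, as.data.frame(replicate(20, rnorm(100))))

class(m)   # "matrix" "array" (R >= 4.0)
typeof(m)  # "list": every cell holds one numeric vector
dim(m)     # 20 rows (V1..V20) by 10 columns (one per replicate)

# lapply walks over the individual cells, so the first x is this vector
x <- m[[1]]
is.numeric(x)
length(x)

# Indexing a plain vector with two subscripts reproduces the error message
msg <- tryCatch(x[, 1], error = function(e) conditionMessage(e))
msg
```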

In addition, there are a few easy mistakes: the list is called dataframes but the call uses frames, the function returns x (x is the input here, not the result), and x is subset twice. Fixing these minor mistakes fixes your problem:

dataframes <- replicate(10,
                        as.data.frame(replicate(20, rnorm(100))),
                        simplify = FALSE) # <=== fix: keep the result as a list
lapply(dataframes, # <=== name corrected (was `frames`)
       function(x){
         cor(x$V1, x[-1]) # no need to subset x before `$V1`
         # return(x) # <== removed: returning x would discard the correlations
       }
       )
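
If a single table is easier to work with than a list of 1 x 19 matrices, one possible variation (not part of the original answer) is to collect the results with vapply. The names cors and the seed are assumptions made for this sketch; drop() is used only to turn each one-row matrix returned by cor into a plain named vector:

```r
set.seed(42)  # arbitrary seed, only so the sketch is repeatable

# 10 data frames, each with 20 standard-normal columns V1..V20 of length 100
dataframes <- replicate(10,
                        as.data.frame(replicate(20, rnorm(100))),
                        simplify = FALSE)

# For each data frame, correlate V1 with the remaining 19 columns.
# numeric(19) tells vapply to expect exactly 19 numbers per data frame.
cors <- vapply(dataframes,
               function(x) drop(cor(x$V1, x[-1])),
               numeric(19))

dim(cors)       # 19 rows (V2..V20) by 10 columns (one per data frame)
cors[1:3, 1:2]  # a small corner of the result
```

Each column of cors then holds the correlations for one data frame, so cors[, 3], for example, gives the 19 correlations of V1 with V2...V20 in the third sample.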

