
Error in cor.test.default(x = mat[, i], y = mat[, j], …) : not enough finite observations

I have looked on Google and on StackOverflow to find a solution to my problem. I have tried a few things now, and nothing seems to be working.

I am trying to create a correlation boxplot of linguistic features. For each feature (36 in total), there is a 1 in Excel for when a speaker used it, and a 0 for when a speaker did not.

There are 41 speakers, none of whom used all 36 features, though the lowest score is 8. I want to analyse my data to see which features correlate, and therefore find out which features predict the use of other features.

I have been using corrplot in R. Here is the command I have been using:

cor_mat <- df_analysis %>%
    replace(., is.na(.), 0) %>%
    cor(method = "spearman")

cor_residuals <- cor.mtest(cor_mat, conf.level = .95)

But I get an error saying:

Error in cor.test.default(x = mat[, i], y = mat[, j], ...): not enough finite observations

Does anybody know why and how I can rectify it? In fact, all I really need to know is what the problem is, and I can probably figure it out on my own from there. Though I would be hugely grateful if you also have the solution!

Many thanks!

The corrplot itself can then be defined like this:

df_cor <- cor(df_analysis)
corrplot(df_cor, type = "full", order = "hclust",
         outline.color = "white", hc.method = "ward",
         pch.cex = .5, show.diag = TRUE,
         p.mat = cor_residuals$p, insig = "blank", sig.level = .01,
         addrect = 20, tl.srt = 36, tl.cex = .8, tl.col = "black",
         col = rev(lacroix_palette("PassionFruit", 8, "continuous")))

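You have several columns in your data set that have no variation (their standard deviation is zero), so every correlation involving those variables is NA. Those all-NA columns in cor_mat are what cor.mtest then fails on with "not enough finite observations".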

which(apply(df_analysis,2,sd)==0)
## [1] a' c[h]lach bheag [3] a' c[h]loich bhig [14] a' b[h]ord bheag 
##                     1                     3                    14 
##       [26] nan su[ ]l       [27] nan sul[ ] 
##                    26                    27 
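One way to act on this is to drop the constant columns before computing the correlations. The following is only a minimal sketch on made-up data: toy, feat_a, feat_b and feat_c are hypothetical stand-ins for df_analysis and its feature columns, and it assumes NA values have already been replaced by 0 as in the question.

```r
# Hypothetical stand-in for df_analysis: 41 speakers, 0/1 per feature.
toy <- data.frame(
  feat_a = rep(c(1, 0), length.out = 41),
  feat_b = rep(c(1, 1, 0), length.out = 41),
  feat_c = rep(0, 41)  # no variation: every speaker scored 0
)

# Standard deviation per column; 0 means the feature never varies.
col_sd <- sapply(toy, sd)
constant_cols <- names(col_sd)[col_sd == 0]

# Keep only the features that vary, then correlate as in the question.
toy_var <- toy[, col_sd > 0, drop = FALSE]
toy_cor_mat <- cor(toy_var, method = "spearman")
```

With the constant features gone, the correlation matrix has no NA entries. As far as I can tell from the corrplot documentation, cor.mtest expects the data matrix (here toy_var) as its first argument rather than the correlation matrix, so that is probably worth checking too.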

I figured this out by setting options(error=recover) and re-running the code to see where the error occurred (this setting drops you into browser/debug mode when an error occurs). More directly, I should have done corrplot(cor_mat), which helpfully puts question marks for NA values.
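A quicker check that needs neither the debugger nor a plot is to look for columns of the correlation matrix that are entirely NA. This is a rough sketch on invented data (toy and the feat_* names are hypothetical); it relies on cor() returning NA, including on the diagonal, for a column with zero standard deviation.

```r
# Toy data: feat_c is constant, so its correlations are undefined.
toy <- data.frame(
  feat_a = c(1, 0, 1, 0, 1, 1),
  feat_b = c(0, 0, 1, 1, 1, 0),
  feat_c = c(0, 0, 0, 0, 0, 0)
)

# cor() warns that the standard deviation is zero; silenced here for brevity.
toy_cor <- suppressWarnings(cor(toy, method = "spearman"))

# A column of the correlation matrix that is entirely NA marks a feature
# with no variation in the raw data.
all_na <- apply(toy_cor, 2, function(col) all(is.na(col)))
bad_features <- names(all_na)[all_na]
print(bad_features)
```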


image(), or heatmap(as.matrix(df_analysis), Rowv=NA, Colv=NA, scale="none", margins=c(10,8)), would be good for looking at your raw data.
