This is what I have tried so far. It works, but it only gives me the p-value for data that has no NAs. Much of my data has NA values, from a few entries up to a third of the data.
normal <- apply(cor_phys, 2, function(x) shapiro.test(x)$p.value)
I want to try adding na.rm to the function, but it's not working. Help?
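One likely reason na.rm does not work here is that shapiro.test() takes only the data vector and has no na.rm argument. Its documentation says missing values are allowed as long as 3 to 5000 non-missing values remain. A minimal sketch of dropping the NAs inside the function instead, assuming cor_phys is an all-numeric data frame. The toy data and the helper name safe_shapiro are hypothetical stand-ins, not part of the original post:

```r
# Toy stand-in for cor_phys (hypothetical data, not the asker's)
set.seed(1)
cor_phys <- data.frame(
  a = rnorm(30),
  b = c(rnorm(20), rep(NA, 10)),  # one third of the values missing
  c = c(1, 2, rep(NA, 28))        # too few values left to test
)

# safe_shapiro is a hypothetical helper name
safe_shapiro <- function(x) {
  x <- x[!is.na(x)]                                  # drop missing values explicitly
  if (length(x) < 3 || length(x) > 5000) return(NA_real_)  # outside shapiro.test's allowed size
  if (length(unique(x)) < 2) return(NA_real_)        # identical values make shapiro.test error
  shapiro.test(x)$p.value
}

# sapply works column by column without coercing the data frame to a matrix
normal <- sapply(cor_phys, safe_shapiro)
normal
```

Columns that cannot be tested come back as NA instead of stopping the whole call, which makes it easier to see which columns are the problem.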
library(tidyverse) # provides %>%, rownames_to_column, gather, mutate, map_chr
# calculate the correlations between all variables
corres <- cor_phys %>% # cor_phys is my data
  as.matrix %>%
  cor(use = "complete.obs") %>% # complete.obs uses only rows with no NA in any column
  as.data.frame %>%
  rownames_to_column(var = 'var1') %>%
  gather(var2, value, -var1)
# remove duplicate correlations (keep one row per unordered pair of variables)
corres <- corres %>%
  mutate(var_order = paste(var1, var2) %>%
           strsplit(split = ' ') %>%
           map_chr( ~ sort(.x) %>%
                      paste(collapse = ' '))) %>%
  mutate(cnt = 1) %>%
  group_by(var_order) %>%
  mutate(cumsum = cumsum(cnt)) %>%
  filter(cumsum != 2) %>%
  ungroup %>%
  select(-var_order, -cnt, -cumsum) # removes unneeded columns
I did not write this myself, but it is the answer that I used, and it worked for my needs. It comes from the page: How to compute correlations between all columns in R and detect highly correlated variables
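Since the data has many NAs, it may help to see what use = "complete.obs" in the code above actually does: it drops every row that has an NA in any column before computing the correlations. The alternative "pairwise.complete.obs" keeps, for each pair of columns, every row where both are present. A small sketch on hypothetical toy data (df and the other names here are made up for illustration):

```r
# Hypothetical toy data with NAs in different rows of each column
df <- data.frame(
  x = c(1, 2, 3, 4, 5, NA),
  y = c(2, 4, 5, 4, NA, 6),
  z = c(NA, 1, 3, 2, 5, 4)
)

# Listwise deletion: only rows complete in all three columns are used
cor_listwise <- cor(df, use = "complete.obs")

# Pairwise deletion: each pair uses every row where both columns are present
cor_pairwise <- cor(df, use = "pairwise.complete.obs")

# Number of rows available under each scheme
n_listwise <- sum(complete.cases(df))            # rows complete in x, y and z
n_xy <- sum(complete.cases(df[, c("x", "y")]))   # rows complete in x and y only
```

With NAs scattered across columns, listwise deletion can leave very few rows, so which option is appropriate depends on how much data survives and on whether correlations based on different subsets of rows are acceptable.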