How do I display a correlation coefficient in a scatterplot?

In a scatterplot, I would like to display both the correlation coefficient and an equation describing the relationship between x and y. I have created my data; here is my code so far:

library(tidyverse)

# Creation of datamaterial

salary <- c(95, 100, 105, 110, 120, 124, 135, 150, 165, 175, 225, 230, 235, 260)
height <- c(160, 150, 182, 165, 172, 175, 183, 187, 174, 193, 201, 172, 180, 188)
fakenumbers <- data.frame(salary, height)

cor(height, salary, method = c("pearson"))

# Creation of scatterplot

r <- ggplot(fakenumbers, aes(x = height, y = salary)) + 
  geom_point(size = 3, shape = 21, color = "black", fill = "blue") + 
  labs(y = "Hourly salary 
       (sek)", x = "height (cm)", title = "Relationship between height and salary (made up data)") + 
  theme_classic() + theme(plot.title = element_text(hjust = 0.5, size = 18), 
                          axis.title = element_text(size = 15), 
                          axis.title.y = element_text(angle = 0, vjust = 0.5), 
                          axis.text = element_text(size = 11))

# Adding a regressionline

r + geom_smooth(method = lm, formula = y ~ x, se = FALSE)

Inside the coordinate system, next to the regression line, I would like "r = 0.588" displayed along with an equation describing the linear relationship. How can I accomplish this, preferably with ggplot() or, failing that, some other function?

We can do this with the ggpubr package by adding stat_cor(p.accuracy = 0.001, r.accuracy = 0.01) to your code:

library(ggpubr)
library(tidyverse)

r <- ggplot(fakenumbers, aes(x = height, y = salary)) + 
  geom_point(size = 3, shape = 21, color = "black", fill = "blue") + 
  stat_cor(p.accuracy = 0.001, r.accuracy = 0.01)+
  labs(y = "Hourly salary 
       (sek)", x = "height (cm)", title = "Relationship between height and salary (made up data)") + 
  theme_classic() + theme(plot.title = element_text(hjust = 0.5, size = 18), 
                          axis.title = element_text(size = 15), 
                          axis.title.y = element_text(angle = 0, vjust = 0.5), 
                          axis.text = element_text(size = 11))
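
The question also asks for the fitted equation, which stat_cor() alone does not print. As a minimal sketch, ggpubr's stat_regline_equation() can be stacked under the correlation label. The label positions (150, 250) and (150, 235) are assumptions chosen to fit this data, and the theme settings from the original plot are left out for brevity:

```r
library(ggplot2)
library(ggpubr)

fakenumbers <- data.frame(
  salary = c(95, 100, 105, 110, 120, 124, 135, 150, 165, 175, 225, 230, 235, 260),
  height = c(160, 150, 182, 165, 172, 175, 183, 187, 174, 193, 201, 172, 180, 188)
)

p <- ggplot(fakenumbers, aes(x = height, y = salary)) +
  geom_point(size = 3, shape = 21, color = "black", fill = "blue") +
  geom_smooth(method = lm, formula = y ~ x, se = FALSE) +
  # Correlation label at data coordinates (150, 250); position is an assumption
  stat_cor(method = "pearson", label.x = 150, label.y = 250,
           p.accuracy = 0.001, r.accuracy = 0.01) +
  # Fitted equation a little lower, so the two labels do not overlap
  stat_regline_equation(formula = y ~ x, label.x = 150, label.y = 235) +
  theme_classic()

print(p)
```

Both stats compute their labels from the plotted data, so they should stay in sync if the data changes.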

Here is a base R way. Define a formula fo, fit the regression, and build an equation string eq.

corr <- with(fakenumbers, cor(height, salary, method = "pearson"))

fo <- salary ~ height
fit <- lm(fo, fakenumbers)
(eq <- paste0(all.vars(fo)[1], ' ~ ', paste0(round(coef(fit), 2),
              gsub('\\*\\(Intercept\\)', '', 
                   paste0('*', names(coef(fit)))), collapse=' + ')))
# [1] "salary ~ -281.58 + 2.49*height"
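
If the regex-based string building feels opaque, the same label can be assembled more explicitly. The following is only a sketch: lm_eq() is a hypothetical helper (not part of base R or of the answer above), and it assumes a simple regression with exactly one predictor:

```r
fakenumbers <- data.frame(
  salary = c(95, 100, 105, 110, 120, 124, 135, 150, 165, 175, 225, 230, 235, 260),
  height = c(160, 150, 182, 165, 172, 175, 183, 187, 174, 193, 201, 172, 180, 188)
)
fit <- lm(salary ~ height, data = fakenumbers)

# Hypothetical helper: format a one-predictor linear fit as
# "response = intercept + slope * predictor", with the sign of the
# slope pulled out so a negative slope prints as "- 2.49", not "+ -2.49".
lm_eq <- function(fit, digits = 2) {
  b    <- unname(coef(fit))          # c(intercept, slope)
  vars <- all.vars(formula(fit))     # c(response, predictor)
  num  <- function(x) formatC(x, format = "f", digits = digits)
  paste(vars[1], "=", num(b[1]),
        if (b[2] < 0) "-" else "+",
        num(abs(b[2])), "*", vars[2])
}

eq <- lm_eq(fit)
eq
# [1] "salary = -281.58 + 2.49 * height"
```

The resulting string can be passed to text() in the same way as eq in the answer.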

Then use these variables in plot(), abline(), and text().

plot(fo, fakenumbers, pch=20, col=4,
     xlab='height (cm)', ylab='Hourly salary (sek)',
     main='Relationship between height and salary (made up data)')
abline(fit, col=4)
text(149, 250, bquote(italic('r=')~.(round(corr, 3))), adj=0, cex=.8)
text(149, 235, eq, adj=0, cex=.8)
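
The text() calls above hard-code data coordinates (149, 250), which have to be adjusted whenever the data range changes. One possible alternative, sketched here under the assumption that a corner of the plot is an acceptable place for the labels, is legend() with a keyword position and no box:

```r
fakenumbers <- data.frame(
  salary = c(95, 100, 105, 110, 120, 124, 135, 150, 165, 175, 225, 230, 235, 260),
  height = c(160, 150, 182, 165, 172, 175, 183, 187, 174, 193, 201, 172, 180, 188)
)
fit  <- lm(salary ~ height, data = fakenumbers)
corr <- with(fakenumbers, cor(height, salary))
b    <- round(coef(fit), 2)          # c(intercept, slope)

plot(salary ~ height, data = fakenumbers, pch = 20, col = 4,
     xlab = "height (cm)", ylab = "Hourly salary (sek)")
abline(fit, col = 4)

# legend() anchors the labels to a corner of the plot region,
# so no x/y data coordinates are needed; bty = "n" drops the box.
labels <- c(paste("r =", round(corr, 3)),
            paste("salary =", b[1], "+", b[2], "* height"))
legend("topleft", legend = labels, bty = "n", cex = 0.8)
```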


Data:

fakenumbers <- structure(list(salary = c(95, 100, 105, 110, 120, 124, 135, 150, 
165, 175, 225, 230, 235, 260), height = c(160, 150, 182, 165, 
172, 175, 183, 187, 174, 193, 201, 172, 180, 188)), class = "data.frame", row.names = c(NA, 
-14L))

Another way:

corr <- round(cor(height, salary, method = "pearson"), 4)

and then use geom_text() to display the correlation coefficient:

r +
  geom_smooth(method = lm, formula = y ~ x, se = FALSE) +
  geom_text(x = 152, y = 250,
            label = paste0('r = ', corr),
            color = 'red')
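
One caveat with geom_text() and a constant label: the layer inherits the plot data, so the same text is drawn once per row on top of itself, which can look heavier than intended. A sketch of the same idea with annotate(), which draws the label once, and with the fitted equation added on a second line (the position and the simplified plot styling are assumptions):

```r
library(ggplot2)

fakenumbers <- data.frame(
  salary = c(95, 100, 105, 110, 120, 124, 135, 150, 165, 175, 225, 230, 235, 260),
  height = c(160, 150, 182, 165, 172, 175, 183, 187, 174, 193, 201, 172, 180, 188)
)
fit  <- lm(salary ~ height, data = fakenumbers)
corr <- with(fakenumbers, cor(height, salary))
b    <- round(coef(fit), 2)          # c(intercept, slope)

# One label holding both the coefficient and the fitted line
lab <- paste0("r = ", round(corr, 3), "\n",
              "salary = ", b[1], " + ", b[2], " * height")

p <- ggplot(fakenumbers, aes(x = height, y = salary)) +
  geom_point() +
  geom_smooth(method = lm, formula = y ~ x, se = FALSE) +
  # annotate() draws the label a single time; hjust = 0 left-aligns it at x = 150
  annotate("text", x = 150, y = 250, label = lab, hjust = 0, color = "red")

print(p)
```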
