
How do I display a correlation coefficient in a scatterplot?

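In a scatterplot, I would like to display the correlation coefficient along with the equation describing the relationship between x and y. I have created my data, and this is my code so far: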

library(tidyverse)

# Creation of datamaterial

salary <- c(95, 100, 105, 110, 120, 124, 135, 150, 165, 175, 225, 230, 235, 260)
height <- c(160, 150, 182, 165, 172, 175, 183, 187, 174, 193, 201, 172, 180, 188)
fakenumbers <- data.frame(salary, height)

cor(height, salary, method = c("pearson"))

# Creation of scatterplot

r <- ggplot(fakenumbers, aes(x = height, y = salary)) + 
  geom_point(size = 3, shape = 21, color = "black", fill = "blue") + 
  labs(y = "Hourly salary 
       (sek)", x = "height (cm)", title = "Relationship between height and salary (made up data)") + 
  theme_classic() + theme(plot.title = element_text(hjust = 0.5, size = 18), 
                          axis.title = element_text(size = 15), 
                          axis.title.y = element_text(angle = 0, vjust = 0.5), 
                          axis.text = element_text(size = 11))

# Adding a regressionline

r + geom_smooth(method = lm, formula = y ~ x, se = FALSE)

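Inside the plot area, next to the regression line, I would like to display "r = 0.588" together with an equation describing the linear relationship. How can I do this, preferably with ggplot() or some other function?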

We can do this with the ggpubr package by adding stat_cor(p.accuracy = 0.001, r.accuracy = 0.01) to your code:

library(ggpubr)
library(tidyverse)

r <- ggplot(fakenumbers, aes(x = height, y = salary)) + 
  geom_point(size = 3, shape = 21, color = "black", fill = "blue") + 
  stat_cor(p.accuracy = 0.001, r.accuracy = 0.01)+
  labs(y = "Hourly salary 
       (sek)", x = "height (cm)", title = "Relationship between height and salary (made up data)") + 
  theme_classic() + theme(plot.title = element_text(hjust = 0.5, size = 18), 
                          axis.title = element_text(size = 15), 
                          axis.title.y = element_text(angle = 0, vjust = 0.5), 
                          axis.text = element_text(size = 11))
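
The stat_cor() layer shows the correlation but not the equation the question also asks for. As a minimal sketch, assuming ggpubr is installed, its stat_regline_equation() layer can add the fitted line's equation. The label positions (150, 250 and 235) and the object name p are assumptions picked for this data, and the data is rebuilt so the snippet runs on its own:

```r
library(ggplot2)
library(ggpubr)

# Data from the question, repeated so the snippet is self-contained
salary <- c(95, 100, 105, 110, 120, 124, 135, 150, 165, 175, 225, 230, 235, 260)
height <- c(160, 150, 182, 165, 172, 175, 183, 187, 174, 193, 201, 172, 180, 188)
fakenumbers <- data.frame(salary, height)

p <- ggplot(fakenumbers, aes(x = height, y = salary)) +
  geom_point(size = 3, shape = 21, color = "black", fill = "blue") +
  geom_smooth(method = lm, formula = y ~ x, se = FALSE) +
  # Correlation coefficient and p-value, placed in data coordinates
  stat_cor(label.x = 150, label.y = 250,
           p.accuracy = 0.001, r.accuracy = 0.01) +
  # Equation of the fitted line, placed just below the correlation label
  stat_regline_equation(formula = y ~ x, label.x = 150, label.y = 235) +
  theme_classic()
p
```

Both layers compute their labels from the plotted data, so the text stays in sync if the data changes.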


Here is a base R way. Define the formula fo, fit the regression, and build the equation string eq.

corr <- cor(height, salary, method = c("pearson"))

fo <- salary ~ height
fit <- lm(fo, fakenumbers)
(eq <- paste0(all.vars(fo)[1], ' ~ ', paste0(round(coef(fit), 2),
              gsub('\\*\\(Intercept\\)', '', 
                   paste0('*', names(coef(fit)))), collapse=' + ')))
# [1] "salary ~ -281.58 + 2.49*height"
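
The nested paste0()/gsub() call above handles any number of predictors but is hard to read. For the single-predictor case here, a minimal sketch with sprintf() builds a similar label. The names b and eq_label are introduced only for this illustration, and the data is rebuilt so the snippet runs on its own:

```r
# Data from the question, repeated so the snippet is self-contained
salary <- c(95, 100, 105, 110, 120, 124, 135, 150, 165, 175, 225, 230, 235, 260)
height <- c(160, 150, 182, 165, 172, 175, 183, 187, 174, 193, 201, 172, 180, 188)
fakenumbers <- data.frame(salary, height)

fit <- lm(salary ~ height, data = fakenumbers)
b <- coef(fit)  # b[1] is the intercept, b[2] is the slope for height

# Format both coefficients to two decimals and combine them into one label
eq_label <- sprintf("salary = %.2f + %.2f * height", b[1], b[2])
eq_label
```

Note that a negative slope would print as "+ -2.49"; that case would need its own sign handling.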

Then use these variables in plot(), abline(), and text().

plot(fo, fakenumbers, pch=20, col=4,
     xlab='height (cm)', ylab='Hourly salary (sek)',
     main='Relationship between height and salary (made up data)')
abline(fit, col=4)
text(149, 250, bquote(italic('r=')~.(round(corr, 3))), adj=0, cex=.8)
text(149, 235, eq, adj=0, cex=.8)



Data:

fakenumbers <- structure(list(salary = c(95, 100, 105, 110, 120, 124, 135, 150, 
165, 175, 225, 230, 235, 260), height = c(160, 150, 182, 165, 
172, 175, 183, 187, 174, 193, 201, 172, 180, 188)), class = "data.frame", row.names = c(NA, 
-14L))

Another way:

round(cor(height, salary, method = c("pearson")), 4) -> corr

Then use geom_text to display the correlation coefficient:

r +
  geom_smooth(method = lm, formula = y ~ x, se = FALSE) +
  geom_text(x = 152, y = 250,
            label = paste0('r = ', corr),
            color = 'red')
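
One caveat with the geom_text() call above: the layer inherits the plot data, so the constant label is drawn once per row (14 overlapping copies here), which can make the text look heavy. A minimal sketch using ggplot2's annotate() instead, which draws each label once and can carry the regression equation as well. The label positions and the names r_label, eq_label, and p are assumptions chosen for this data:

```r
library(ggplot2)

# Data from the question, repeated so the snippet is self-contained
salary <- c(95, 100, 105, 110, 120, 124, 135, 150, 165, 175, 225, 230, 235, 260)
height <- c(160, 150, 182, 165, 172, 175, 183, 187, 174, 193, 201, 172, 180, 188)
fakenumbers <- data.frame(salary, height)

fit <- lm(salary ~ height, data = fakenumbers)

# Build both labels as plain strings
r_label  <- sprintf("r = %.3f", cor(fakenumbers$height, fakenumbers$salary))
eq_label <- sprintf("y = %.2f + %.2f x", coef(fit)[1], coef(fit)[2])

p <- ggplot(fakenumbers, aes(x = height, y = salary)) +
  geom_point(size = 3, shape = 21, color = "black", fill = "blue") +
  geom_smooth(method = lm, formula = y ~ x, se = FALSE) +
  # annotate() draws one text element per label, independent of the data rows
  annotate("text", x = 150, y = c(250, 235), hjust = 0,
           label = c(r_label, eq_label), color = "red")
p
```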
