
How to add a trend line to a log-log plot (ggplot2)?

I need to plot a data vector that follows a power-law distribution, so on log-log axes the points should fall on a straight line. However, I do not know how to add the trend line without explicitly providing a "y" aesthetic. This is my code:

library("poweRlaw")
library("ggplot2")

xmin = 1; alpha = 1.5
con_rns = rplcon(1000, xmin, alpha)
#convert to data.frame format for ggplot2
df <- data.frame(con_rns =con_rns[con_rns<1000])

#make plot with both axes log scale
ggplot(data = df, aes(x = con_rns))+
  geom_point(stat = 'bin', binwidth = 0.1)+
  geom_smooth(stat = 'bin',mapping = aes(x=con_rns),method = "lm",se=FALSE)+
  scale_x_log10() + 
  scale_y_log10()

The result is this:

[image]

But I want this:

[image]

I know I can bin the data manually, provide "y" explicitly, and then plot the line, like this:

ggplot(data = data.frame(a = rnorm(50,0,1),b=5+rnorm(50,2,1)),mapping = aes(x = a,y=b))+
  geom_point()+
  geom_smooth(method = "lm",se=FALSE)

Result:

[image]

But I want to know how to plot the trend line with this code ( geom_point(stat = 'bin', binwidth = 0.1) ), which computes the bins implicitly.

PS: Thanks for Chris's answer. I still have a problem: how can I draw this for different groups? The data are df <- data.frame(con_rns =con_rns[con_rns<1000],col=sample(1:3,size = length(con_rns[con_rns<1000]),replace = T)) . How can I plot the points and the trend line of each group in a different color on log-log axes? Like this:

One way would be to recover the binned data from the plot using ggplot_build().

First I made the plot without the line of best fit:

p <- ggplot(data = df, aes(x = con_rns))+
  geom_point(stat = 'bin', binwidth = 0.1)+
  scale_x_log10() + 
  scale_y_log10() 

Then I added a trend line fitted to the binned data, which can be found in ggplot_build(p)$data. The values there are on the log10 scale, so I reversed the transformation with 10^x and 10^y:

p + geom_smooth(data = ggplot_build(p)$data[[1]], 
              mapping = aes(x=10^x, y= 10^y),method = "lm",se=FALSE)

[image]
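To see why the 10^x and 10^y back-transformation is needed, it can help to look at the built data directly. The following is a minimal sketch, not part of the original answer: it assumes ggplot2 is installed, replaces rplcon() with a base-R inverse-transform sampler (an assumption made only so the example does not depend on poweRlaw), and relies on the stat_bin output columns x, y and count.

```r
library(ggplot2)

set.seed(1)
xmin <- 1; alpha <- 1.5
# Assumed base-R stand-in for poweRlaw::rplcon(): inverse-transform sampling
# of a continuous power law with lower bound xmin and exponent alpha.
con_rns <- xmin * (1 - runif(1000))^(-1 / (alpha - 1))
df <- data.frame(con_rns = con_rns[con_rns < 1000])

p <- ggplot(data = df, aes(x = con_rns)) +
  geom_point(stat = "bin", binwidth = 0.1) +
  scale_x_log10() +
  scale_y_log10()

# Layer 1 of the built plot holds the binned values after the scale transforms
binned <- ggplot_build(p)$data[[1]]
head(binned[, c("x", "y", "count")])

# For non-empty bins, 10^y should give back the raw count
ok <- binned$count > 0
all.equal(10^binned$y[ok], binned$count[ok])
```

Since x and y are stored on the log10 scale, passing them to a layer of the same plot without 10^ would apply the log transformation a second time.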

UPDATE: The additional problem was how to split the plot into different colour groups. I approached this in the same way, but I had to add a 'group' aesthetic so that the grouping is kept in the ggplot_build data.

library(poweRlaw)
library(ggplot2)

xmin = 1; alpha = 1.5
con_rns = rplcon(1000, xmin, alpha)
# convert to data.frame format for ggplot2, with a random group label (1 to 3) per point
df <- data.frame(con_rns = con_rns[con_rns < 1000],
                 col = sample(1:3, size = length(con_rns[con_rns < 1000]), replace = TRUE))

p <- ggplot(data = df, aes(x = con_rns))+
  geom_point(stat = 'bin', binwidth = 0.1, aes(colour=factor(col), group=factor(col)))+
  scale_x_log10() + 
  scale_y_log10() 


p + geom_smooth(data = ggplot_build(p)$data[[1]], 
                mapping = aes(x=10^x, y= 10^y, colour=factor(group)),method = "lm",se=FALSE)

Note that now that the data are grouped, some bins have a count of zero for some groups. Applying the log10 transformation to zero gives an infinite value (-Inf) and raises a warning. These points are removed from the plot and ignored in the trend lines.

[image]
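As a cross-check on the fitted line and on the zero-count issue, the binning and the fit can also be done by hand. This is only a sketch in base R under stated assumptions: the sampler is a stand-in for rplcon(), the names lx, binned and fit are illustrative, and hist() on log10(x) with a step of 0.1 is meant to mirror binwidth = 0.1 on a log10 x scale, although the bin edges ggplot2 chooses may differ slightly.

```r
set.seed(42)
xmin <- 1; alpha <- 1.5
# Assumed base-R stand-in for poweRlaw::rplcon() (inverse-transform sampling)
x <- xmin * (1 - runif(1000))^(-1 / (alpha - 1))
x <- x[x < 1000]

# Bin log10(x) in steps of 0.1
lx <- log10(x)
breaks <- seq(floor(min(lx) * 10) / 10, ceiling(max(lx) * 10) / 10, by = 0.1)
h <- hist(lx, breaks = breaks, plot = FALSE)

binned <- data.frame(log_x = h$mids, count = h$counts)
# log10(0) is -Inf, so empty bins are dropped before fitting
binned <- binned[binned$count > 0, ]

# Straight-line fit on the log-log scale; the slope is the trend line's gradient
fit <- lm(log10(count) ~ log_x, data = binned)
coef(fit)
```

For a continuous power law counted in log-spaced bins, the expected slope is -(alpha - 1), so the fitted slope gives a rough sanity check on the plotted trend line. Sparse bins in the tail add noise, so the estimate should not be treated as a proper fit of alpha.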
