
R: pass variable to lm inside function

I want to write a function that calls lm and draws a scatter plot with a regression line using ggplot2.

Coming from here, this is my code:

fun <- function(m, n, o, p) {
  library(ggplot2)
  data <- as.data.frame(read.table(file = m, header = T, dec = ".", sep = "\t" ))
  fit <- lm(as.formula(n ~ 0 + o), data)
  text<-paste("R^2 = ", summary(fit)$adj*100, " %; coefficient = ", fit$coefficients, sep="")
  ggplot(data, aes(x=!!rlang::enquo(o), y = !!rlang::enquo(n))) + geom_point(aes(colour = !!rlang::enquo(p))) + geom_abline(intercept=0, slope=fit$coefficients[1], color='#2C3E50', size=1.1) + geom_text(aes(x = 1, y = 1, label = text))
}

An example input file (tab-separated):

columna     columnb  string
3338240000  97.65    ccc
3453970000  98.8     ccc
3559920000  99.5     aaa
1434400000  87.8     ccc
2953560000  99.8     ccc
3172212857  99.15    ccc
3286080000  99.3     ccc
3750630000  99.3     ccc
4215180000  99.7     ccc
2836200000  99.85    ccc
229830000   93.8     rrr
39120000    94.5     ppp
1770180000  99       ccc

When I call the function with

fun("input", columna, columnb, string)

I get an error. How do I pass variables (column names) correctly to lm inside a function?

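The main problem is that you are trying to use non-standard evaluation, which can be tricky. In your function, as.formula(n ~ 0 + o) leaves n and o in the formula as literal symbols, so lm() looks for variables called n and o rather than the columns they stand for. It's easier to pass the column names as quoted strings, though that is still a little tricky, because you need to build the formula to send to lm(). For example, this code would work if n and o were strings naming the columns instead of unquoted column names: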

fla <- substitute(n ~ 0 + o, list(n = as.name(n), o = as.name(o)))
fit <- lm(fla, data)
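As a rough sketch of how this fits together, the snippet below wraps the substitute() construction in a small helper and compares it with base R's reformulate(), which builds the same kind of formula directly from strings. The toy data frame d and the helper names fit_through_origin and fit_reformulate are made up for illustration and are not part of the question.

```r
# Hypothetical toy data; the column names mirror the example input file.
d <- data.frame(
  columna = c(1, 2, 3, 4),
  columnb = c(2.1, 3.9, 6.2, 7.8)
)

# Hypothetical helper: y and x are strings naming columns of `data`.
fit_through_origin <- function(data, y, x) {
  # Replace the placeholders n and o with the symbols named by y and x.
  fla <- substitute(n ~ 0 + o, list(n = as.name(y), o = as.name(x)))
  lm(fla, data)
}

# Alternative: reformulate() builds the formula from character strings;
# intercept = FALSE drops the intercept, like the "0 +" term above.
fit_reformulate <- function(data, y, x) {
  fla <- reformulate(x, response = y, intercept = FALSE)
  lm(fla, data)
}

fit1 <- fit_through_origin(d, "columnb", "columna")
fit2 <- fit_reformulate(d, "columnb", "columna")
```

Both fits should give the same single slope coefficient, named after the predictor column. Which construction to prefer is largely a matter of taste; reformulate() avoids handling symbols by hand.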

You also need to modify the ggplot2 call. This seems to work, but I don't know ggplot2 well enough to know if it's the "right" way to do it:

  ggplot(data, aes(x=data[[o]], y = data[[n]])) + 
    geom_point(aes(colour = data[[p]])) + 
    geom_abline(intercept=0, slope=fit$coefficients[1], color='#2C3E50', size=1.1) + 
    geom_text(aes(x = 1, y = 1, label = text)) +
    labs(x = o, y = n, color = p) 

With these changes, you should be able to call fun with quoted names, e.g.

fun("input", "columna", "columnb", "string")
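Putting the pieces together, one possible version of the whole function is sketched below. It is not the code from the answer: it assumes a ggplot2 version that supports the .data pronoun inside aes() (an alternative to data[[o]]), it swaps geom_text() for annotate() so the label is drawn once rather than once per row, and it builds the formula with reformulate(). The name fun2 and the temporary file are hypothetical; the three data rows are taken from the example input.

```r
library(ggplot2)

# Hypothetical reworking of fun(): m is a file path; n, o and p are
# column names passed as strings.
fun2 <- function(m, n, o, p) {
  data <- read.table(file = m, header = TRUE, dec = ".", sep = "\t")

  # Regression through the origin of column n on column o.
  fit <- lm(reformulate(o, response = n, intercept = FALSE), data)
  slope <- coef(fit)[[1]]

  label <- paste0(
    "adj. R^2 = ", round(summary(fit)$adj.r.squared * 100, 2),
    " %; coefficient = ", signif(slope, 3)
  )

  ggplot(data, aes(x = .data[[o]], y = .data[[n]])) +
    geom_point(aes(colour = .data[[p]])) +
    geom_abline(intercept = 0, slope = slope, colour = "#2C3E50") +
    # Place the label once, in the top-left corner of the panel.
    annotate("text", x = -Inf, y = Inf, hjust = -0.05, vjust = 1.5,
             label = label) +
    labs(x = o, y = n, colour = p)
}

# Usage: write the first three rows of the example input to a temporary
# tab-separated file and build the plot from it.
tmp <- tempfile(fileext = ".tsv")
write.table(
  data.frame(
    columna = c(3338240000, 3453970000, 3559920000),
    columnb = c(97.65, 98.8, 99.5),
    string  = c("ccc", "ccc", "aaa")
  ),
  tmp, sep = "\t", quote = FALSE, row.names = FALSE
)
p <- fun2(tmp, "columna", "columnb", "string")
```

Printing p (or calling print(p) in a script) draws the plot. The line width of the regression line is left at its default here, because the argument name for it differs between ggplot2 versions.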
