
R: pass variable to lm inside function

I want to write a function that calls lm and draws a scatter plot with a regression line using ggplot2.

Coming from here, this is my code:

fun <- function(m, n, o, p) {
  library(ggplot2)
  data <- as.data.frame(read.table(file = m, header = T, dec = ".", sep = "\t" ))
  fit <- lm(as.formula(n ~ 0 + o), data)
  text<-paste("R^2 = ", summary(fit)$adj*100, " %; coefficient = ", fit$coefficients, sep="")
  ggplot(data, aes(x=!!rlang::enquo(o), y = !!rlang::enquo(n))) + geom_point(aes(colour = !!rlang::enquo(p))) + geom_abline(intercept=0, slope=fit$coefficients[1], color='#2C3E50', size=1.1) + geom_text(aes(x = 1, y = 1, label = text))
}

An example input file:

columna columnb string
3338240000  97.65   ccc
3453970000  98.8    ccc
3559920000  99.5    aaa
1434400000  87.8    ccc
2953560000  99.8    ccc
3172212857  99.15   ccc
3286080000  99.3    ccc
3750630000  99.3    ccc
4215180000  99.7    ccc
2836200000  99.85   ccc
229830000   93.8    rrr
39120000    94.5    ppp
1770180000  99  ccc

When I call the function with

fun("input", columna, columnb, string)

I get an error. How do I pass variables (column names) correctly to lm inside a function?

The main problem is that you are trying to use non-standard evaluation, which can be tricky. In your code the formula n ~ 0 + o is taken literally, so lm() looks for variables named n and o rather than the columns you passed in. It's easier if you put the column names in quotes, though still a little tricky, because you need to construct the formula to pass to lm(). For example, this code would work if n and o were strings naming the columns instead of unquoted column names:

fla <- substitute(n ~ 0 + o, list(n = as.name(n), o = as.name(o)))
fit <- lm(fla, data)
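
To see the substitution in isolation, here is a minimal sketch on a small inline data frame. The toy data (four rows copied from the example file) and the reformulate() variant are my additions, not part of the code above; reformulate() is a base R function that builds the same no-intercept formula from strings.

```r
# Toy data standing in for the tab-separated input file (assumption:
# only the two numeric columns matter for the fit)
data <- data.frame(
  columna = c(3338240000, 3453970000, 3559920000, 1434400000),
  columnb = c(97.65, 98.8, 99.5, 87.8)
)

n <- "columna"  # response column name, as a string
o <- "columnb"  # predictor column name, as a string

# Swap the placeholders n and o for the actual column symbols,
# giving the expression columna ~ 0 + columnb
fla <- substitute(n ~ 0 + o, list(n = as.name(n), o = as.name(o)))
fit <- lm(fla, data)

# Alternative, not from the answer: reformulate() builds an equivalent
# formula (columna ~ columnb - 1) directly from the strings
fla2 <- reformulate(o, response = n, intercept = FALSE)
fit2 <- lm(fla2, data)

print(fla)
print(coef(fit))
```

Both fits estimate a single slope through the origin, so either way of building the formula should give the same coefficient.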

You also need to modify the ggplot2 call. This seems to work, but I don't know ggplot2 well enough to know if it's the "right" way to do it:

  ggplot(data, aes(x=data[[o]], y = data[[n]])) + 
    geom_point(aes(colour = data[[p]])) + 
    geom_abline(intercept=0, slope=fit$coefficients[1], color='#2C3E50', size=1.1) + 
    geom_text(aes(x = 1, y = 1, label = text)) +
    labs(x = o, y = n, color = p) 

With these changes, you should be able to call fun with quoted names, e.g.

fun("input", "columna", "columnb", "string")
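
Putting the pieces together, the whole function might look roughly like the sketch below. It departs from the code above in a few places, all of them my own assumptions rather than the answer's method:

- It uses the `.data[[name]]` pronoun inside aes(), which needs ggplot2 3.0.0 or later, instead of `data[[name]]`.
- It uses annotate() instead of geom_text(), so the label is drawn once rather than once per row.
- It spells out adj.r.squared instead of relying on partial matching of `$adj`.
- It leaves out the line-width argument, because its name differs between ggplot2 versions (size vs. linewidth).

```r
library(ggplot2)

fun <- function(m, n, o, p) {
  # m: path to a tab-separated file; n, o, p: column names given as strings
  data <- read.table(file = m, header = TRUE, dec = ".", sep = "\t")

  # Build n ~ 0 + o from the strings, then fit a regression through the origin
  fla <- substitute(n ~ 0 + o, list(n = as.name(n), o = as.name(o)))
  fit <- lm(fla, data)

  text <- paste0("R^2 = ", summary(fit)$adj.r.squared * 100,
                 " %; coefficient = ", fit$coefficients[1])

  # .data[[name]] looks a column up by its string name inside aes()
  ggplot(data, aes(x = .data[[o]], y = .data[[n]])) +
    geom_point(aes(colour = .data[[p]])) +
    geom_abline(intercept = 0, slope = fit$coefficients[1], color = "#2C3E50") +
    # Label position (1, 1) is kept from the original code
    annotate("text", x = 1, y = 1, label = text) +
    labs(x = o, y = n, colour = p)
}
```

The function returns the ggplot object, so at the console the plot is drawn when the result is printed.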
