
R Programming: logistic regression using a custom function

I'm interested in doing logistic regression in R using an elliptical function.

For example, if my feature vector is [x1 x2 x3], I need

glm(y ~ x1*x1 + x1*x2 + x1*x3 + x2*x1 + x2*x2 + x2*x3 + x3*x1 + x3*x2 + x3*x3, 
    data = myDataFrame.df, family=binomial)

which is equivalent to

glm(y ~ x1*x1 + x1*x2 + x1*x3 + x2*x2 + x2*x3 + x3*x3, 
    data = myDataFrame.df, family=binomial) 

Given an input vector c(x1,x2,x3) that corresponds to the columns of a data.frame, is there a way to generate that formula? Thanks, Adam

Here is my attempt at answering your question:

# for reproducibility (data generation only) 
set.seed(123)
# sample size
Nsims <- 1e3
# generate data
df <- data.frame(x1=rnorm(Nsims, 0, 2), 
                 x2=runif(Nsims, -2, 2), 
                 x3=rnorm(Nsims))
# generate response
# plogis(eta) is the inverse logit, exp(eta) / (1 + exp(eta))
df[, "y"] <- rbinom(Nsims, 1, with(df, plogis(x1 + 3*x2 + 2*x3 + x2*x3 + 0.5*x3^2)))

# glm (your version)
glm(y ~ x1*x1 + x1*x2 + x1*x3 + x2*x2 + x2*x3 + x3*x3, 
    data =df, family=binomial) 
# equivalent but simplified version: in a formula, a*b expands to a + b + a:b,
# and x1*x1 collapses to just x1
glm(y ~  x1*x2 + x1*x3 + x2*x3, 
    data = df, family=binomial) 
# another equivalent version 
# I think this is what you mean by your input vector
vars <- names(df)[names(df) != "y"] 
# write down the formula: every pairwise product of the variable names, joined by " + "
prods <- do.call(paste, c(expand.grid(vars, vars), sep = "*"))
form <- as.formula(paste("y ~", paste(prods, collapse = " + ")))
# use this formula
glm(form, data = df, family = binomial)

# different version that includes x1^2, x2^2 and x3^2 
# you had these terms in the original version, but they weren't used 
# because you didn't put them in the I(...) notation
glm(y ~ x1*x2 + x1*x3 + x2*x3 + I(x1*x1) + I(x2*x2)  + I(x3*x3), 
    data = df, family=binomial) 
# and here's the same thing using the poly function (as @Ben Bolker suggested)
# this uses the option raw=TRUE so that the output is comparable with the above,
# however usually you should use orthogonal polynomials instead.
glm(y ~ poly(x1, x2, x3, degree=2, raw=TRUE), data=df, family=binomial)

# yet another version without the linear (main-effect) terms x1, x2 and x3 (as @Aaron suggested)
glm(y ~ x1:x2 + x1:x3 + x2:x3 + I(x1*x1) + I(x2*x2)  + I(x3*x3), 
    data = df, family=binomial) 
# you can also define this formula programmatically, similar to the one I suggested in the first version.
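
The question asks how to generate the formula from a vector of variable names. One possible sketch, assuming at least two predictor names, is below. The helper name quadratic_formula and the small simulated data frame d are illustrative only and are not part of the original answer. It builds the linear terms, one interaction per unordered pair, and the squared terms wrapped in I(), then assembles them with reformulate().

```r
# Hypothetical helper (not from the original answer): build a full
# second-order formula from a response name and predictor names.
# Assumes length(vars) >= 2, since combn() needs at least one pair.
quadratic_formula <- function(response, vars) {
  # squared terms must be wrapped in I() so ^ is treated as arithmetic
  squares <- sprintf("I(%s^2)", vars)
  # one interaction per unordered pair, e.g. "x1:x2"
  inter <- combn(vars, 2, FUN = paste, collapse = ":")
  # reformulate() turns the character term labels into a formula object
  reformulate(c(vars, inter, squares), response = response)
}

form2 <- quadratic_formula("y", c("x1", "x2", "x3"))
print(form2)

# small simulated data set, only to show that the formula can be fitted
set.seed(1)
d <- data.frame(x1 = rnorm(200), x2 = rnorm(200), x3 = rnorm(200))
d$y <- rbinom(200, 1, plogis(d$x1 + d$x2 * d$x3))
fit <- glm(form2, data = d, family = binomial)
print(coef(fit))
```

With three predictors this should give an intercept plus nine terms, the same set of terms as the I(...) version above. Dropping vars from the reformulate() call would give the variant without the linear terms.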

