When using logistic regression in R, the response passed to the 'glm' function (family = binomial) can take several formats (see ?family), specifically:
......
For the binomial and quasibinomial families the response can be specified in one of three ways:
......
As a numerical vector with values between 0 and 1, interpreted as the proportion of successful cases (with the total number of cases given by the weights)....
I have aggregated data that represents the proportion of successes out of a number of trials (a number between 0 and 1) together with the corresponding weights. I'm interested in fitting a logistic regression to it, which would be trivial in R.
Unfortunately I can't use R in this project, and would like to use scikit-learn to estimate the logistic regression coefficients. More precisely, I'm looking to use sklearn.linear_model.LogisticRegression with a form of input that lets me pass the proportions and weights to the model, in a similar fashion to what is available in R.
Example:
from sklearn import linear_model
import pandas as pd
df = pd.DataFrame([[1,1,1,0], [1,1,1,0],[1,1,1,1],[2,2,1,1] , [2,2,1,1],[2,2,1,0] , [3,3,1,0] ],columns=['a', 'b','Trials','Success'])
logistic = linear_model.LogisticRegression()
#this works
logistic.fit(X=df[['a','b','Trials']] , y=df.Success)
logistic.predict_proba(df[['a','b','Trials']])
prob_to_success = logistic.predict_proba(df[['a','b','Trials']])[:,1]
prob_to_success
Out[51]: array([ 0.45535843, 0.45535843, 0.45535843, 0.42212169, 0.42212169,
0.42212169, 0.38957565])
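# How can I use the following data?
# Note: selecting several columns after groupby needs a list, hence the double brackets
df_agg = df.groupby(['a','b'], as_index=False)[['Trials','Success']].sum()
df_agg["Prop"] = df_agg.Success / df_agg.Trials
df_agg
# I want to use Prop as the response and Trials as the weights in df_agg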
Thanks in advance!
Convert the proportions to log-odds and fit a linear regression on the transformed values. scikit-learn doesn't seem to have a quasi-binomial option for logistic regression. As you said, this is trivial in R, but scikit-learn seems to have nothing of the sort.
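A rough sketch of what that could look like, rebuilding the aggregated data from the question. The 0.5 continuity correction (needed because one group has a proportion of exactly 0, where the plain log-odds is infinite) and the choice of Trials as regression weights are my assumptions, not something the answer specifies. This only approximates a binomial GLM; it is not the same fit.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Aggregated data equivalent to df_agg in the question
df_agg = pd.DataFrame({"a": [1, 2, 3], "b": [1, 2, 3],
                       "Trials": [3, 3, 1], "Success": [1, 2, 0]})

s = df_agg["Success"].to_numpy(dtype=float)
n = df_agg["Trials"].to_numpy(dtype=float)
X = df_agg[["a", "b"]].to_numpy(dtype=float)

# Empirical log-odds; the 0.5 correction (an assumption) keeps the value
# finite when a group has 0 or all successes
logit = np.log((s + 0.5) / (n - s + 0.5))

# Ordinary linear regression on the log-odds, weighting each group by its
# number of trials (also an assumption)
linear = LinearRegression().fit(X, logit, sample_weight=n)

# Map the fitted log-odds back to probabilities
approx_prob = 1.0 / (1.0 + np.exp(-linear.predict(X)))
```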
If you want to use weights, you can pass them to the fit method of LogisticRegression:
fit(X, y, sample_weight=None)
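One way this might be applied to the aggregated data: since LogisticRegression expects class labels rather than proportions, each group can be entered twice, once labelled 1 with weight Prop * Trials (the successes) and once labelled 0 with weight (1 - Prop) * Trials (the failures). The helper name fit_on_proportions is made up for illustration, and I use only a and b as features, leaving Trials out of X. Treat it as a sketch under those assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Aggregated data equivalent to df_agg in the question
df_agg = pd.DataFrame({"a": [1, 2, 3], "b": [1, 2, 3],
                       "Trials": [3, 3, 1], "Success": [1, 2, 0]})
df_agg["Prop"] = df_agg["Success"] / df_agg["Trials"]


def fit_on_proportions(X, prop, trials, **kwargs):
    """Hypothetical helper: fit LogisticRegression on proportions and trial counts."""
    X = np.asarray(X, dtype=float)
    prop = np.asarray(prop, dtype=float)
    trials = np.asarray(trials, dtype=float)
    n = len(X)
    # Stack every group twice: first copy labelled 1, second copy labelled 0
    X2 = np.vstack([X, X])
    y2 = np.concatenate([np.ones(n, dtype=int), np.zeros(n, dtype=int)])
    # Successes weight the 1-rows, failures weight the 0-rows
    w2 = np.concatenate([prop * trials, (1.0 - prop) * trials])
    keep = w2 > 0  # drop rows that carry no weight
    model = LogisticRegression(**kwargs)
    model.fit(X2[keep], y2[keep], sample_weight=w2[keep])
    return model


X_agg = df_agg[["a", "b"]].to_numpy(dtype=float)
model = fit_on_proportions(X_agg, df_agg["Prop"], df_agg["Trials"])
prob_to_success = model.predict_proba(X_agg)[:, 1]
```

With the same settings, this should give the same coefficients as fitting on the row-level df with a and b as features, because the weighted log-loss is the same sum. Keep in mind that LogisticRegression applies L2 regularization by default, which R's glm does not, so the numbers will differ from R unless the penalty is weakened (for example by passing a large C through kwargs).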