Cross-validating an ordinal logistic regression in R (using rpy2)
I'm trying to create a predictive model in Python, comparing several different regression models through cross-validation. In order to fit an ordinal logistic model (MASS.polr), I've had to interface with R through rpy2 as follows:
import pandas as pd
import rpy2.robjects as ro
from rpy2.robjects.packages import importr
# Builds the example data in one call (DataFrame.append was removed in pandas 2.0).
df = pd.DataFrame({"y": [25, 50, 25, 75, 25, 25], "X": [7, 22, 15, 27, 12, 13]})
# Loads R packages.
base = importr('base')
mass = importr('MASS')
# Converts df to an R dataframe.
from rpy2.robjects import pandas2ri
pandas2ri.activate()
# Note: rpy2 3.x renamed py2ri to py2rpy.
ro.globalenv["rdf"] = pandas2ri.py2ri(df)
# Makes R recognise y as a factor.
ro.r("""rdf$y <- as.factor(rdf$y)""")
# Fits regression.
formula = "y ~ X"
ordlog = mass.polr(formula, data=base.as_symbol("rdf"))
ro.globalenv["ordlog"] = ordlog
print(base.summary(ordlog))
So far, I have mainly been comparing my models using sklearn.cross_validation.train_test_split (sklearn.model_selection.train_test_split in scikit-learn 0.18 and later) and sklearn.metrics.accuracy_score, yielding a number from 0 to 1 that represents how accurately the model fitted on the training set predicts the test-set values.
How might I replicate this test using rpy2 and MASS.polr?
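One possible way to reproduce the hold-out test is sketched below, assuming R with MASS installed and rpy2 3.x available. The helper names holdout_split, accuracy and polr_holdout_accuracy are hypothetical and not part of rpy2, MASS or scikit-learn. The model is fitted and used for prediction entirely inside R through ro.r, and only the predicted class labels are brought back to Python for scoring.

```python
import random


def holdout_split(rows, test_fraction=0.33, seed=0):
    """Shuffle rows and split them into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Fraction of exact matches between true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    return sum(t == p for t, p in pairs) / len(pairs)


def polr_holdout_accuracy(train, test, levels):
    """Fit MASS::polr on train, predict classes for test, return accuracy.

    train and test are lists of (y, X) tuples; levels lists the outcome
    categories in order. rpy2 is imported lazily because it needs R.
    """
    import rpy2.robjects as ro

    def to_r(rows):
        # Build an R data.frame with a string outcome and a numeric predictor.
        return ro.DataFrame({
            "y": ro.StrVector([str(y) for y, _ in rows]),
            "X": ro.FloatVector([float(x) for _, x in rows]),
        })

    ro.globalenv["train"] = to_r(train)
    ro.globalenv["test"] = to_r(test)
    ro.globalenv["lv"] = ro.StrVector([str(level) for level in levels])
    # Make the outcome an ordered factor with a fixed set of levels.
    ro.r("train$y <- factor(train$y, levels = lv, ordered = TRUE)")
    ro.r("fit <- MASS::polr(y ~ X, data = train)")
    pred = ro.r('as.character(predict(fit, newdata = test, type = "class"))')
    return accuracy([str(y) for y, _ in test], list(pred))
```

A call might look like train, test = holdout_split(rows) followed by polr_holdout_accuracy(train, test, levels=[25, 50, 75]). With very small samples, such as the six rows in the example data, the training split may not contain every outcome level and polr may fail or warn, so a larger data set is assumed.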
The problem was ultimately solved by refitting the model with rms.lrm, which provides a validate() function (interpreted following this example).
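A rough sketch of that approach follows, assuming R with the rms package installed, rpy2 3.x, and a data frame named rdf already present in R's global environment with y stored as a factor. The helper names build_validate_script and run_validate are hypothetical. The model is fitted with x = TRUE, y = TRUE so that validate() can refit it on resamples. Note that validate() reports indexes such as Dxy and R2 rather than a classification accuracy.

```python
def build_validate_script(formula="y ~ X", data="rdf", method="boot", B=40):
    """Return R code that fits rms::lrm and validates it by resampling."""
    return (
        "library(rms)\n"
        # x = TRUE, y = TRUE store the design matrix and response in the fit.
        f"fit <- lrm({formula}, data = {data}, x = TRUE, y = TRUE)\n"
        f'val <- validate(fit, method = "{method}", B = {B})\n'
        "val"
    )


def run_validate(script):
    """Evaluate the script in R; rpy2 is imported lazily because it needs R."""
    import rpy2.robjects as ro
    return ro.r(script)
```

For example, print(run_validate(build_validate_script(method="crossvalidation", B=10))) would request a cross-validated summary; as far as I understand the rms documentation, B is then read as the number of groups rather than the number of bootstrap repetitions.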