Can scikit-learn predict a continuous variable rather than a binary class label, and if so, how?
I work in pharmaceutical sciences. I work with chemical compounds: by calculating their chemical properties (descriptors), we can predict certain biological functions of those compounds. I use Python and R for this, and I also use the Weka machine learning tool. Weka supports binary prediction using SVM and other algorithms.
Example data set:
Training set
Chem_ID MW LogP HbD HbE IC50 Class_label
001 232 5 0 2 20 0
002 280 2 1 4 41 1
003 240 5 0 2 22 0
004 300 4 1 5 48 1
005 245 2 0 2 24 0
006 255 1 0 2 20 0
007 299 5 1 4 49 1
Test set
Chem_ID MW LogP HbD HbE IC50 Class_label
000 255 1 0 2 20
Weka has a few algorithms that can predict the "Class_label", and it can also predict a specific numeric variable (we usually predict "IC50" values). Does scikit-learn, or any other machine learning library in Python, have that capability? If yes, how can we use it? Thanks.
Yes, this is a regression problem. There are many different models for solving a regression problem, from simple Linear Regression to Support Vector Regression or Decision Tree Regressors (and many more).
They work similarly to a binary classifier: you give them your training data, but instead of 0/1 labels you give them the target values to train on. In your case, you would take the feature you want to predict as the target value and delete it from the training data.
Short example:
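from sklearn.linear_model import LinearRegression

# training_set and test_set are assumed to be pandas DataFrames whose
# remaining columns are all numeric features.
target_values = training_set['IC50']
training_data = training_set.drop(columns='IC50')  # drop the column, not a row
clf = LinearRegression()
clf.fit(training_data, target_values)
test_data = test_set.drop(columns='IC50')
predicted_values = clf.predict(test_data)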
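For a version that runs end to end, here is a minimal sketch built on the data from the question. It makes a few assumptions that are not in the original answer: the data is typed into pandas DataFrames by hand, only the four descriptors (MW, LogP, HbD, HbE) are used as features while Chem_ID and Class_label are left out, and the names feature_columns, models and predictions are purely illustrative. It also shows that the other regressors mentioned above can be swapped in, since they share the same fit/predict interface.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Training set copied from the question.
training_set = pd.DataFrame(
    {
        "Chem_ID": ["001", "002", "003", "004", "005", "006", "007"],
        "MW": [232, 280, 240, 300, 245, 255, 299],
        "LogP": [5, 2, 5, 4, 2, 1, 5],
        "HbD": [0, 1, 0, 1, 0, 0, 1],
        "HbE": [2, 4, 2, 5, 2, 2, 4],
        "IC50": [20, 41, 22, 48, 24, 20, 49],
        "Class_label": [0, 1, 0, 1, 0, 0, 1],
    }
)

# Test compound from the question; its IC50 is treated as unknown here.
test_set = pd.DataFrame(
    {"Chem_ID": ["000"], "MW": [255], "LogP": [1], "HbD": [0], "HbE": [2]}
)

# Assumption: only the four descriptors are used as features.
feature_columns = ["MW", "LogP", "HbD", "HbE"]
X_train = training_set[feature_columns]
y_train = training_set["IC50"]
X_test = test_set[feature_columns]

# Different regressors, same fit/predict interface.
models = {
    "linear": LinearRegression(),
    "svr": SVR(),
    "tree": DecisionTreeRegressor(random_state=0),
}

predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions[name] = float(model.predict(X_test)[0])

print(predictions)
```

Seven compounds are far too few to judge which model is better, so treat this only as a template. In practice you would evaluate on held-out data, and models such as SVR usually benefit from scaling the descriptors first.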