Can scikit-learn be used to predict a variable instead of doing binary classification?

Question

I work in pharmaceutical sciences on chemical compounds. By calculating their chemical properties, or descriptors, we can predict certain biological functions of those compounds. I use Python and R for this, and I also use the Weka machine learning tool. Weka provides binary prediction using SVM and other algorithms.

Example data set: Training set

Chem_ID   MW LogP HbD HbE IC50 Class_label
  001    232  5    0   2    20    0
  002    280  2    1   4    41    1
  003    240  5    0   2    22    0
  004    300  4    1   5    48    1
  005    245  2    0   2    24    0
  006    255  1    0   2    20    0
  007    299  5    1   4    49    1

Test set

Chem_ID  MW   LogP HbD HbE IC50 Class_label
    000   255  1    0   2    20

In Weka there are a few algorithms with which we can predict the "Class_label", or we can predict a specific variable instead (we usually predict "IC50" values). Does scikit-learn, or any other machine learning library in Python, have that capability? If yes, how can we use it? Thanks.

Answer 1

Yes, this is a regression problem. There are many different models for solving a regression problem, from simple Linear Regression to Support Vector Regression or Decision Tree Regressors (and many more).
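To illustrate how interchangeable these models are in scikit-learn, here is a minimal sketch that fits the three regressors named above on the training rows from the question. Using MW, LogP, HbD and HbE as features, the default hyperparameters, and the names X_train, y_train, X_test, models and predictions are all assumptions made for illustration, not a recommended setup.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Training rows from the question; columns are MW, LogP, HbD, HbE (assumed features).
X_train = np.array([
    [232, 5, 0, 2],
    [280, 2, 1, 4],
    [240, 5, 0, 2],
    [300, 4, 1, 5],
    [245, 2, 0, 2],
    [255, 1, 0, 2],
    [299, 5, 1, 4],
], dtype=float)

# IC50 is the continuous target instead of the 0/1 Class_label.
y_train = np.array([20, 41, 22, 48, 24, 20, 49], dtype=float)

# The single test compound from the question.
X_test = np.array([[255, 1, 0, 2]], dtype=float)

# Every scikit-learn regressor exposes the same fit/predict interface.
models = {
    "linear": LinearRegression(),
    "svr": SVR(),
    "tree": DecisionTreeRegressor(random_state=0),
}

predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions[name] = float(model.predict(X_test)[0])
    print(name, predictions[name])
```

Because all three share the same interface, switching models is a one-line change. With only seven training rows the numbers themselves mean little.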

They work similarly to a binary classifier: you give them your training data, but instead of 0/1 labels you give them the target values to train on. In your case you would take the feature you want to predict as the target value and delete it from the training data.

Short example:

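from sklearn.linear_model import LinearRegression

# training_set and test_set are assumed to be pandas DataFrames.
# Non-feature columns (e.g. Chem_ID, Class_label) should be dropped as well,
# so that both frames end up with the same feature columns.
target_values = training_set['IC50']
training_data = training_set.drop(columns=['IC50'])  # remove the target from the features

clf = LinearRegression()
clf.fit(training_data, target_values)

test_data = test_set.drop(columns=['IC50'])

predicted_values = clf.predict(test_data)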
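For a version that runs end to end, the sketch below rebuilds the question's tables as pandas DataFrames and selects the descriptor columns explicitly. The name feature_columns and the choice to leave Chem_ID and Class_label out of the features are assumptions for illustration. The question does not say which columns should act as inputs.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Training set copied from the question.
training_set = pd.DataFrame({
    "Chem_ID": ["001", "002", "003", "004", "005", "006", "007"],
    "MW": [232, 280, 240, 300, 245, 255, 299],
    "LogP": [5, 2, 5, 4, 2, 1, 5],
    "HbD": [0, 1, 0, 1, 0, 0, 1],
    "HbE": [2, 4, 2, 5, 2, 2, 4],
    "IC50": [20, 41, 22, 48, 24, 20, 49],
    "Class_label": [0, 1, 0, 1, 0, 0, 1],
})

# Test set copied from the question (it has no Class_label).
test_set = pd.DataFrame({
    "Chem_ID": ["000"],
    "MW": [255],
    "LogP": [1],
    "HbD": [0],
    "HbE": [2],
    "IC50": [20],
})

# Assumed feature set: the four descriptors only, so that the training
# and test frames expose exactly the same input columns.
feature_columns = ["MW", "LogP", "HbD", "HbE"]

reg = LinearRegression()
reg.fit(training_set[feature_columns], training_set["IC50"])

# Predict IC50 for the test compound from its descriptors.
predicted_values = reg.predict(test_set[feature_columns])
print(predicted_values)
```

Selecting columns by name, rather than dropping the target, keeps identifiers such as Chem_ID out of the model and avoids a column mismatch between the two frames.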

Answer 1 · 2 · accepted · 2016-02-04 13:59:08
