Can scikit-learn predict a variable rather than do binary classification, and if yes, how?

I work in the field of pharmaceutical sciences. I study chemical compounds, and by calculating their chemical properties (descriptors) we can predict certain biological functions of those compounds. I use Python and R for this, and I also use the Weka machine learning tool. Weka provides a facility for binary prediction using SVM and other supported algorithms.

Example data set: Training set

Chem_ID   MW LogP HbD HbE IC50 Class_label
  001    232  5    0   2    20    0
  002    280  2    1   4    41    1
  003    240  5    0   2    22    0
  004    300  4    1   5    48    1
  005    245  2    0   2    24    0
  006    255  1    0   2    20    0
  007    299  5    1   4    49    1

Test set

Chem_ID  MW   LogP HbD HbE IC50 Class_label
    000   255  1    0   2    20    

Weka has a few algorithms that can predict the "Class_label", and it can also predict a specific numeric variable (we usually predict "IC50" values). Does scikit-learn, or any other machine learning library in Python, have that capability? If yes, how can we use it? Thanks.

Yes, this is a regression problem. There are many different models for solving a regression problem, from a simple Linear Regression to Support Vector Regression or Decision Tree Regressors (and many more).
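To make the model choice concrete, here is a minimal sketch that fits the three regressor types named above on the descriptor columns from the question. The array names, the default hyperparameters, and the choice of MW, LogP, HbD and HbE as features are assumptions for illustration, and seven compounds are far too few to judge which model is better.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Descriptors (MW, LogP, HbD, HbE) and IC50 values copied from the
# training set in the question. Using these four columns as features
# is an assumption made for this sketch.
X = np.array([
    [232, 5, 0, 2],
    [280, 2, 1, 4],
    [240, 5, 0, 2],
    [300, 4, 1, 5],
    [245, 2, 0, 2],
    [255, 1, 0, 2],
    [299, 5, 1, 4],
])
y = np.array([20, 41, 22, 48, 24, 20, 49])

# All three estimators expose the same fit/predict interface.
models = {
    "linear": LinearRegression(),
    "svr": SVR(),  # default settings, not tuned
    "tree": DecisionTreeRegressor(random_state=0),
}

x_new = np.array([[255, 1, 0, 2]])  # descriptors of the test compound

predictions = {}
for name, model in models.items():
    model.fit(X, y)
    predictions[name] = float(model.predict(x_new)[0])

print(predictions)
```

Because every estimator shares the same interface, swapping one model for another only changes the constructor line.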

They work similarly to a binary classifier: you give them your training data, but instead of 0/1 labels you give them the target values to train on. In your case, you would take the feature you want to predict as the target value and delete it from the training data.

Short example:

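from sklearn.linear_model import LinearRegression

# training_set and test_set are assumed to be pandas DataFrames
target_values = training_set['IC50']
# drop the target column (columns=...), not a row label;
# non-feature columns such as Chem_ID or Class_label should be dropped the same way
training_data = training_set.drop(columns='IC50')

clf = LinearRegression()
clf.fit(training_data, target_values)

test_data = test_set.drop(columns='IC50')

predicted_values = clf.predict(test_data)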
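For a version that runs end to end, the sketch below rebuilds the question's tables as pandas DataFrames and selects the feature columns by name. The `feature_columns` list and the decision to leave IC50 and Class_label out of the test frame (since they would be unknown for a new compound) are assumptions, not part of the original answer.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Training data copied from the question.
training_set = pd.DataFrame({
    "Chem_ID": ["001", "002", "003", "004", "005", "006", "007"],
    "MW": [232, 280, 240, 300, 245, 255, 299],
    "LogP": [5, 2, 5, 4, 2, 1, 5],
    "HbD": [0, 1, 0, 1, 0, 0, 1],
    "HbE": [2, 4, 2, 5, 2, 2, 4],
    "IC50": [20, 41, 22, 48, 24, 20, 49],
    "Class_label": [0, 1, 0, 1, 0, 0, 1],
})

# Test compound: only the descriptors are assumed to be known.
test_set = pd.DataFrame({
    "Chem_ID": ["000"],
    "MW": [255],
    "LogP": [1],
    "HbD": [0],
    "HbE": [2],
})

# Assumed feature set: the numeric descriptors only. Chem_ID is an
# identifier and Class_label is the other prediction target.
feature_columns = ["MW", "LogP", "HbD", "HbE"]

reg = LinearRegression()
reg.fit(training_set[feature_columns], training_set["IC50"])

# predict returns a NumPy array with one value per test row.
predicted_ic50 = reg.predict(test_set[feature_columns])
print(predicted_ic50)
```

Selecting columns by name keeps the training and test frames aligned even if the two tables carry different extra columns.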
