
How to train an ML model to get more than one possible output?

I have been trying to search for this question but cannot find an answer. Is it possible to get multiple possible outputs using a classification algorithm in Python?

I've been working on a database of agricultural crops where the model predicts which crop will be suitable for the land using inputs like pH, soil type, temperature and humidity. In this case, more than one crop can have a similar pH and soil type, so I want my model to return all the possible outcomes.

So, could you please tell me if it's possible? And if it is, how do I get the output?

   Temp  Humidity  Moisture  Stype  suitable-crop                 Ph
0    26        52        38  Sandy          Maize    Slightly acidic
1    32        62        34    Red    Ground Nuts            Neutral
2    29        52        45  Loamy      Sugarcane  Slightly alkaline
3    34        65        62  Black         Cotton  Moderately acidic
4    26        50        35  Sandy         Barley    slightly acidic

Given above is the sample data. The target here is 'suitable-crop'. As you can see, maize and barley have almost the same requirements for temperature, humidity, soil type (Stype) and pH. I've used a random forest algorithm to predict the output.

Enter the soil type:sandy
Enter the pH type:slightly acidic
Enter temperature:27
Enter Humidity:50
Enter moisture:37
result = model.predict([[Stype1, pH1, Temperature, Humidity, Moisture]])
print(result)

['Sugarcane'] 

This was the result. I want my model to return all the possible suitable outputs, like barley and maize, as well.

Yes, this is possible. The scikit-learn user guide has a good discussion (in the context of decision trees, but the following quote applies more generally):

A multi-output problem is a supervised learning problem with several outputs to predict, that is when Y is a 2d array of size [n_samples, n_outputs].

When there is no correlation between the outputs, a very simple way to solve this kind of problem is to build n independent models, i.e. one for each output, and then to use those models to independently predict each one of the n outputs. However, because it is likely that the output values related to the same input are themselves correlated, an often better way is to build a single model capable of predicting simultaneously all n outputs. First, it requires lower training time since only a single estimator is built. Second, the generalization accuracy of the resulting estimator may often be increased.

Whether the second approach is feasible may depend on the classification algorithm you are using. Those algorithms that do natively support multi-output classification will usually have it implemented in scikit-learn. Again, the user guide is a good place to start.
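
For example, here is a minimal sketch of both approaches using a random forest (the estimator you already use). The feature matrix X and the two output columns in Y are made-up placeholders, not your crop data; in practice you would first encode the categorical columns (Stype, Ph) numerically.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier

rng = np.random.RandomState(0)
X = rng.rand(100, 5)                    # placeholder features, e.g. temp, humidity, moisture, encoded Stype, encoded Ph
Y = rng.randint(0, 3, size=(100, 2))    # 2d target of size [n_samples, n_outputs]

# First approach: one independent classifier fitted per output column.
per_output = MultiOutputClassifier(RandomForestClassifier(random_state=0)).fit(X, Y)

# Second approach: a single estimator; RandomForestClassifier accepts a 2d Y natively.
single = RandomForestClassifier(random_state=0).fit(X, Y)

print(per_output.predict(X[:1]))        # one predicted label per output column
print(single.predict(X[:1]))

Either way, predict returns an array with one column per output, so each sample gets several predicted labels at once.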
