
How to set up a machine learning model with no data

I know what you're thinking. This is a silly question. I think so too but our company is very young and has almost no data whatsoever to train the model. I'm assigned to just set up a model infrastructure to take inputs and output decisions (doesn't have to be accurate for now). When we have the infrastructure in place, we will look into collecting or buying data to feed through the model.

In my opinion, this process is kinda backward but it's the way my boss wants it so I gotta deliver.

My goal is to build a machine learning model (random forest, boosting, logistic regression, etc.) for the following set of features:

- Target: binary
- Features: A (binary), B (categorical with 4 classes), C (numeric), D (binary)

As I have no training data, I can't follow the traditional route of splitting into train/test sets, fitting, cross-validating, and arriving at the optimal model. How do I simply arrive at a model with dummy coefficients without training it? I will then serialize the ML model into a pickle file and plug it into a Flask app to give a decision. Thanks, guys!
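To make the "dummy coefficients" idea concrete, here is a minimal sketch using only the Python standard library. It assumes a logistic-regression-style scorer whose weights are set by hand. Every number, the `b1`..`b4` labels for feature B, and the names `PLACEHOLDER_PARAMS`, `predict_proba` and `predict` are made up for illustration and are not learned from anything.

```python
import math
import pickle

# Hand-picked placeholder weights (assumption: arbitrary values, NOT learned).
# Feature B is categorical with 4 classes; "b1".."b4" are hypothetical labels,
# with "b1" acting as the reference level (weight 0.0).
PLACEHOLDER_PARAMS = {
    "intercept": -0.5,
    "A": 0.8,    # binary feature
    "C": 0.01,   # numeric feature
    "D": -0.4,   # binary feature
    "B": {"b1": 0.0, "b2": 0.3, "b3": -0.2, "b4": 0.5},
}


def predict_proba(params, row):
    """Return the probability of the positive class for one input row (a dict)."""
    z = (
        params["intercept"]
        + params["A"] * row["A"]
        + params["C"] * row["C"]
        + params["D"] * row["D"]
        + params["B"][row["B"]]  # raises KeyError on an unknown category
    )
    return 1.0 / (1.0 + math.exp(-z))


def predict(params, row, threshold=0.5):
    """Turn the probability into a 0/1 decision."""
    return int(predict_proba(params, row) >= threshold)


# Serialize the parameters; in a real setup this would be written to a .pkl file.
blob = pickle.dumps(PLACEHOLDER_PARAMS)

# Later (for example inside the web app), load them back and score a request.
restored = pickle.loads(blob)
example = {"A": 1, "B": "b2", "C": 10.0, "D": 0}
decision = predict(restored, example)
```

Pickling a plain dict of parameters rather than a custom class means the app that loads it does not need to import the class definition. Once real data exists, the dict could be replaced by a fitted model, provided the `predict` interface the app calls stays the same.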

Even if you have only a small dataset, you can use oversampling to enlarge it and still build a model. That model will not be very predictive, of course, but you can keep retraining it as more data arrives. Note that oversampling before cross-validating tends to give overly optimistic scores, because copies of the same row can land in both the training and the validation folds, so don't get too excited when you see the results...

See how to use oversampling here.
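As a rough illustration of the oversampling idea, here is a minimal standard-library sketch of random oversampling, which duplicates rows of the smaller class until the classes are balanced. The function name `random_oversample`, the `target` key and the toy rows are assumptions made up for this example. Dedicated libraries offer more elaborate variants.

```python
import random


def random_oversample(rows, target_key="target", seed=0):
    """Duplicate rows of the smaller classes (sampling with replacement)
    until every class has as many rows as the largest one."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row in rows:
        by_class.setdefault(row[target_key], []).append(row)

    largest = max(len(group) for group in by_class.values())

    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Draw the missing rows at random from the same class.
        balanced.extend(rng.choices(group, k=largest - len(group)))

    rng.shuffle(balanced)
    return balanced


# Toy, made-up rows: six negatives and two positives.
toy_rows = (
    [{"A": 0, "B": "b1", "C": float(i), "D": 0, "target": 0} for i in range(6)]
    + [{"A": 1, "B": "b3", "C": float(i), "D": 1, "target": 1} for i in range(2)]
)
balanced_rows = random_oversample(toy_rows)
```

To limit the overfitting problem mentioned above, one common approach is to split off the validation data first and oversample only the training portion, so that duplicated rows never appear on both sides of the split.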

