
How can I predict the chance of an event happening using machine learning?

I have a dataset of events that includes their coordinates, the hour of the day, the day of the week, and the weather (temperature and precipitation) on that specific day. My goal is to predict the chance of the event occurring for a given set of these values. I have a fair amount of data on occurrences (about 1500 entries), but obviously none for cases where the event did not happen.

I looked at XGBoost because someone suggested it, but I can't figure out how to use it for this. The way I have applied it so far, it always returns 1.

This is my current implementation:

import pandas as pd
import xgboost as xgb

# rd is my dataframe

# I labeled everything with a 1 since xgboost needs to predict something.
# I have no clue how I could handle this better :)
rd["Label"] = 1

# Features and (constant) target
X, y = rd[["HourOfDay", "Type", "Lat", "Long", "DayOfWeek"]], rd["Label"]
xg_cl = xgb.XGBClassifier(objective="binary:logistic",
                          n_estimators=10, random_state=123)
xg_cl.fit(X, y)

# A single row to predict on
testdf = pd.DataFrame({
    "HourOfDay": [1],
    "Type": [2],
    "Lat": [0],
    "Long": [0],
    "DayOfWeek": [6]
})
preds = xg_cl.predict(testdf)

This code always gives me true (1), but I need it to return the chance of the event happening, and I am pretty sure my current implementation is useless.

Can somebody point me in the right direction on how to solve this issue?

When you label every data point with a 1 and train your model on those, of course the classifier will only predict 1. It has no idea another class even exists. You need examples of both classes in your training data, that is, rows labelled 0 where the event did not happen as well as rows labelled 1, in order to properly fit a classifier.
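The answer does not say where the examples of the other class should come from. One possible approach, shown here only as a rough sketch, is to sample random time and place combinations that do not appear among the recorded events and label them 0. The helper name `add_pseudo_negatives`, the 0-23 hour range, the 0-6 day range, and the bounding-box sampling are all assumptions made for illustration. Such sampled rows are only presumed non-events, so the resulting probabilities depend heavily on how many of them you draw.

```python
import random

def add_pseudo_negatives(events, n_negatives, seed=0):
    """Return the events labelled 1 plus randomly sampled rows labelled 0.

    events: list of dicts with keys HourOfDay, DayOfWeek, Lat, Long.
    Hypothetical helper: the sampled rows are *assumed* to be non-events.
    """
    rng = random.Random(seed)
    lats = [e["Lat"] for e in events]
    longs = [e["Long"] for e in events]
    # Combinations that really had an event; never reuse them as negatives.
    observed = {(e["HourOfDay"], e["DayOfWeek"], e["Lat"], e["Long"])
                for e in events}

    rows = [dict(e, Label=1) for e in events]
    added = 0
    while added < n_negatives:
        candidate = {
            "HourOfDay": rng.randrange(24),   # assumes hours are coded 0-23
            "DayOfWeek": rng.randrange(7),    # assumes days are coded 0-6
            # Sample inside the bounding box of the observed coordinates.
            "Lat": round(rng.uniform(min(lats), max(lats)), 3),
            "Long": round(rng.uniform(min(longs), max(longs)), 3),
        }
        key = (candidate["HourOfDay"], candidate["DayOfWeek"],
               candidate["Lat"], candidate["Long"])
        if key in observed:
            continue
        rows.append(dict(candidate, Label=0))
        added += 1
    return rows

# Made-up example events, not taken from the question's data.
events = [
    {"HourOfDay": 1, "DayOfWeek": 6, "Lat": 52.10, "Long": 4.30},
    {"HourOfDay": 14, "DayOfWeek": 2, "Lat": 52.20, "Long": 4.50},
]
rows = add_pseudo_negatives(events, n_negatives=4)
```

The returned list can be passed to `pd.DataFrame(rows)` to get a frame with a mixed `Label` column in place of the constant one.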

Besides that, XGBClassifier has a predict_proba method that returns the class probabilities rather than the class labels.
