How to increase accuracy of occupancy prediction?

I have a project that aims to predict the number of occupants at my local gym given the date and weather.

Here's my Kaggle kernel.

I have two datasets: occupants in a given hour and weather in a given hour. My process is to combine these two datasets and use Occupants as the target. However, when I implement a regression algorithm, I can only reach a prediction score of 57%.

I'd love any advice on how to modify my solution to achieve better predictions.

Thank you.

To improve the accuracy:

  • Do more feature engineering.
  • You can convert your categorical variables into one-hot encoded values.
  • You can add one more feature labeling each row as morning, afternoon, evening, or night, based on the hour from the timestamp, and you can also add a 'weekday/weekend' column.
  • You have not done much exploratory data analysis (EDA). Check for outliers and remove them.
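As a rough illustration of the points above, here is a minimal pandas sketch. The column names `timestamp` and `weather`, the part-of-day boundaries, the sample rows, and the 1.5 × IQR outlier rule are all assumptions made for the example; only `Occupants` comes from the question. Adjust them to match the actual merged dataset.

```python
import pandas as pd


def time_of_day(hour: int) -> str:
    """Map an hour (0-23) to a coarse label. The boundaries are an assumption."""
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "afternoon"
    if 17 <= hour < 22:
        return "evening"
    return "night"


def drop_outliers(df: pd.DataFrame, column: str = "Occupants", k: float = 1.5) -> pd.DataFrame:
    """Keep rows whose value lies within k * IQR of the quartiles (one common rule of thumb)."""
    q1, q3 = df[column].quantile([0.25, 0.75])
    iqr = q3 - q1
    mask = df[column].between(q1 - k * iqr, q3 + k * iqr)
    return df[mask]


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add time-of-day and weekend features, then one-hot encode the categoricals."""
    out = df.copy()
    ts = pd.to_datetime(out["timestamp"])  # hypothetical column name
    out["time_of_day"] = ts.dt.hour.map(time_of_day)
    out["is_weekend"] = (ts.dt.dayofweek >= 5).astype(int)  # Saturday=5, Sunday=6
    # "weather" is a hypothetical categorical column from the weather dataset.
    return pd.get_dummies(out, columns=["time_of_day", "weather"], dtype=int)


# Made-up sample rows standing in for the merged occupancy/weather data.
sample = pd.DataFrame({
    "timestamp": ["2019-03-02 08:00", "2019-03-02 18:00", "2019-03-04 13:00",
                  "2019-03-04 23:00", "2019-03-05 07:00", "2019-03-05 19:00"],
    "weather": ["rain", "clear", "clear", "rain", "clear", "clear"],
    "Occupants": [20, 45, 30, 5, 25, 400],
})

features = add_features(drop_outliers(sample))
print(features.head())
```

Whether outliers should be dropped at all depends on the data: an unusually busy hour may be a real event rather than a recording error, so it is worth inspecting the flagged rows before removing them.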
