I have built a data set from random numbers that contains the sales of each sales representative for all previous months, and I want to know whether there is a way to predict each representative's sales for the upcoming month. I'm not sure whether machine learning methods can be used here.
I am mostly asking for the best way to solve this: not necessarily code, but a method that suits this type of question. This is something I am interested in and would like to apply to bigger data sets in the future.
import pandas as pd

data = [[1, 55, 12, 25, 42, 66, 89, 75, 32, 43, 15, 32, 45],
        [2, 35, 28, 43, 25, 54, 76, 92, 34, 12, 14, 35, 63],
        [3, 13, 31, 15, 75, 4, 14, 54, 23, 15, 72, 12, 51],
        [4, 42, 94, 22, 34, 32, 45, 31, 34, 65, 10, 15, 18],
        [5, 7, 51, 29, 14, 92, 28, 64, 100, 69, 89, 4, 95],
        [6, 34, 20, 59, 49, 94, 92, 45, 91, 28, 22, 43, 30],
        [7, 50, 4, 5, 45, 62, 71, 87, 8, 74, 30, 3, 46],
        [8, 12, 54, 35, 25, 52, 97, 67, 56, 62, 99, 83, 9],
        [9, 50, 75, 92, 57, 45, 91, 83, 13, 31, 89, 33, 58],
        [10, 5, 89, 90, 14, 72, 99, 51, 29, 91, 34, 25, 2]]
df = pd.DataFrame(data, columns=['sales representative ID#',
                                 'January Sales Quantity',
                                 'February Sales Quantity',
                                 'March Sales Quantity',
                                 'April Sales Quantity',
                                 'May Sales Quantity',
                                 'June Sales Quantity',
                                 'July Sales Quantity',
                                 'August Sales Quantity',
                                 'September Sales Quantity',
                                 'October Sales Quantity',
                                 'November Sales Quantity',
                                 'December Sales Quantity'])
Your case with multiple sales representatives is more complex: because they are responsible for the same product, their performance may be correlated in complicated ways, on top of seasonality, autocorrelation, etc. Your data is not even a pure time series; it belongs to the class of so-called "panel" datasets. I've recently written a Python micro-package, salesplansuccess, which predicts the current (or next) year's annual sales from historic monthly sales data. A major assumption of that model is quarterly seasonality (more specifically, a repeating drift from the 2nd to the 3rd month of each quarter), which is more characteristic of wholesalers. The package is installed as usual with pip install salesplansuccess. You can modify its source code to better fit your needs. A minimal use case is below:
import pandas as pd
from salesplansuccess.api import SalesPlanSuccess
myHistoricalData = pd.read_excel('myfile.xlsx')
myAnnualPlan = 1000
sps = SalesPlanSuccess(data=myHistoricalData, plan=myAnnualPlan)
sps.fit()
sps.simulate()
sps.plot()
For a more detailed illustration of its use, you may want to refer to the Jupyter Notebook in its GitHub repository.
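To make the "panel" remark in this answer concrete, here is a rough sketch of reshaping a wide table (one column per month) into long form, with one row per representative and month. It uses only two representatives and three months from the question's data, and the short column names (rep_id, month, sales, month_no) are my own assumptions, not the asker's. The correlation matrix at the end is just one way to look at how representatives move together; with so few points it is illustrative only.

```python
import pandas as pd

# Two representatives and three months taken from the question's table.
# The short column names are illustrative assumptions.
wide = pd.DataFrame(
    [[1, 55, 12, 25],
     [2, 35, 28, 43]],
    columns=["rep_id", "January", "February", "March"],
)

# Wide -> long: one row per (representative, month) observation.
panel = wide.melt(id_vars="rep_id", var_name="month", value_name="sales")

# Map month names to numbers so the rows can be ordered in time.
month_number = {"January": 1, "February": 2, "March": 3}
panel["month_no"] = panel["month"].map(month_number)
panel = panel.sort_values(["rep_id", "month_no"]).reset_index(drop=True)

# Representatives as columns, months as rows, then pairwise correlation.
by_rep = panel.pivot(index="month_no", columns="rep_id", values="sales")
rep_corr = by_rep.corr()

print(panel)
print(rep_corr)
```

In long form, each representative's history is an ordinary time series that can be selected with a filter on rep_id, which is usually what forecasting code expects.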
Choose a prediction method and iterate over the representatives, fitting its parameters for each one. Here is a simple linear regression loop in Python that you can use. Over time you can replace it with something smarter.
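#!/usr/bin/python
data = [[1 , 55, 12, 25, 42, 66, 89, 75, 32, 43, 15, 32, 45],
(...)
# Month numbers 1..12; the first element of each row is the representative ID, not a sale
months = list(range(1, len(data[0])))
for row in data:
    # linear_regression stands for whichever fitting routine you choose
    linear_regression(months, row[1:])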
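The loop above calls linear_regression without defining it. Below is a minimal sketch of one possible version, assuming NumPy is available and using numpy.polyfit for the least-squares line. The function body, the forecasts dictionary, and the extrapolation to month 13 are my own additions for illustration. Only the first two rows of the question's data are used, and the representative ID in the first column is kept out of the fit.

```python
import numpy as np

# First two rows of the question's data: [rep_id, Jan, ..., Dec]
data = [
    [1, 55, 12, 25, 42, 66, 89, 75, 32, 43, 15, 32, 45],
    [2, 35, 28, 43, 25, 54, 76, 92, 34, 12, 14, 35, 63],
]


def linear_regression(x, y):
    """Fit a least-squares straight line; return (slope, intercept)."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return float(slope), float(intercept)


months = list(range(1, len(data[0])))  # 1..12, the ID column is excluded
next_month = months[-1] + 1            # month 13, i.e. the upcoming month

forecasts = {}
for row in data:
    rep_id, sales = row[0], row[1:]
    slope, intercept = linear_regression(months, sales)
    # Extrapolate the fitted line one step beyond the observed months.
    forecasts[rep_id] = slope * next_month + intercept

for rep_id, value in forecasts.items():
    print(f"rep {rep_id}: forecast for month {next_month} = {value:.1f}")
```

A straight trend line ignores seasonality and any correlation between representatives, so treat this as a baseline to compare smarter methods against rather than a finished forecast.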