
Predicting Sales Data with Python

I have made a data set of random numbers containing each sales representative's sales for all previous months, and I want to know whether there is a way to predict each representative's sales for the upcoming month. I'm not sure whether machine learning methods can be used here.

I am mostly asking for the best way to solve this: not necessarily code, but a method that suits this type of question. This is something I am interested in and would like to apply to bigger data sets in the future.

import pandas as pd

data = [[1 , 55, 12, 25, 42, 66, 89, 75, 32, 43, 15, 32, 45],
        [2 , 35, 28, 43, 25, 54, 76, 92, 34, 12, 14, 35, 63],
        [3 ,13, 31, 15, 75, 4, 14, 54, 23, 15, 72, 12, 51],
        [4 ,42, 94, 22, 34, 32, 45, 31, 34, 65, 10, 15, 18],
        [5 ,7, 51, 29, 14, 92, 28, 64, 100, 69, 89, 4, 95],
        [6 , 34, 20, 59, 49, 94, 92, 45, 91, 28, 22, 43, 30],
        [7 , 50, 4, 5, 45, 62, 71, 87, 8, 74, 30, 3, 46],
        [8 , 12, 54, 35, 25, 52, 97, 67, 56, 62, 99, 83, 9],
        [9 , 50, 75, 92, 57, 45, 91, 83, 13, 31, 89, 33, 58],
        [10 , 5, 89, 90, 14, 72, 99, 51, 29, 91, 34, 25, 2]]

df = pd.DataFrame (data, columns = ['sales representative ID#',
        'January Sales Quantity',
        'February Sales Quantity',
        'March Sales Quantity',
        'April Sales Quantity',
        'May Sales Quantity' ,
        'June Sales Quantity',
        'July Sales Quantity',
        'August Sales Quantity',
        'September Sales Quantity',
        'October Sales Quantity',
        'November Sales Quantity',
        'December Sales Quantity'])

Your case with multiple sales representatives is more complex: because they are responsible for the same product, their performance may be correlated in complex ways, on top of seasonality, autocorrelation, etc. Your data is not even a pure time series; it belongs to the class of so-called "panel" datasets. I've recently written a Python micro-package, salesplansuccess, which predicts the current (or next) year's annual sales from historical monthly sales data. A major assumption of that model, however, is quarterly seasonality (more specifically, a repeating drift from the 2nd to the 3rd month of each quarter), which is more characteristic of wholesalers. The package is installed as usual with pip install salesplansuccess. You can modify its source code to better fit your needs. A minimal use case is below:

import pandas as pd
from salesplansuccess.api import SalesPlanSuccess
myHistoricalData = pd.read_excel('myfile.xlsx')
myAnnualPlan = 1000
sps = SalesPlanSuccess(data=myHistoricalData, plan=myAnnualPlan)
sps.fit()
sps.simulate()
sps.plot()

For a more detailed illustration of its use, you may want to refer to the Jupyter Notebook in its GitHub repository.
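To make the "panel" remark concrete, here is a minimal sketch of reshaping the wide table from the question (one row per rep, one column per month) into a long layout with one row per rep and month, which is the shape most panel and time-series tools expect. The shortened column names (rep_id, January, ...) and the month_num column are assumptions made for this illustration; only two reps and three months from the question's data are used.

```python
import pandas as pd

# Wide layout as in the question, cut down to two reps and three months.
# Column names are shortened here for illustration.
wide = pd.DataFrame(
    [[1, 55, 12, 25],
     [2, 35, 28, 43]],
    columns=["rep_id", "January", "February", "March"],
)

# Remember the chronological order of the month columns before reshaping.
month_cols = list(wide.columns[1:])
month_number = {name: i + 1 for i, name in enumerate(month_cols)}

# One row per (rep, month) observation: the "long" panel layout.
long = wide.melt(id_vars="rep_id", var_name="month", value_name="sales")

# Add a numeric month index so each rep's rows can be sorted in time.
long["month_num"] = long["month"].map(month_number)
long = long.sort_values(["rep_id", "month_num"]).reset_index(drop=True)

print(long)
```

In this layout each rep's history is a contiguous block of rows, so per-rep models or a single model with rep_id as a feature can both be tried on the same frame.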

Choose a prediction method and iterate over the reps, calculating the parameters for each one. Here is a simple linear regression in Python that you can use. Over time you can add something smarter.

#!/usr/bin/python

data = [[1 , 55, 12, 25, 42, 66, 89, 75, 32, 43, 15, 32, 45], 
        (...)

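# The first column of each row is the rep ID, so the months run 1..len(row)-1.
months = list(range(1, len(data[0])))

for row in data:
    rep_id, sales = row[0], row[1:]
    # linear_regression stands for whichever fitting routine you choose
    linear_regression(months, sales)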
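The loop above calls linear_regression without defining it. One possible definition, offered as a sketch rather than as the answerer's own code, fits a straight line with numpy.polyfit and extrapolates one month ahead. It assumes NumPy is installed; the return value (slope, intercept, forecast) and the forecasts dictionary are choices made for this example, and only two rows of the question's data are repeated to keep it short.

```python
import numpy as np

# Two rows copied from the question; the first column is the rep ID.
data = [[1, 55, 12, 25, 42, 66, 89, 75, 32, 43, 15, 32, 45],
        [2, 35, 28, 43, 25, 54, 76, 92, 34, 12, 14, 35, 63]]


def linear_regression(months, sales):
    """Least-squares fit of sales = slope * month + intercept.

    Returns (slope, intercept, forecast), where forecast is the fitted
    line evaluated one month after the last observed month.
    """
    slope, intercept = np.polyfit(months, sales, deg=1)
    next_month = months[-1] + 1
    return slope, intercept, slope * next_month + intercept


# Skip the ID column: months are numbered 1..12.
months = list(range(1, len(data[0])))

forecasts = {}
for row in data:
    rep_id, sales = row[0], row[1:]
    slope, intercept, forecast = linear_regression(months, sales)
    forecasts[rep_id] = forecast
    print(f"rep {rep_id}: slope={slope:.2f}, next-month forecast={forecast:.1f}")
```

With only twelve noisy points per rep, a straight line is a rough baseline at best; comparing it against something even simpler, such as each rep's mean, is a reasonable sanity check before moving to anything smarter.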


 