SALES FORECASTING USING ARIMA AND PROPHET

A powerful planning tool to prepare for future possibilities
Introduction
Sales forecasting is the prediction of future sales. The same approach applies to any quantitative value, such as orders or counts, projected into the future. Forecasting is a process that allows a business to predict its sales revenue for a specific period of time. The forecast period could be weekly, quarterly, semi-annual or annual.
For analysing trends and predicting future revenue, time is a consequential factor. To evaluate future requirements, one must understand time series.
A time series tracks the movement of chosen data points, such as the high or low price of something, over a specified period of time, with the data points recorded at regular intervals.
Why Sales Forecasting?
Insights for Future Possibilities
Business Planning Tool
Anticipated Upswings in Sales
Reasons for Forecasting/Predictions
Commercial value is produced when we forecast sales, demand or a similar output from a time series. Basic business planning covers acquisition, collection, gains and comparison, and it rests firmly on values predicted from a historical dataset or time series.
Sales forecasting gives the business a set of forecasted sales or order values that it can expect for the next month or the next year.

A forecast helps a manufacturing company decide whether to cut down or step up production, the materials required, and so on.

Forecasting can be critical for cost cutting and for the success of the company.
This applies not only to manufacturing or business, but also to medical settings and to situations like today's global pandemic.
Different businesses use different techniques, such as regression analysis, to predict or forecast future values. Algorithmic approaches like ARIMA or Prophet help to find the trend in the values, how that trend will play out in the future, and how it can impact future values.
Sales Forecasting can manage several objectives
Budgeting: Predicting future requirements is a major role of sales forecasting. It helps the business owner make critical decisions about materials, payroll and rental costs, so that costs can be cut in a low-profit period. A static budget can be drawn up from the forecasted values. A project can then be started with efficient work methods and proper capital management, based on the sales forecast.
Resource allocation: A business can manage its internal resources, sales force and cash flow more easily. By shifting focus in advance, business owners can avoid problems. By tracking critical levels and trends, they can plan how to reach peak sales over a period of time.
Investors: Potential investors are a real source of gains. Forecasting gives an estimate, in advance, of the kind of RoI that potential investors can expect, which helps the business owner secure additional funding.
Production or Purchase: Keeping a check on production levels and material purchasing becomes easier, and cost cutting becomes more effective, which improves profit. For example, if demand for a product drops in summer, material purchases can be reduced accordingly.
Staff Management: Just like the tentative plans made for production and purchasing, staffing levels can be set according to forecasted sales. Heavy demand might lead the business owner to hire additional employees, while in a period of low anticipated demand the owner could trim working hours or slow the work rate rather than reduce the total staff.
What is the ARIMA model and how does it work?
AR + I + MA = Auto Regressive + Integrated + Moving Average
ARIMA is a model that explains a given time series based on its own past values, that is, its own lags and its lagged forecast errors. It is essentially an equation for forecasting future values.
ARIMA(p, d, q)
where p, d and q are hyperparameters.
The terms p, d and q must be tuned carefully.
p = the order of the AR term
q = the order of the MA term
and
d = the number of differences required to make the time series stationary
To decide which algorithm to use, an understanding of the data is a prerequisite. ARIMA expects a series that is stationary, or one that can be made stationary by differencing.
A series is stationary when its mean, variance and autocovariance stay constant over time, so these are the properties to check when detecting stationarity in a time series.
Implementation
1. Checking stationarity:
Rolling statistics
First, we check how the moving average or moving variance changes over time, using the pandas call .rolling(window).mean() (and .rolling(window).std() for the spread).
The question arises: why do we use rolling statistics?
Computing a rolling average helps to reveal trends that would otherwise be hard to see. Plotting the rolling mean over the original data makes such a trend visible.
A rolling mean, or moving mean, is a type of finite impulse response filter. It is nothing but a series of averages of different subsets of the whole dataset.
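As a minimal sketch of this check, the snippet below computes a 12-point rolling mean and rolling standard deviation with pandas. The series is synthetic (an upward trend plus noise), and the names sales and rolling_stats, as well as the window of 12, are assumptions made for illustration rather than anything taken from the original dataset.

```python
import numpy as np
import pandas as pd

def rolling_stats(series: pd.Series, window: int = 12) -> pd.DataFrame:
    """Return the rolling mean and rolling standard deviation of a series."""
    return pd.DataFrame({
        "rolling_mean": series.rolling(window=window).mean(),
        "rolling_std": series.rolling(window=window).std(),
    })

# Synthetic monthly data (hypothetical): linear upward trend plus random noise.
rng = np.random.default_rng(0)
index = pd.date_range("2015-01-01", periods=48, freq="MS")
sales = pd.Series(100 + 5 * np.arange(48) + rng.normal(0, 10, 48), index=index)

stats = rolling_stats(sales, window=12)

# The first window - 1 rows are NaN because a full window is not yet available.
print(stats.dropna().head())

# A rolling mean that keeps rising over time hints that the series is not stationary.
print(stats["rolling_mean"].iloc[-1] > stats["rolling_mean"].iloc[11])
```

If the rolling mean and rolling standard deviation stay roughly flat, the series is a better candidate for being stationary; here the rising mean points the other way.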
Approaches to make the time series stationary
One of the easiest ways is to take the difference, that is, to subtract the previous value from the current value. Depending on the complexity of the time series, more than one round of differencing may be needed.
Note: If the time series is already stationary, then d = 0.
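The effect of differencing can be seen on a small made-up series. This sketch assumes two hypothetical series, one with a perfectly linear trend and one with a quadratic trend; the values are chosen only for illustration.

```python
import pandas as pd

# Hypothetical series with a steady upward trend of +3 per step.
y = pd.Series([10, 13, 16, 19, 22, 25])

# First difference: current value minus previous value (same as y - y.shift(1)).
first_diff = y.diff().dropna()
print(first_diff.tolist())  # [3.0, 3.0, 3.0, 3.0, 3.0]

# A series with a quadratic trend needs two rounds of differencing (d = 2).
z = pd.Series([1, 4, 9, 16, 25, 36])
second_diff = z.diff().diff().dropna()
print(second_diff.tolist())  # [2.0, 2.0, 2.0, 2.0]
```

In the first case one difference already gives a constant series, so d = 1 would be enough; in the second case the trend only disappears after the second difference.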
While deciding on d, one must tune the p and q terms as well.
p is the order of the AR term of ARIMA, that is, the number of lags of Y used as predictors; q is the order of the MA term, that is, the number of lagged forecast errors used in the model.
AR model and MA model
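A pure AR model is given by:
$$Y_t = \alpha + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \dots + \beta_p Y_{t-p} + \epsilon_t$$
where $Y_t$ is the prediction at time t, $\alpha$ is a constant (the intercept term), $\beta_1$ is the coefficient of lag 1 (and so on up to lag p), and $\epsilon_t$ is the error.
A pure MA model is given by:
$$Y_t = \alpha + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \dots + \theta_q \epsilon_{t-q}$$
where $Y_t$ is the predicted output at timestamp t, $\alpha$ is the intercept, and $\epsilon_{t-1}, \dots, \epsilon_{t-q}$ are the error terms from the respective lags, weighted by the coefficients $\theta_1, \dots, \theta_q$.
Combining both, with $Y_t$ taken after differencing d times, gives the ARIMA model:
$$Y_t = \alpha + \beta_1 Y_{t-1} + \dots + \beta_p Y_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \dots + \theta_q \epsilon_{t-q}$$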
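To make the AR part concrete, here is a small sketch that simulates a pure AR model of order 1 and then recovers the intercept and the lag-1 coefficient by least squares. The function name simulate_ar1, the chosen values alpha = 1.0 and beta1 = 0.6, and the sample size are all assumptions for illustration.

```python
import numpy as np

def simulate_ar1(alpha: float, beta1: float, n: int, seed: int = 0) -> np.ndarray:
    """Simulate Y_t = alpha + beta1 * Y_{t-1} + eps_t with standard normal errors."""
    rng = np.random.default_rng(seed)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = alpha + beta1 * y[t - 1] + rng.normal()
    return y

y = simulate_ar1(alpha=1.0, beta1=0.6, n=2000)

# Recover the coefficients by regressing Y_t on Y_{t-1} (ordinary least squares).
X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
alpha_hat, beta1_hat = np.linalg.lstsq(X, y[1:], rcond=None)[0]
print(round(alpha_hat, 2), round(beta1_hat, 2))
```

The estimates should land close to the values used in the simulation, which is the sense in which the lagged value acts as a predictor in an AR model.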
Another way to understand the data and its stationarity is the Dickey-Fuller test.
In its augmented form, the null hypothesis is still the same as in the original Dickey-Fuller test. A key point to remember here is that the null hypothesis assumes the presence of a unit root, that is α = 1 (where α here is the coefficient on the first lag in the test regression), so the p-value obtained should be less than the significance level (say 0.05) in order to reject the null hypothesis and thereby infer that the series is stationary.
To choose p and q before applying ARIMA, one can inspect the ACF and PACF of the differenced series: the PACF suggests the AR order p and the ACF suggests the MA order q, while d is the number of differences that were needed.
The ACF is the (complete) autocorrelation function, which gives the autocorrelation of a series with its lagged values.
In time series analysis, the partial autocorrelation function (PACF) gives the partial correlation of a stationary time series with its own lagged values, after regressing out the values of the time series at all shorter lags.
from statsmodels.tsa.stattools import acf, pacf
# Autocorrelation and partial autocorrelation of the differenced log series, up to lag 20
lag_acf = acf(datasetLogDiffShift, nlags=20)
lag_pacf = pacf(datasetLogDiffShift, nlags=20, method='ols')
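To make clear what an autocorrelation function returns, here is a small sketch that computes the sample autocorrelation by hand with NumPy. The helper name sample_acf and the toy series are assumptions for illustration; it is a plain textbook estimator, not the statsmodels implementation.

```python
import numpy as np

def sample_acf(x, nlags: int) -> np.ndarray:
    """Sample autocorrelation of x at lags 0..nlags (standard biased estimator)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    denom = np.sum(centered ** 2)
    return np.array([
        np.sum(centered[k:] * centered[:len(x) - k]) / denom
        for k in range(nlags + 1)
    ])

# Toy series (hypothetical) that alternates, so neighbouring points are negatively correlated.
series = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
values = sample_acf(series, nlags=2)
print(values)  # lag 0 = 1.0, lag 1 = -0.875, lag 2 = 0.75
```

Lag 0 is always 1, and the sign flip between lag 1 and lag 2 reflects the alternating pattern of the toy series.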
We then apply the ARIMA model with candidate orders read off the ACF and PACF plots.
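import matplotlib.pyplot as plt
# statsmodels.tsa.arima_model was removed in newer statsmodels releases; use arima.model instead
from statsmodels.tsa.arima.model import ARIMA
# order is (p, d, q)
model = ARIMA(mydata_logScale, order=(2, 1, 2))
model1 = ARIMA(mydata_logScale, order=(0, 1, 2))
Graph for order (0, 1, 2):
results_AR = model1.fit()  # the newer API no longer accepts disp
plt.plot(datasetLogDiffShift)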
Predictions using ARIMA
PROPHET MODEL FOR TIME SERIES FORECASTING
Prophet is a procedure for forecasting time series data based on an additive model where nonlinear trends are fit with yearly, weekly and daily seasonality plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data.
Note: The 'timestamp' column, which captures the date and time, and the column holding the value to forecast must be renamed to 'ds' and 'y' respectively.
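import pandas as pd
myprophetdata.head()
# Build the two-column frame Prophet expects: 'ds' (dates) and 'y' (values to forecast)
df = pd.DataFrame()
df['ds'] = dataset['Month']
df['y'] = mydata['Sales']
df.head()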
By applying shift() we can try to make the data stationary. Remember that the time series does not necessarily become stationary after a shift is applied. The non-stationarity might be the result of anomalous behaviour of the data over time. Applying multiple shifts might or might not make the data stationary. Other methods for achieving stationarity exist, depending on the trends or seasonality present in the data.
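After importing the library (from release 1.0 onwards the package is published as prophet, in which case the import is from prophet import Prophet):
from fbprophet import Prophet
Fitting the data takes only two commands:
Model = Prophet()
Model.fit(df)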
In the line of code below, we create a pandas dataframe that extends the history by 10 future periods (periods=10) at a monthly frequency (freq='M', the pandas alias for month end). If working with daily data, we would leave out the freq argument, since daily is the default.
future_dates = Model.make_future_dataframe(periods=10, freq='M')
Predictions
Predictions = Model.predict(future_dates)
Predictions.head()
Note: The predictions dataframe contains several columns besides the forecast, such as trend, yhat_lower, yhat_upper, additive_terms, additive_terms_lower, etc.
The important ones (for now) are 'ds' (datetime), 'yhat' (forecast), and 'yhat_lower' and 'yhat_upper' (the bounds of the uncertainty interval).
We can visualise and predict the value of y (sales, orders or demand) for the next quarter, the next three quarters, half a year or a full year by changing the periods and freq arguments of make_future_dataframe(periods, freq).
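To see what different periods and freq choices mean in practice, the sketch below builds comparable date grids with pandas alone, without calling Prophet. It is only an approximation of the future dates such a call would cover; the helper future_dates_after and the last observed date are assumptions for illustration.

```python
import pandas as pd

def future_dates_after(last_date: str, periods: int, freq: str) -> pd.DatetimeIndex:
    """Return `periods` dates that follow last_date at the given frequency."""
    # Generate one extra point and drop anything not strictly after last_date.
    grid = pd.date_range(start=last_date, periods=periods + 1, freq=freq)
    return grid[grid > pd.Timestamp(last_date)][:periods]

last_observed = "2020-12-01"  # hypothetical last date in the history

# Next three quarters (quarter-start dates).
quarters = future_dates_after(last_observed, periods=3, freq="QS")
print(quarters)

# Next twelve months (month-start dates), i.e. one year ahead.
months = future_dates_after(last_observed, periods=12, freq="MS")
print(months)
```

The same periods and freq values could then be passed to make_future_dataframe to forecast over the matching horizon.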