Forecasting Sales & Demand: The Path to an Accurate Prediction

Sales plays an important role in any business. Whether for operational purposes such as demand forecasting in supply chain management, or for strategic decisions in Sales & Operations Planning (S&OP) such as defining the future product portfolio, a good forecast is required. In this article, we present the key challenges associated with developing robust and efficient forecasts. We further describe the intuition behind various forecasting scenarios and how to choose appropriate modeling techniques to achieve the desired accuracy.
Technically Everything Can Be Forecasted
With a clear picture of what we want to forecast and of the related business requirements, we can focus on the analytical aspects of time series forecasting. Rob Hyndman, a fellow of the International Institute of Forecasters, states that “anything is forecastable, but not everything can be forecasted easily or accurately.” Therefore, we need to ensure that several conditions are satisfied. Since every use case is different, we discuss a variety of common aspects. These provide a starting point and intuition for sales and demand forecasting as a time series forecasting use case.
Sales and Demand Forecasting as a Time Series Problem

What is Time Series Forecasting?
In general, statistical forecasting means predicting the future of our product or service based on the past. Forecasting rests on the implicit assumption that observable past behaviours that affect the time series values continue into the future.
Naturally, it is impossible to forecast the unpredictable. For instance, in 2019 it was virtually impossible to account for the possibility of travel restrictions due to the Covid-19 pandemic when trying to forecast travel demand for 2020. Forecasting is a tool to predict the ordinary, not the surprising. Forecasts therefore need to be accurate in the context of the business; nevertheless, they do not need to be perfect to be extraordinarily useful [GluonTS].
Because the data points form a sequence of observations recorded over a period of time, we are in the discipline of “time series forecasting”. In the example of demand, it is the process of applying predictive analytics to historical data to estimate consumers’ future demand.
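The assumption that past behaviour continues into the future can be illustrated with one of the simplest possible baselines, a seasonal naive forecast. This is a minimal sketch, not the approach used in production: the function name, the season length of 4 and the demand numbers are all hypothetical and chosen only for illustration.

```python
def seasonal_naive_forecast(history, season_length, horizon):
    """Repeat the last observed season to forecast `horizon` future steps."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    # The forecast simply cycles through the most recent season.
    return [last_season[step % season_length] for step in range(horizon)]


# Hypothetical demand history with a repeating 4-period pattern (made-up numbers).
demand_history = [10, 20, 30, 40, 12, 22, 32, 42]
forecast = seasonal_naive_forecast(demand_history, season_length=4, horizon=6)
print(forecast)
```

A baseline like this is also useful as a reference point: a more sophisticated model should at least beat it.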
At foryouandyourcustomers, we use predictive analytics to deliver accurate demand forecasts, building on the following key aspects:
Data-driven and automated forecasting: Moving from manual forecasting to a data-driven and automated forecasting approach enables objectivity, improved accuracy and efficiency as well as scalability and transparency in production environments (see benefits of data-driven decision making).
Extensive selection of statistical models: Using an extensive selection of statistical models to meet the specific requirements of any use case (e.g., from the ARIMA, ETS, Theta or Croston families).
State-of-the-art machine learning models: Using state-of-the-art machine learning models (e.g., XGBoost, LightGBM or any scikit-learn model), including efficient feature engineering.
Novel proven neural forecasting models: A large collection of neural forecasting models selected for usability and robustness. The models range from classic neural networks such as MLPs and RNNs to novel, proven contributions such as NBEATS, NHITS and TFT. All are accurate, efficient and suited to running in production environments.
Exogenous variables: Exploiting additional information from external indicators such as macro-economic indicators, product hierarchies and many more.
Probabilistic forecasting: Since uncertainty is an intrinsic aspect of reality, probabilistic forecasting systematically quantifies the uncertainty around forecasts by producing the expected range of variation.
Local and global models: Local models are fitted to a single time series, while global models are trained across multiple similar time series and can exploit interdependencies between them.
And much more such as: hierarchical forecasting, long-horizon forecasting, intermittent/sparse series, outlier robustness, interpretable decomposition, cold start problem.
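To make the idea of probabilistic forecasting from the list above more concrete, the following sketch derives a simple prediction interval from the empirical errors of a naive "last value" forecast. It is an illustration under stated assumptions, not one of the models named above: the helper names, the 80% coverage level and the weekly demand figures are hypothetical.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list (0 <= q <= 1)."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def naive_interval_forecast(history, coverage=0.8):
    """Point forecast = last value; interval from past one-step naive errors."""
    # Error the naive forecast would have made at each past step.
    errors = sorted(history[t] - history[t - 1] for t in range(1, len(history)))
    tail = (1.0 - coverage) / 2.0
    point = history[-1]
    return point + quantile(errors, tail), point, point + quantile(errors, 1.0 - tail)


# Hypothetical weekly demand (made-up numbers).
weekly_demand = [100, 104, 98, 107, 103, 110, 105, 112, 108]
low, point, high = naive_interval_forecast(weekly_demand, coverage=0.8)
print(f"point forecast {point}, 80% interval [{low:.1f}, {high:.1f}]")
```

The width of the interval, rather than the point forecast alone, is what tells planners how much safety margin a decision needs.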
Forecasting the Right Thing
With this general understanding of forecasting, we now dive deeper into the relationship between the business objectives of sales and demand forecasting and their implications for statistical forecasts.
Forecasting the Right Thing: Demand vs. Sales vs. Market
The first thing we need to clarify is what we want to forecast in business. The following table shows the differences between the closely related fields of sales, demand, and the total market. Demand forecasting relates to the amount of a product or service customers want to purchase in a given period and is used to optimize operational processes. In contrast, sales forecasting focuses on how much a company will sell and is used for revenue projections, financial planning and marketing strategies. Total market forecasting estimates the overall demand for a product or service in the entire market, irrespective of the supplier, and is used for understanding market potential, market share analysis and strategic business decisions.
Demand vs. Sales vs. Market

Forecasting the Right Thing: B2B vs. B2C
While both B2B and B2C companies aim to predict demand and sales, the nature of their transactions, the factors they consider and the data they use can vary significantly. Effective forecasting in each sector requires understanding and accounting for these differences, as shown in this table:
B2C vs. B2B

Forecasting the Right Thing: Granularity & Temporality
Granularity is a key factor when talking about time series forecasting. Granularity and temporality define more specifically what we want to forecast.
Granularity, Temporality and Decisions

Granularity refers to the resolution or level of detail of the data points related to our product or service. It is crucial because it dictates the level of detail we can observe and therefore the types of analyses and conclusions we can draw. A typical aspect that drives the choice of granularity (from micro-level to macro-level) is the “material” we want to forecast, e.g., per product, product group, segment or brand. A second aspect is “localization”, for example per country, region, market or global.
Granularity Based on Material and Localisation
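Moving between granularity levels amounts to summing fine-grained records along the "material" and "localization" dimensions. The sketch below shows one way to do this with the standard library only; the record layout, field names and quantities are hypothetical.

```python
from collections import defaultdict

# Hypothetical SKU-level records: (product, product_group, country, units).
records = [
    ("sku-1", "shoes", "DE", 120),
    ("sku-2", "shoes", "DE", 80),
    ("sku-1", "shoes", "CH", 30),
    ("sku-3", "bags", "DE", 45),
    ("sku-3", "bags", "CH", 15),
]

FIELD_INDEX = {"product": 0, "product_group": 1, "country": 2}


def roll_up(rows, key_fields):
    """Sum units per combination of the chosen granularity fields."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[FIELD_INDEX[name]] for name in key_fields)
        totals[key] += row[3]
    return dict(totals)


per_group_and_country = roll_up(records, ["product_group", "country"])
per_group = roll_up(records, ["product_group"])
print(per_group_and_country)
print(per_group)
```

Aggregated series are typically smoother and easier to forecast, but the decision the forecast supports determines which level is actually needed.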

In essence, temporality covers the time frequency and the time horizon of the data. The frequency is the aggregation level we are interested in, e.g., daily or weekly. The forecasting horizon defines how far into the future we want to predict, such as a 2-month or 4-quarter horizon. Additionally, we need to decide how frequently we produce our forecasts (e.g., once or multiple times a day).
Temporality Based on Frequency and Horizon
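Frequency and horizon can be made explicit in code. The following sketch aggregates a daily series to ISO weeks and then lists the week start dates covered by a forecasting horizon. The start date, the synthetic daily values and the 8-week horizon are assumptions for illustration only.

```python
from collections import OrderedDict
from datetime import date, timedelta


def aggregate_weekly(daily_series):
    """Sum (date, units) pairs into ISO weeks, keyed by (iso_year, iso_week)."""
    weekly_totals = OrderedDict()
    for day, units in sorted(daily_series):
        iso = day.isocalendar()
        key = (iso[0], iso[1])
        weekly_totals[key] = weekly_totals.get(key, 0) + units
    return weekly_totals


start = date(2023, 1, 2)  # a Monday, so each ISO week is complete
# Synthetic daily demand for two weeks (made-up pattern).
daily = [(start + timedelta(days=offset), 10 + offset % 7) for offset in range(14)]
weekly = aggregate_weekly(daily)

# Horizon: the next 8 weeks after the observed history.
horizon_weeks = 8
forecast_week_starts = [
    start + timedelta(weeks=len(weekly) + step) for step in range(horizon_weeks)
]
print(list(weekly.items()))
print(forecast_week_starts[0], "to", forecast_week_starts[-1])
```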

Combined, both aspects define the characteristics of time series data. The following figure shows examples of multiple series of historic demand. As we can see, they vary in their structure. The series at the top has a smooth progression, while the time series at the bottom exhibits an irregular progression. These different statistical characteristics can and must be addressed with appropriate analytical methods and models.
Various Time Series and their Individual Characteristics

With a better understanding of the foundations of forecasting and of how our business requirements interact with the underlying data, let us come back to how this affects our ability to provide accurate forecasts.
Time series data can be categorised into statistical groups with their individual characteristics and forecastability. Each group should be addressed with suitable data science techniques, such as selecting an appropriate model or a suitable evaluation schema. To determine the group, and thus the forecastability, of a product, we apply two coefficients:
The Average Demand Interval (ADI): It measures the demand regularity in time by computing the average interval between two demands.
The Square of the Coefficient of Variation (CV²): It measures the variation in quantities.
Time Series Categories Based on ADI vs. CV² [frepple]
![Time Series Categories Based on ADI vs. CV² [frepple]](https://www.datocms-assets.com/88712/1698915407-screenshot-2023-09-21-at-13-11-04.png?auto=format&w=1839)
Based on these two dimensions, the literature classifies the demand profiles into four different categories:
Smooth demand (ADI < 1.32 and CV² < 0.49). The demand is very regular in time and in quantity. It is therefore relatively easy to forecast.
Intermittent demand (ADI >= 1.32 and CV² < 0.49). The demand history shows very little variation in demand quantity but a high variation in the interval between two demands. Though specific forecasting methods tackle intermittent demands, the forecast error margin is considerably higher.
Erratic demand (ADI < 1.32 and CV² >= 0.49). The demand has regular occurrences in time with high quantity variations. Your forecast accuracy remains shaky.
Lumpy demand (ADI >= 1.32 and CV² >= 0.49). The demand is characterised by a large variation in quantity and time. This makes it very hard to provide consistent, reliable forecasts.
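The classification above can be sketched in a few lines. The thresholds 1.32 and 0.49 are the ones quoted in the text. The exact formulas are an assumption based on a common convention: ADI is computed as the number of periods divided by the number of periods with non-zero demand, and CV² as the squared ratio of the (population) standard deviation to the mean of the non-zero demands. The example series are made up.

```python
from statistics import mean, pstdev

ADI_THRESHOLD = 1.32  # as quoted in the text
CV2_THRESHOLD = 0.49  # as quoted in the text


def classify_demand(series):
    """Return (ADI, CV², category) for one demand history."""
    nonzero = [value for value in series if value > 0]
    if not nonzero:
        raise ValueError("series contains no demand at all")
    # Assumed convention: periods per non-zero demand occurrence.
    adi = len(series) / len(nonzero)
    # Assumed convention: variation of the non-zero demand sizes only.
    cv2 = (pstdev(nonzero) / mean(nonzero)) ** 2
    if adi < ADI_THRESHOLD:
        category = "smooth" if cv2 < CV2_THRESHOLD else "erratic"
    else:
        category = "intermittent" if cv2 < CV2_THRESHOLD else "lumpy"
    return adi, cv2, category


# Hypothetical demand histories, one per category.
examples = {
    "a": [10, 12, 11, 9, 10, 11, 12, 10],
    "b": [0, 5, 0, 0, 6, 0, 5, 0],
    "c": [2, 30, 5, 50, 1, 40, 3, 25],
    "d": [0, 40, 0, 0, 1, 0, 0, 90],
}
for name, history in examples.items():
    adi, cv2, category = classify_demand(history)
    print(f"{name}: ADI={adi:.2f}, CV2={cv2:.2f} -> {category}")
```

In practice, the resulting category can be used to route each product to a suitable model family and evaluation schema.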
How to Measure Forecast Quality (Accuracy)?
Let us finish with some thoughts on how the quality or accuracy of our forecasts can be measured and evaluated. Since there is no one-size-fits-all metric to measure forecast accuracy (or error), we will examine the most important aspects.
Evaluation metrics, also known as performance metrics or accuracy and error metrics, are quantitative measurements used to evaluate the performance and quality of a model or algorithm in solving a particular problem. A few key points make metrics in time series forecasting distinctive:
Forecast accuracy (error): Error metrics focus on measuring the accuracy and emphasize the magnitude of errors in the forecasted values compared to the actual values.
Bias: Another key aspect of forecasting is the concept of over- and under-forecasting. We need to be aware that a forecasting model can have a structural bias, i.e., it consistently over- or under-forecasts.
Symmetry: The error measure should be symmetric in its inputs, i.e., forecast and ground truth. Over- and under-forecasting should be equally weighted.
Sensitivity to outliers: Depending on our business case and the related data, our metrics need to be robust against outliers (single predictions which are wildly off).
Aggregated metrics: In most business use cases, we would not be forecasting a single time series but rather a set of time series, related or unrelated. Therefore, we want to aggregate by using weights which include additional factors like price and volume.
Temporal relevance: As the name suggests, time series forecasting has the temporal aspect built into it, and there are metrics such as cumulative forecast error or forecast bias which take this temporal aspect into account.
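Several of the points above can be illustrated with three basic metrics and a simple volume-weighted aggregation. This is a sketch under assumptions: bias is defined here as the mean of forecast minus actual (so positive values mean over-forecasting), the weighting by total actual volume is just one possible choice, and the product names and numbers are hypothetical.

```python
from math import sqrt


def mae(actual, forecast):
    """Mean Absolute Error: average magnitude of the errors."""
    return sum(abs(f - a) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root Mean Squared Error: penalises large (outlier) errors more strongly."""
    return sqrt(sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def bias(actual, forecast):
    """Mean error, forecast minus actual (assumed sign convention)."""
    return sum(f - a for a, f in zip(actual, forecast)) / len(actual)


def volume_weighted(metric_values, volumes):
    """Aggregate per-series metric values, weighted by volume."""
    return sum(m * v for m, v in zip(metric_values, volumes)) / sum(volumes)


# Hypothetical (actual, forecast) pairs for two products.
series = {
    "product_a": ([100, 120, 80, 90], [110, 115, 95, 90]),
    "product_b": ([10, 0, 12, 8], [8, 2, 12, 10]),
}
per_series_mae = {name: mae(a, f) for name, (a, f) in series.items()}
volumes = [sum(a) for a, _ in series.values()]
overall_mae = volume_weighted(list(per_series_mae.values()), volumes)

for name, (a, f) in series.items():
    print(name, "MAE", mae(a, f), "RMSE", round(rmse(a, f), 2), "bias", bias(a, f))
print("volume-weighted MAE", round(overall_mae, 2))
```

Note that RMSE is never smaller than MAE on the same data. The gap between the two widens as individual errors become more extreme, which is one way to see the difference in outlier sensitivity.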
There are several evaluation metrics available because different metrics capture different aspects of model performance. Each metric focuses on a particular characteristic or requirement, allowing us to choose the most suitable metric based on our specific needs and objectives. The table below lists a selection of common error metrics and should give us an intuition of their characteristics. In practice, temporal relevance and aggregation are addressed by an appropriate evaluation method and are “independent” of the chosen metric. One or multiple metrics should be chosen to fit the requirements of each individual use case, especially as bias is an important factor. As time series forecasting has evolved (e.g., towards probabilistic forecasting), further ways to assess performance have become available.
Common Error Metrics and their Characteristics

Evaluation Metric on the Example of Intermittent Demand Forecasting
In short, intermittent demand forecasting deals with sparse demand and isolated high-demand peaks, as well as with the fact that over- and under-planning both have a crucial business impact. Which indicator should you use? Unfortunately, there is no definitive answer. Mean Absolute Error (MAE) is somewhat sensitive to isolated high-demand peaks and captures intermittency appropriately, but it lacks scale independence across multiple products and neglects bias. Note that reporting forecast performance with multiple metrics, e.g., MAE and bias on normalized data, is a good way to meet the requirements for a robust evaluation of intermittent demand forecasting.
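The following sketch shows why MAE and bias are best reported together for intermittent demand. Both metrics are normalized by total actual demand, which is one possible normalization and an assumption here, as are the function names, the demand history and the two candidate forecasts.

```python
def normalized_mae(actual, forecast):
    """Sum of absolute errors divided by total actual demand (scale-free)."""
    total = sum(actual)
    if total == 0:
        raise ValueError("cannot normalize by zero total demand")
    return sum(abs(f - a) for a, f in zip(actual, forecast)) / total


def normalized_bias(actual, forecast):
    """Sum of (forecast - actual) divided by total actual demand."""
    total = sum(actual)
    if total == 0:
        raise ValueError("cannot normalize by zero total demand")
    return sum(f - a for a, f in zip(actual, forecast)) / total


# Hypothetical intermittent demand history and two candidate forecasts.
actual = [0, 0, 12, 0, 0, 0, 9, 0]
candidates = {
    "always_zero": [0] * len(actual),
    "flat_level": [3] * len(actual),
}
report = {
    name: (normalized_mae(actual, fc), normalized_bias(actual, fc))
    for name, fc in candidates.items()
}
for name, (n_mae, n_bias) in report.items():
    print(f"{name}: normalized MAE={n_mae:.2f}, normalized bias={n_bias:+.2f}")
```

In this made-up example, the forecast that always predicts zero achieves the lower MAE even though it misses all demand, which only the bias reveals. Looking at MAE alone would therefore favour a forecast that systematically under-plans.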
Summary
In our rapidly evolving business landscape, the precision of sales and demand forecasting is paramount. By harnessing modern techniques in time series forecasting, businesses can navigate intricate variables, from granularity to temporality, ensuring optimised decision making. Whether it is understanding B2B versus B2C dynamics or selecting the right evaluation metric, modern forecasting methodologies illuminate the path to accurate, data-driven predictions that are vital for strategic success. Embrace the future with informed foresight.
We Support your Data-Driven Journey
This is the fourth of a series of articles on data-driven decision making, data science, AI, and machine learning. In the previous one, we learned about demand forecasting to optimize supply chain management in logistics from a business perspective. In the next article, we will take a closer look at how to productionise our data science solution and dive into the topic of MLOps.
We would be pleased to discuss your interest in these topics with you in person. Contact the Munich office or write an email to dat@foryouandyourcustomers.com.
foryouandyourcustomers is happy to support your company in becoming more data-driven.