# Getting Started

# Intro to Forecasting

Forecasting matters in many settings, for example predicting the volume of goods required or the price of a stock.
Forecasts may be needed several years in advance or only a few minutes ahead.
Some things are easier to forecast (the time of tomorrow's sunrise) than others (the output of a lottery machine).
The predictability of an event depends on several factors:
- How well we understand the factors that contribute to it
- How much data is available
- How similar the future is to the past
- Whether the forecasts can affect the thing we are trying to forecast (can our forecasts influence the market/situation)
Based on these factors, predicting electricity prices in a city is generally easier and more accurate than predicting currency exchange rates, for example.
Good forecasts capture the genuine patterns and relationships which exist in the historical data, but do not replicate past events that will not occur again.
Even in a changing environment, a good forecasting model can be developed: it captures the way in which the environment is changing.
- The usual assumption is that the way in which the environment is changing will continue into the future.
Forecasting methods can be simple, such as using the most recent observation as a forecast (which is called the naïve method), or highly complex, such as neural nets and econometric systems of simultaneous equations.
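
As a minimal sketch of the naïve method mentioned above, the function below repeats the most recent observation for every future step. The function name `naive_forecast` and the sales figures are hypothetical, invented only for illustration.

```python
def naive_forecast(history, horizon):
    """Repeat the most recent observation for each of the next `horizon` steps."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon


# Hypothetical monthly sales figures, invented for illustration.
monthly_sales = [112, 118, 132, 129, 121]
forecasts = naive_forecast(monthly_sales, horizon=3)
print(forecasts)  # [121, 121, 121]
```

Despite its simplicity, a forecast like this is often used as a baseline that more complex methods should beat.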

# Forecasting, goals and planning

Forecasting is different from goals and planning:
- Forecasting is about predicting the future as accurately as possible, given all of the information available, including historical data and knowledge of any future events that might impact the forecasts.
- Goals are what you would like to happen and are linked to forecasts and plans.
- Planning is a response to forecasts and goals and involves determining the appropriate actions that are required to make your forecasts match your goals.
Types of forecasting:
- Short-term forecasts are needed for the scheduling of personnel, production and transportation.
- Medium-term forecasts are needed to determine future resource requirements.
- Long-term forecasts are used in strategic planning.

# What to forecast?

In the early stages of a forecasting project, decisions need to be made about what should be forecast.
It is necessary to consider:
- The horizon: will forecasts be required one month in advance, six months in advance, or ten years in advance?
- The frequency: how often forecasts are needed.
- The needs of the people who will use the forecasts, which are best learned by talking to domain experts.

# Forecasting data and methods

Qualitative forecasting - Used when no data are available, or when the available data are not relevant to the forecasts. These methods are well-developed, structured approaches rather than guesswork.
Quantitative forecasting - Used when numerical information about the past is available and it is reasonable to assume that some aspects of the past patterns will continue into the future.

# Time series forecasting

Examples of time series include:
- Annual Google profits
- Quarterly sales results for Amazon
- Monthly rainfall
- Weekly retail sales
Anything that is observed sequentially over time is a time series.
When forecasting time series data, the aim is to estimate how the sequence of observations will continue into the future.
For example, in a typical plot of forecasts with shaded prediction intervals:
- The dark shaded region shows 80% prediction intervals. That is, each future value is expected to lie in the dark shaded region with a probability of 80%.
- The light shaded region shows 95% prediction intervals.
- These prediction intervals are a useful way of displaying the uncertainty in forecasts.
Decomposition methods are helpful for studying the trend and seasonal patterns in a time series.
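
To make the 80% and 95% prediction intervals described in this section concrete, here is a rough sketch for the naïve method. It assumes the one-step errors are normally distributed and uncorrelated, so that the h-step standard deviation is the one-step value multiplied by the square root of h. The function name, the data, and the choice of error estimate (root mean squared one-step error) are assumptions made for illustration.

```python
import math

# Standard normal multipliers for central 80% and 95% intervals (rounded).
Z_80 = 1.2816
Z_95 = 1.9600


def naive_prediction_interval(history, h, z):
    """Interval around an h-step naive forecast under a normality assumption."""
    if len(history) < 2:
        raise ValueError("need at least two observations")
    # One-step naive errors are the differences between consecutive observations.
    errors = [b - a for a, b in zip(history, history[1:])]
    # Root mean squared one-step error, used here as the error standard deviation.
    sigma = math.sqrt(sum(e * e for e in errors) / len(errors))
    point = history[-1]
    half_width = z * sigma * math.sqrt(h)
    return point - half_width, point + half_width


# Hypothetical observations, invented for illustration.
history = [10, 12, 11, 13, 12]
low80, high80 = naive_prediction_interval(history, h=1, z=Z_80)
low95, high95 = naive_prediction_interval(history, h=1, z=Z_95)
print(f"80% interval: {low80:.2f} to {high80:.2f}")
print(f"95% interval: {low95:.2f} to {high95:.2f}")
```

The 95% interval is always wider than the 80% interval, and both widen as the horizon `h` grows, which matches the nested shaded regions seen in forecast plots.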

# Predictor variables and time series forecasting

Suppose we wish to forecast the hourly electricity demand (ED).
- A model with predictor variables (an explanatory model) is of the form $$\begin{align*} \text{ED} = & f(\text{current temperature, strength of economy, population,}\\ & \qquad\text{time of day, day of week, error}). \end{align*}$$
  - The “error” term on the right allows for random variation and for the effects of relevant variables that are not included in the model (external factors that are not among the predictor variables).
- A time series model is of the form $$\text{ED}_{t+1} = f(\text{ED}_{t}, \text{ED}_{t-1}, \text{ED}_{t-2}, \text{ED}_{t-3},\dots, \text{error}).$$
  - Here, prediction of the future is based on past values of a variable, but not on external variables that may affect the system.
- A third type of model that combines both features is of the form $$\text{ED}_{t+1} = f(\text{ED}_{t}, \text{current temperature, time of day, day of week, error}).$$
  - These are known variously as dynamic regression models, panel data models, longitudinal models, transfer function models, and linear system models.
  - An explanatory model is useful because it incorporates information about other variables, rather than only historical values of the variable to be forecast.
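
The time series form above can be sketched in code under a strong simplifying assumption: that `f` is linear and uses only the single most recent value, so that the next value equals `a + b` times the current value, plus error. The function names `fit_ar1` and `forecast_ar1` and the demand figures are hypothetical; this is one possible instance of the general form, not the only one.

```python
def fit_ar1(series):
    """Least-squares fit of series[t+1] = a + b * series[t] + error."""
    x = series[:-1]  # current values
    y = series[1:]   # next values
    n = len(x)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("series is constant; the slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def forecast_ar1(series, a, b, horizon):
    """Iterate the fitted equation forward, feeding each forecast back in."""
    forecasts = []
    last = series[-1]
    for _ in range(horizon):
        last = a + b * last
        forecasts.append(last)
    return forecasts


# Hypothetical hourly electricity demand values, invented for illustration.
demand = [310.0, 325.0, 342.0, 338.0, 351.0, 347.0, 360.0]
a, b = fit_ar1(demand)
print(forecast_ar1(demand, a, b, horizon=3))
```

A mixed model would add predictors such as temperature to the right-hand side, which requires multiple regression rather than the single-predictor fit shown here.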
A forecaster might prefer a pure time series model for several reasons:
- The system may not be understood, and even if it is, the relationships that govern its behaviour may be difficult to measure.
- An explanatory model requires knowing or forecasting the future values of the various predictors before the variable of interest can be forecast, and this may be too difficult.
- The main concern may be predicting what will happen rather than explaining why it happens.
- The time series model may give more accurate forecasts than an explanatory or mixed model.

# Steps in forecasting

Problem definition
- Defining the problem carefully requires an understanding of the way the forecasts will be used, who requires the forecasts, and how the forecasting function fits within the organisation requiring the forecasts.
Gathering information
- Two types of information are required:
  - Statistical data.
  - The accumulated expertise of the people who collect the data and use the forecasts.
Preliminary (exploratory) analysis
- Plot the data and check for patterns, trends, seasonalities, outliers, correlations, etc.
Choosing and fitting models
- The best model to use depends on the availability of historical data, the strength of relationships between the forecast variable and any explanatory variables, and the way in which the forecasts are to be used.
Using and evaluating a forecasting model
- The performance of the model can only be properly evaluated after the data for the forecast period have become available.
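
Since the real future data are not available when a model is built, one common stand-in (an assumption here, not something stated above) is to hold back the most recent observations, forecast them from the earlier ones, and compare. The sketch below does this with a naïve forecast and the mean absolute error; the data and names are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual values and forecasts."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical weekly sales, invented for illustration.
observations = [20, 22, 21, 25, 24, 27, 26, 30]

# Hold back the last three observations to play the role of the forecast period.
train, test = observations[:-3], observations[-3:]

# Naive forecast: repeat the last training observation.
forecasts = [train[-1]] * len(test)

mae = mean_absolute_error(test, forecasts)
print(f"MAE on held-out data: {mae:.2f}")
```

The same split can be reused to compare several candidate models on equal terms before the true forecast period arrives.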

# The statistical forecasting perspective

The thing we are trying to forecast is unknown, and so we can think of it as a random variable.
In most forecasting situations, the variation associated with the thing we are forecasting will shrink as the event approaches.
For example, in a plot of possible future values of a series:
- The lines show possible values in the future based on several models.
- When we obtain a forecast, we are estimating the middle of the range of possible values the random variable could take.
- Often, a forecast is accompanied by a prediction interval giving a range of values the random variable could take with relatively high probability.
- Rather than plotting individual possible futures, we usually show these prediction intervals instead.
- The blue line is the average of the possible future values, which we call the point forecasts.
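
The idea of possible futures, a point forecast, and a prediction interval can be illustrated with a small simulation. This sketch assumes, purely for illustration, that the series follows a random walk with normally distributed steps; the function names, parameters, and the crude empirical quantile rule are all hypothetical choices.

```python
import random
import statistics


def simulate_futures(last_value, step_sd, horizon, n_paths, seed=0):
    """Simulate possible future paths of a random walk from last_value."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        value = last_value
        path = []
        for _ in range(horizon):
            value += rng.gauss(0.0, step_sd)
            path.append(value)
        paths.append(path)
    return paths


def summarise(paths, lower_q=0.10, upper_q=0.90):
    """Per-step point forecast (mean) and an empirical central interval."""
    summary = []
    for values in zip(*paths):  # all simulated values at one future step
        ordered = sorted(values)
        low = ordered[int(lower_q * (len(ordered) - 1))]
        high = ordered[int(upper_q * (len(ordered) - 1))]
        summary.append((statistics.fmean(values), low, high))
    return summary


paths = simulate_futures(last_value=100.0, step_sd=2.0, horizon=5, n_paths=2000)
for step, (point, low, high) in enumerate(summarise(paths), start=1):
    print(f"h={step}: point={point:.1f}, 80% interval=({low:.1f}, {high:.1f})")
```

The intervals widen as the horizon increases, which is the same behaviour read the other way round: uncertainty shrinks as the event being forecast approaches.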
Formulation:
- We denote all the information observed as $I$, and we want to forecast $y_t$.
- We can then write $y_t | I$, meaning the random variable $y_t$ given what we know in $I$. The set of values that this random variable could take, along with their relative probabilities, is known as the “probability distribution” of $y_t | I$. In forecasting, we call this the forecast distribution.
- We write the forecast of $y_t$ as $\hat{y}_t$, meaning the average of the possible values that $y_t$ could take given everything we know.
- $\hat{y}_{T+h|T}$ means the forecast of $y_{T+h}$ taking account of $y_1, \dots, y_T$, that is, an $h$-step forecast.