{ "cells": [ { "cell_type": "markdown", "id": "f029aeea", "metadata": {}, "source": [ "# Forecasting Inflation Rates with SARIMA\n", "\n", "**SARIMA** extends the regular ARIMA model with a seasonal component, which lets it capture seasonal effects that plain ARIMA cannot represent.\n", "To produce SARIMA$(p, d, q)(P, D, Q)_m$, seasonal autoregressive and moving-average terms are added to the classic ARIMA equation:\n", "\n", "$$\n", "y_t' = c + \\sum_{n=1}^{p} \\phi_n y_{t-n}' + \\sum_{n=1}^{q} \\theta_n \\varepsilon_{t-n} \n", "+ \\sum_{n=1}^{P} \\eta_n y_{t-mn}' + \\sum_{n=1}^{Q} \\omega_n \\varepsilon_{t-mn} + \\varepsilon_t\n", "$$\n", "\n", "Where:\n", "\n", "- $y_t'$: the differenced time series, after both regular ($d$) and seasonal ($D$) differencing\n", "- $P$: number of seasonal autoregressive terms\n", "- $\\eta_n$: coefficients of the seasonal autoregressive terms\n", "- $Q$: number of seasonal moving-average terms\n", "- $\\omega_n$: coefficients of the seasonal moving-average terms (lagged seasonal forecast errors)\n", "- $m$: length of the season (number of observations per seasonal cycle)\n", "\n", "### Limitation of SARIMA\n", "\n", "One key limitation of the **SARIMA** model is that it **cannot handle multiple seasonalities**.\n", "It captures only a **single seasonal pattern** (e.g., monthly or weekly), which makes it unsuitable for datasets that exhibit **more than one seasonal cycle** (e.g., both daily and yearly patterns)." ] }, { "cell_type": "markdown", "id": "87ea7b85", "metadata": {}, "source": [ "## Key Considerations\n", "\n", "### 1. 
Stationarity Requirement\n", "\n", "SARIMA models assume that the series is **stationary** once it has been differenced, meaning:\n", "- No long-term trend or persistent seasonality\n", "- Constant mean and variance over time\n", "\n", "To achieve stationarity:\n", "- **Mean stabilization** is done through differencing:\n", "  - Regular differencing → order $d$\n", "  - Seasonal differencing → order $D$\n", "- **Variance stabilization** can be achieved using:\n", "  - Log transformation\n", "  - Box-Cox transformation\n", "\n", "These transformations even out the amplitude of seasonal fluctuations over time.\n", "\n", "---\n", "\n", "### 2. Order Selection\n", "\n", "Once the series is stationary, we determine the model orders:\n", "\n", "#### a. Differencing Orders\n", "- $d$: Number of regular differences\n", "- $D$: Number of seasonal differences\n", "\n", "Use the **Augmented Dickey-Fuller (ADF)** test to assess whether differencing is required: if the test cannot reject a unit root (for example, a p-value above 0.05), difference the series and test again.\n", "\n", "#### b. AR and MA Orders\n", "- Regular terms: $p$ (AR), $q$ (MA)\n", "- Seasonal terms: $P$ (SAR), $Q$ (SMA)\n", "\n", "Analyze:\n", "- **Partial Autocorrelation Function (PACF)** → suggests the AR orders ($p$ from the first few lags, $P$ from lags at multiples of $m$)\n", "- **Autocorrelation Function (ACF)** → suggests the MA orders ($q$ from the first few lags, $Q$ from lags at multiples of $m$)\n", "\n", "---\n", "\n" ] }, { "cell_type": "markdown", "id": "7eb25e04", "metadata": {}, "source": [ "# `Data Loading and Visualization`" ] }, { "cell_type": "code", "execution_count": 1, "id": "f19ee008", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 2, "id": "59d09afb", "metadata": {}, "outputs": [], "source": [ "infl_df = pd.read_csv('I:/CQAI/TSA/Notebooks/inflation rates dataset.csv')" ] }, { "cell_type": "code", "execution_count": 3, "id": "bfeb00f9", "metadata": {}, "outputs": [], "source": [ "# infl_df[\"Date\"] =pd.to_datetime(infl_df['Year'].astype(str) + '-' + 
infl_df['Month'].astype(str) + '-01')\n", "# Parse the Date column into datetime values\n", "infl_df[\"Date\"] = pd.to_datetime(infl_df[\"Date\"])" ] }, { "cell_type": "code", "execution_count": 4, "id": "27f5528e", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>Unnamed: 0</th><th>Monthly_Inflation</th></tr>\n",
"    <tr><th>Date</th><th></th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1960-01-01</th><td>0</td><td>NaN</td></tr>\n",
"    <tr><th>1960-02-01</th><td>1</td><td>1.000000</td></tr>\n",
"    <tr><th>1960-03-01</th><td>2</td><td>0.991323</td></tr>\n",
"    <tr><th>1960-04-01</th><td>3</td><td>1.008753</td></tr>\n",
"    <tr><th>1960-05-01</th><td>4</td><td>1.006508</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td></tr>\n",
"    <tr><th>2019-09-01</th><td>716</td><td>1.006250</td></tr>\n",
"    <tr><th>2019-10-01</th><td>717</td><td>1.009317</td></tr>\n",
"    <tr><th>2019-11-01</th><td>718</td><td>1.009231</td></tr>\n",
"    <tr><th>2019-12-01</th><td>719</td><td>1.006098</td></tr>\n",
"    <tr><th>2020-01-01</th><td>720</td><td>NaN</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>721 rows × 2 columns</p>\n", "