Bayesian Unobserved Component Models
Abstract. We present the basics of Bayesian estimation and inference for unobserved component models on the example of a local-level model. The range of topics includes the conjugate prior analysis using normal-inverted-gamma 2 distribution and its extensions focusing on hierarchical modelling, conditional heteroskedasticity, and Student-t error terms. We scrutinise Bayesian forecasting and sampling from the predictive density.
Keywords. Unobserved Component Models, Local-Level Model, State-Space Bayesian Inference, Forecasting, Heteroskedasticity, Hierarchical Modelling, Gibbs Sampler, Simulation Smoother, Precision Sampling
Unobserved component models
Unobserved Component (UC) models are a popular class of models in macroeconometrics that use the state-space representation for unit-root nonstationary time series. The simple formulation of the model equations decomposing the series into a non-stationary and stationary component facilitates economic interpretations and good forecasting performance.
A simple local-level model
The model is set for a univariate time series whose observation at time \(t\) is denoted by \(y_t\). It decomposes the variable into a stochastic trend component, \(\tau_t\), and a stationary error component, \(\epsilon_t\). The former follows a Gaussian random walk process with the conditional variance \(\sigma_\eta^2\), and the latter is zero-mean normally distributed with the variance \(\sigma^2\). These are expressed as the model equations: \[\begin{align} y_t &= \tau_t + \epsilon_t,\\ \tau_t &= \tau_{t-1} + \eta_t,\\ \epsilon_t &\sim\mathcal{N}\left(0, \sigma^2\right),\\ \eta_t &\sim\mathcal{N}\left(0, \sigma_\eta^2\right), \end{align}\] where the initial condition \(\tau_0\) is a parameter of the model.
Matrix notation for the model
To simplify the notation and the derivations introduce matrix notation for the model. Let \(T\) be the available sample size for the variable \(y\). Define a \(T\)-vector of zeros, \(\mathbf{0}_T\), and of ones, \(\boldsymbol\imath_T\), the identity matrix of order \(T\), \(\mathbf{I}_T\), as well as \(T\times1\) vectors: \[\begin{align} \mathbf{y} = \begin{bmatrix} y_1\\ \vdots\\ y_T \end{bmatrix},\quad \boldsymbol\tau = \begin{bmatrix} \tau_1\\ \vdots\\ \tau_T \end{bmatrix},\quad \boldsymbol\epsilon = \begin{bmatrix} \epsilon_1\\ \vdots\\ \epsilon_T \end{bmatrix},\quad \boldsymbol\eta = \begin{bmatrix} \eta_1\\ \vdots\\ \eta_T \end{bmatrix},\qquad \mathbf{i} = \begin{bmatrix} 1\\0\\ \vdots\\ 0 \end{bmatrix}, \end{align}\] and a \(T\times T\) matrix \(\mathbf{H}\) with the elements: \[\begin{align} \mathbf{H} = \begin{bmatrix} 1 & 0 & \cdots & 0 & 0\\ -1 & 1 & \cdots & 0 & 0\\ 0 & -1 & \cdots & 0 & 0\\ \vdots & \vdots & \ddots & \vdots & \vdots\\ 0 & 0 & \cdots & 1 & 0\\ 0 & 0 & \cdots & -1 & 1 \end{bmatrix}. \end{align}\]
Then the model can be written in a concise notation as: \[\begin{align} \mathbf{y} &= \mathbf{\tau} + \boldsymbol\epsilon,\\ \mathbf{H}\boldsymbol\tau &= \mathbf{i} \tau_0 + \boldsymbol\eta,\\ \boldsymbol\epsilon &\sim\mathcal{N}\left(\mathbf{0}_T, \sigma^2\mathbf{I}_T\right),\\ \boldsymbol\eta &\sim\mathcal{N}\left(\mathbf{0}_T, \sigma_\eta^2\mathbf{I}_T\right). \end{align}\]
Likelihood function
The model equations imply the predictive density of the data vector \(\mathbf{y}\). To see this, consider the model equation as a linear transformation of a normal vector \(\boldsymbol\epsilon\). Therefore, the data vector follows a multivariate normal distribution given by: \[\begin{align} \mathbf{y}\mid \boldsymbol\tau, \sigma^2 &\sim\mathcal{N}_T\left(\boldsymbol\tau, \sigma^2\mathbf{I}_T\right). \end{align}\]
This distribution determines the shape of the likelihood function that is defined as the sampling data density: \[\begin{align} L(\boldsymbol\tau,\sigma^2|\mathbf{y})\equiv p\left(\mathbf{y}\mid\boldsymbol\tau, \sigma^2 \right). \end{align}\]
The likelihood function that for the sake of the estimation of the parameters, and after plugging in data in place of \(\mathbf{y}\), is considered a function of parameters \(\boldsymbol\tau\) and \(\sigma^2\) is given by: \[\begin{align} L(\boldsymbol\tau,\sigma^2|\mathbf{y}) = (2\pi)^{-\frac{T}{2}}\left(\sigma^2\right)^{-\frac{T}{2}}\exp\left\{-\frac{1}{2}\frac{1}{\sigma^2}(\mathbf{y} - \boldsymbol\tau)'(\mathbf{y} - \boldsymbol\tau)\right\}. \end{align}\]
Prior distributions
Bayesian estimation
Gibbs sampler
Simulation smoother and precision sampler
Analytical solution for a joint posterior
Hierarchical modeling
Estimating gamma error term variance prior scale
Estimating inverted-gamma 2 error term variance prior scale
Estimating the initial condition prior scale
Student-t prior for the trend component
Estimating Student-t degrees of freedom parameter
Laplace prior for the trend component
Model extensions
Autoregressive cycle component
Random walk with time-varying drift parameter
Student-t error terms
Conditional heteroskedasticity
Bayesian forecasting
Predictive density
Sampling from the predictive density
Missing observations
References
Citation
@online{woźniak2024,
author = {Woźniak, Tomasz},
title = {Bayesian {Unobserved} {Component} {Models}},
date = {2024-05-01},
url = {https://donotdespair.github.io/Bayesian-Unobserved-Component-Models/},
doi = {10.26188/25814617},
langid = {en}
}