Methods
Statistical Framework
The model integrates polling data \(y_{n,p}\) and fundamentals data through a hierarchical structure:
\[ \begin{aligned} y_{n} &\sim \text{Dirichlet-Multinomial}(\pi_{n}, \phi_{h[n]}) \\ \pi_{n} &= \text{softmax}(\eta_{n}) \\ \eta_{n,p} &= \beta_{p,t[n]} + \gamma_{p,h[n]} \end{aligned} \]
where \(\phi_{h[n]}\) is house-specific overdispersion, \(t[n]\) indexes the time of poll \(n\), and \(h[n]\) indexes the polling house.
The fundamentals component predicts election results through:
\[ \begin{aligned} y^{(f)}_{d} &\sim \text{Dirichlet-Multinomial}(\pi^{(f)}_{d}, \phi_f) \\ \pi^{(f)}_{d} &= \text{softmax}(\mu_{d}) \\ \mu_{d,p} &= \alpha_p + \beta_{\text{lag}}x_{p,d} + \beta_{\text{years}}\log(I_{p,d}) + \beta_{\text{vnv}}v_{p,d} + \beta_{\text{growth}}g_{p,d} \end{aligned} \]
The connection between these two components is the election day prediction \(\mu_{\text{pred}}\), calculated from the fundamentals model using current economic conditions and previous election results. This prediction serves as a prior for election day support:
\[ \beta_{T} \sim \mathcal{N}(\mu_{\text{pred}}, \tau_f\cdot \sigma) \]
The relative weight between polling and fundamentals is controlled by \(\tau_f\), computed to give the fundamentals component a specified percentage weight in the prediction at a certain time before the election (see Appendix for details).
Polling Component
The polling model tracks latent party support \(\beta_{p,t}\) over time using a multivariate random walk with correlation structure \(\Omega\) between parties:
\[ \beta_{t} = \begin{cases} \mathrm{Normal}\left(\beta_{t+1}, (1 + \tau_s)\sqrt{\Delta_t} \boldsymbol \Sigma\right) & \text{after government split} \\ \mathrm{Normal}\left(\beta_{t+1}, \sqrt{\Delta_t} \boldsymbol \Sigma\right) & \text{otherwise} \end{cases} \]
where:
- \(\boldsymbol \Sigma = \text{diag}(\sigma_1,\ldots,\sigma_P) \Omega \text{diag}(\sigma_1,\ldots,\sigma_P)\) is the covariance matrix
- \(\sigma_p\) captures party-specific volatility scales
- \(\Omega\) is the correlation matrix between party-specific innovations
- \(\tau_s\) allows for increased volatility after government splits
- \(\Delta_t\) is the number of days between polls
Fundamentals Component
The fundamentals model predicts party vote shares using economic and political variables:
\[ \pi = \text{softmax}(\alpha + \beta_\text{lag}x + \beta_\text{inc}\log(I) + \beta_\text{infl}V + \beta_\text{growth}G) \]
where:
- \(\alpha_p\) are party-specific intercepts that sum to zero (\(\sum_p \alpha_p = 0\))
- \(x\) are previous election vote shares
- \(I\) are years in government for incumbent parties
- \(V\) is inflation on an annual basis six months before the election
- \(G\) is economic growth on an annual basis six months before the election
- \(\beta_\text{lag}\) captures persistence in party support
- \(\beta_\text{inc}\) measures the effect of time spent in government on incumbent parties
- \(\beta_\text{infl}\) captures the impact of inflation on incumbent parties
- \(\beta_\text{growth}\) captures the impact of growth on incumbent parties
Model Integration
The fundamentals prediction serves as a prior for the election day vote shares \(\beta_T\):
\[ \beta_T \sim \mathcal{N}(\mu_\text{pred}, \tau_f \cdot \sigma) \]
where \(\mu_\text{pred}\) is the fundamentals prediction and \(\tau_f\) controls how much weight is given to the fundamentals versus polling data at some point before the elections.
Choosing \(\tau_f\)
To choose how we want to calculate the standard deviation for prior on \(\beta_T\), we can frame our model as a Gaussian-Gaussian conjugate problem where:
- The prior (fundamentals prediction) is: \(\beta_T \sim \mathcal{N}(\mu_{\text{pred}}, \tau_f \cdot \sigma)\)
- The likelihood (polling prediction from time t) is: \(\beta_T \sim \mathcal{N}(\beta_t, V(t) \cdot \sigma)\)
where \(V(t)\) represents the accumulated variance from time t to election time T:
- For \(t \leq 47\) (after government split): \(V(t) = t \cdot (1 + \tau_{\text{stjornarslit}})^2\)
- For t > 47 (before government split): \(V(t) = (t - 47) + 47 \cdot (1 + \tau_{\text{stjornarslit}})^2\)
Using standard Gaussian-Gaussian conjugate formulas:
\[ \begin{aligned} \text{Posterior precision} &= \frac{1}{(\tau_f \cdot \sigma)^2} + \frac{1}{V(t) \cdot \sigma^2} \\ &= \left(\frac{1}{\tau_f^2} + \frac{1}{V(t)}\right) \cdot \frac{1}{\sigma^2} \end{aligned} \]
For a desired fundamentals weight w at time t:
\[ w = \frac{1/\tau_f^2}{1/\tau_f^2 + 1/V(t)} \]
Solving for \(\tau_f\):
\[ \tau_f = \sqrt{V(t) \cdot (1-w)/w} \]
where \(V(t)\) depends on \(\tau_{\text{stjornarslit}}\) as defined above. In our current modeling, we choose \(w = \frac13\) and \(t = 180\).