6 Bayesian Linear Regression
This chapter formalizes a complete Bayesian linear regression example following the Isabelle AFP development by Hirata, Minamide, and Sato. The model is \(y = s \cdot x + b + \varepsilon \) with \(\varepsilon \sim \mathcal{N}(0, 1/2)\), prior \(s, b \sim \mathrm{iid}\; \mathcal{N}(0, 3)\), and data \((1,2.5), (2,3.8), (3,4.5), (4,6.2), (5,8.0)\).
6.1 Priors and Likelihood
\(\int _{\mathcal{N}(m, \sigma )} f = \int _{\mathrm{Lebesgue}} f(x) \cdot \mathrm{normal\_pdf}(m, \sigma , x)\, dx\), converting QBS integration to Lebesgue integration with density.
\(\mathrm{slope\_prior} = \mathcal{N}(0, 3)\) and \(\mathrm{intercept\_prior} = \mathcal{N}(0, 3)\).
The observation likelihood for parameters \((s, b)\) is the product of the per-datum Gaussian densities, \(\mathrm{obs}(s, b) = \prod _{i=1}^{5} \mathrm{normal\_pdf}(s \cdot i + b,\, 1/2,\, y_i)\), where \((y_1, \ldots , y_5) = (2.5, 3.8, 4.5, 6.2, 8.0)\).
\(\mathrm{obs}(s, b) \geq 0\) for all \((s, b)\).
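The likelihood above can be sketched numerically. This is a minimal Python analogue, not the Isabelle definition; it assumes the second parameter of \(\mathcal{N}(m, \sigma )\) is the standard deviation.

```python
import math

# Data from the chapter: x-values 1..5 with their observed y-values.
DATA = [(1, 2.5), (2, 3.8), (3, 4.5), (4, 6.2), (5, 8.0)]

def normal_pdf(m, sigma, x):
    """Density of N(m, sigma) at x (sigma taken as standard deviation)."""
    return math.exp(-((x - m) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def obs(s, b):
    """Observation likelihood: product of the five per-datum densities."""
    p = 1.0
    for x, y in DATA:
        p *= normal_pdf(s * x + b, 1 / 2, y)
    return p
```

Since each factor is a density, `obs(s, b)` is nonnegative everywhere, matching the lemma above.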
6.2 Evidence (Normalizing Constant)
\(Z = \int \! \! \int \mathrm{obs}(s, b)\, d\pi (s)\, d\pi (b)\) where \(\pi = \mathcal{N}(0, 3)\).
\(0 \leq Z\).
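A crude Monte Carlo estimate of \(Z\) illustrates the definition: draw \((s, b)\) iid from the prior and average \(\mathrm{obs}\). This is only a numerical sketch (assuming \(3\) is the prior standard deviation), not the closed-form computation of §6.6.

```python
import math
import random

DATA = [(1, 2.5), (2, 3.8), (3, 4.5), (4, 6.2), (5, 8.0)]

def normal_pdf(m, sigma, x):
    return math.exp(-((x - m) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def obs(s, b):
    p = 1.0
    for x, y in DATA:
        p *= normal_pdf(s * x + b, 1 / 2, y)
    return p

random.seed(0)
N = 50_000
# Z = E_{s,b ~ N(0,3) iid}[obs(s, b)], estimated by the sample mean.
Z_hat = sum(obs(random.gauss(0, 3), random.gauss(0, 3)) for _ in range(N)) / N
```

Each Gaussian factor is bounded by its peak, so the estimate is automatically finite; nonnegativity of `obs` gives \(0 \leq Z\).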
6.3 Posterior Density
\(\mathbb {E}_{\mathrm{post}}[g] = \frac{\mathbb {E}_{\mathrm{prior}}[g \cdot \mathrm{obs}]}{Z}\).
When \(Z > 0\) and \(Z < +\infty \), the posterior integrates to \(1\).
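The ratio formula for \(\mathbb {E}_{\mathrm{post}}[g]\) is exactly self-normalized importance sampling with the prior as proposal. The sketch below (a hypothetical numerical analogue, again assuming standard deviations) computes the posterior mean of the slope; with \(g = 1\) the ratio is identically \(1\), mirroring the normalization statement.

```python
import math
import random

DATA = [(1, 2.5), (2, 3.8), (3, 4.5), (4, 6.2), (5, 8.0)]

def normal_pdf(m, sigma, x):
    return math.exp(-((x - m) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def obs(s, b):
    p = 1.0
    for x, y in DATA:
        p *= normal_pdf(s * x + b, 1 / 2, y)
    return p

random.seed(1)
samples = [(random.gauss(0, 3), random.gauss(0, 3)) for _ in range(100_000)]
weights = [obs(s, b) for s, b in samples]
Z = sum(weights) / len(samples)

# E_post[g] = E_prior[g * obs] / Z, here with g(s, b) = s (the slope).
post_mean_slope = sum(w * s for (s, b), w in zip(samples, weights)) / sum(weights)
```

The data lie close to a line of slope roughly \(1.3\), and the estimate concentrates there.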
6.4 Monadic Prior
The joint prior on \((s, b)\) is constructed via nested bind: \(\mathrm{prior} = \mathrm{slope\_prior} \mathbin {>\! \! >\! \! =} (\lambda s.\; \mathrm{str}(s, \mathrm{intercept\_prior}))\).
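The bind-and-strength construction can be mimicked with a toy sampling monad, where a distribution is modelled as a function from a random source to a draw. The `bind` and `strength` below are hypothetical Python analogues of the QBS combinators, not the Isabelle definitions.

```python
import random

def normal(m, sigma):
    """A 'distribution' as a sampler: rng -> draw."""
    return lambda rng: rng.gauss(m, sigma)

def bind(d, f):
    """Monadic bind: sample from d, then from f applied to that sample."""
    return lambda rng: f(d(rng))(rng)

def strength(a, d):
    """str(a, d): pair the fixed value a with a draw from d."""
    return lambda rng: (a, d(rng))

slope_prior = normal(0, 3)
intercept_prior = normal(0, 3)

# prior = slope_prior >>= (lambda s. str(s, intercept_prior))
prior = bind(slope_prior, lambda s: strength(s, intercept_prior))

rng = random.Random(0)
s, b = prior(rng)  # one joint draw (s, b)
```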
Each single-observation likelihood factor is a QBS morphism, and moreover a strong morphism.
6.5 Normalizer Program
\(\mathrm{norm\_qbs}(g, \mathrm{obs})\) returns \(\mathrm{Some}(d)\) if the evidence is positive and finite, and \(\mathrm{None}\) otherwise.
\(\mathrm{program} = \mathrm{norm\_qbs}(\lambda \_ .\; 1, \mathrm{obs})\).
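The behaviour of the normalizer can be sketched as follows: estimate the evidence, and return the normalized density only when it is positive and finite. This `norm_qbs` is a hypothetical numerical stand-in (Monte Carlo in place of the formal integral, `None` in place of the option type's failure case).

```python
import math
import random

DATA = [(1, 2.5), (2, 3.8), (3, 4.5), (4, 6.2), (5, 8.0)]

def normal_pdf(m, sigma, x):
    return math.exp(-((x - m) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def obs(s, b):
    p = 1.0
    for x, y in DATA:
        p *= normal_pdf(s * x + b, 1 / 2, y)
    return p

def norm_qbs(likelihood, n=20_000, seed=0):
    """Return the posterior density wrt the prior, or None if the
    (estimated) evidence is zero or infinite."""
    rng = random.Random(seed)
    z = sum(likelihood(rng.gauss(0, 3), rng.gauss(0, 3)) for _ in range(n)) / n
    if z > 0 and math.isfinite(z):
        return lambda s, b: likelihood(s, b) / z   # plays the role of Some(d)
    return None                                    # plays the role of None

program = norm_qbs(obs)
```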
6.6 Evidence Computation
Integrating \(\mathrm{obs}(s, b)\) over \(b\) against \(\mathcal{N}(0, 3)\) yields a value depending only on \(s\):
\(\int _{\mathcal{N}(0,3)} \mathrm{obs}(s, b)\, db = \mathrm{scalar\_of\_s}(s)\).
\(\int _{\mathcal{N}(0,3)} \mathrm{scalar\_of\_s}(s)\, ds = \mathrm{phase2\_const}\).
\(Z = \mathrm{phase2\_const}\), an explicit closed-form constant computed by iterating the product-of-Gaussians identity through 10 combination steps.
\(0 < Z\) and \(Z < +\infty \).
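The product-of-Gaussians identity driving the closed-form computation states that a product of two Gaussian densities in the same variable is a constant times a single Gaussian density: \(\mathrm{pdf}(m_1, s_1, x) \cdot \mathrm{pdf}(m_2, s_2, x) = \mathrm{pdf}(m_1, \sqrt{s_1^2 + s_2^2}, m_2) \cdot \mathrm{pdf}(m', s', x)\) with \(1/s'^2 = 1/s_1^2 + 1/s_2^2\) and \(m' = s'^2 (m_1/s_1^2 + m_2/s_2^2)\). A numerical check of one combination step (a sketch, not the formal 10-step derivation):

```python
import math

def normal_pdf(m, sigma, x):
    return math.exp(-((x - m) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def combine(m1, s1, m2, s2):
    """One product-of-Gaussians step: returns (m', s', constant factor)."""
    v = 1 / (1 / s1 ** 2 + 1 / s2 ** 2)
    m = v * (m1 / s1 ** 2 + m2 / s2 ** 2)
    c = normal_pdf(m1, math.sqrt(s1 ** 2 + s2 ** 2), m2)
    return m, math.sqrt(v), c

# Check the identity pointwise at an arbitrary x.
m1, s1, m2, s2 = 2.5, 0.5, 3.8, 0.5
m, s, c = combine(m1, s1, m2, s2)
x = 3.1
lhs = normal_pdf(m1, s1, x) * normal_pdf(m2, s2, x)
rhs = c * normal_pdf(m, s, x)
```

Iterating this step collapses the five likelihood factors and the two priors into a single constant, which is why \(Z\) has an explicit closed form.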
6.7 Main Results
The program returns \(\mathrm{Some}(d)\) (evidence is positive and finite).
The posterior distribution integrates to \(1\) (unconditionally). This is the main theorem of the Bayesian regression formalization.
The proof derives positivity and finiteness of the evidence from evidence_value and phase2_const_gt0, then applies posterior_density_total.
The observation likelihood is integrable against the joint prior, established by bounding \(\mathrm{obs}\) by the normal peak raised to the 5th power.
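The bounding argument is elementary: a Gaussian density with standard deviation \(1/2\) never exceeds its peak value \(1/\big((1/2)\sqrt{2\pi }\big)\), so \(\mathrm{obs}\) is bounded by that peak to the 5th power, a constant that is trivially integrable against the probability prior. A numerical spot-check of the bound (a sketch under the same standard-deviation reading as above):

```python
import math
import random

DATA = [(1, 2.5), (2, 3.8), (3, 4.5), (4, 6.2), (5, 8.0)]

def normal_pdf(m, sigma, x):
    return math.exp(-((x - m) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def obs(s, b):
    p = 1.0
    for x, y in DATA:
        p *= normal_pdf(s * x + b, 1 / 2, y)
    return p

# Peak of a Gaussian density with sigma = 1/2, attained at its mean.
peak = 1 / ((1 / 2) * math.sqrt(2 * math.pi))

rng = random.Random(0)
checks = all(
    obs(rng.uniform(-10, 10), rng.uniform(-10, 10)) <= peak ** 5
    for _ in range(1000)
)
```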
The posterior measure is characterized by its density against the prior.