40  GLM. Exponential family

40.1 GLM. Exponential family. Gamma

Consider the Gamma distribution \(\text{Gamma}(\mu, \nu)\) with density

\[f(y) = \frac{1}{\Gamma(\nu)} \left(\frac{\nu y}{\mu} \right)^{\nu} \exp\left( - \frac{\nu y}{\mu} \right) \frac{1}{y},\ y>0,\ \nu>0,\ \mu>0\]

Write the density in the standard exponential family form. Identify the canonical parameter \(\theta\), the dispersion parameter \(\phi\), and the functions \(a(\phi)\), \(b(\theta)\) and \(c(y, \phi)\). Calculate \(E[Y]\) and \(Var[Y]\), and identify the variance function \(V(\mu)\) in \(Var[Y] = V(\mu) \phi\).

GLM. Exponential family. Gamma. Solution

In a GLM, the response variable \(Y_i\) follows a distribution from the exponential family, with probability density function

\[f(y_i;\theta_i,\phi)=\exp\left\{{y_i\theta_i -b(\theta_i)\over a_i(\phi)}+c(y_i,\phi)\right\}.\]

Here \(\theta_i\) is an unknown parameter that is a function of the mean, \(\phi\) is a dispersion parameter that may or may not be known, and \(a_i(\phi)\), \(b(\theta_i)\) and \(c(y_i, \phi)\) are known functions.

  • \(\theta_i\) is called the canonical parameter and represents the location
  • \(\phi\) is called the dispersion parameter and represents the scale
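
The solutions below rely on two standard consequences of this form, sketched here for reference with the index \(i\) dropped. Writing \(\ell = (y\theta - b(\theta))/a(\phi) + c(y,\phi)\) for the log-density, and assuming the usual regularity conditions so that \(E[\partial \ell/\partial\theta] = 0\) and \(E[\partial^2 \ell/\partial\theta^2] = -E[(\partial \ell/\partial\theta)^2]\):

\[\frac{\partial \ell}{\partial \theta} = \frac{y - b'(\theta)}{a(\phi)} \quad\Rightarrow\quad E[Y] = b'(\theta) = \mu\]

\[\frac{\partial^2 \ell}{\partial \theta^2} = -\frac{b''(\theta)}{a(\phi)},\quad E\left[\left(\frac{\partial \ell}{\partial \theta}\right)^2\right] = \frac{Var[Y]}{a(\phi)^2} \quad\Rightarrow\quad Var[Y] = b''(\theta)\, a(\phi)\]

Expressing \(b''(\theta)\) as a function of \(\mu\) gives the variance function \(V(\mu)\), so that \(Var[Y] = V(\mu)\, a(\phi)\).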

Starting from the Gamma density, write each factor as an exponential and collect the terms that multiply \(\nu\):

\[f(y) = \frac{1}{\Gamma(\nu)} \left(\frac{\nu y}{\mu} \right)^{\nu} \exp\left( - \frac{\nu y}{\mu} \right) \frac{1}{y}\]

\[= \exp\left( -\log \Gamma(\nu) + \nu \log\left(\frac{\nu y}{\mu} \right) - \frac{\nu y}{\mu} - \log y \right)\]

\[= \exp\left( -\log \Gamma(\nu) + \nu \log (\nu y) - \nu \log (\mu) + \nu y \left(\frac{-1}{\mu}\right) - \log y \right)\]

\[= \exp\left( \nu \left(y \left(\frac{-1}{\mu}\right) - \log (\mu) \right) + \nu \log (\nu y) - \log y - \log \Gamma(\nu)\right)\]

\[= \exp\left( \frac{y \left(\frac{-1}{\mu}\right) - \log (\mu)}{1/\nu} + \nu \log (\nu y) - \log y - \log \Gamma(\nu)\right)\]

Comparing the last line with the general form gives:

\(\theta = -1/\mu\)

\(b(\theta) = \log(-1/\theta) = - \log(-\theta)\)

\(a(\phi) = \phi = 1/\nu\)

\(c(y,\phi) = \nu \log (\nu y) - \log y - \log \Gamma(\nu)\), with \(\nu = 1/\phi\)

\(E[Y] = b'(\theta) = 1/(-\theta) = \mu\)

\(Var[Y] = b''(\theta) a(\phi) = \frac{1}{\theta^2 \nu} = \mu^2/\nu\)

\(V(\mu) = \mu^2\)
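
The identification above can be checked numerically. The following is a minimal sketch using only the Python standard library: it evaluates the Gamma density directly and through the exponential family form, then approximates \(b'(\theta)\) and \(b''(\theta)\) by central differences. The function names, the test values \(\mu = 2.5\), \(\nu = 3\) and the step size `h` are illustrative assumptions, not part of the exercise.

```python
import math

def gamma_density(y, mu, nu):
    """Gamma density in the (mu, nu) parameterisation of the exercise."""
    z = nu * y / mu
    return math.exp(nu * math.log(z) - z - math.lgamma(nu)) / y

def b(theta):
    """Cumulant function b(theta) = -log(-theta), defined for theta < 0."""
    return -math.log(-theta)

def gamma_expfam_density(y, mu, nu):
    """Same density assembled from theta, phi, b and c as identified above."""
    theta = -1.0 / mu
    phi = 1.0 / nu
    c = nu * math.log(nu * y) - math.log(y) - math.lgamma(nu)
    return math.exp((y * theta - b(theta)) / phi + c)

# Illustrative parameter values (assumed, not from the exercise).
mu, nu = 2.5, 3.0
for y in (0.1, 1.0, 4.0):
    assert math.isclose(gamma_density(y, mu, nu),
                        gamma_expfam_density(y, mu, nu), rel_tol=1e-10)

# Central-difference approximations of b'(theta) and b''(theta).
theta, h = -1.0 / mu, 1e-4
b1 = (b(theta + h) - b(theta - h)) / (2 * h)
b2 = (b(theta + h) - 2 * b(theta) + b(theta - h)) / h**2

mean_from_b = b1            # should be close to mu
var_from_b = b2 * (1 / nu)  # should be close to mu**2 / nu
print(mean_from_b, var_from_b)
```

The finite differences are only approximate, so the comparison with \(\mu\) and \(\mu^2/\nu\) holds up to a small numerical error.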

40.2 GLM. Exponential family. Bernoulli

The Bernoulli distribution is given by \[p(y;\pi) = \pi^y (1-\pi)^{1-y},\ y \in \{0, 1\}\]

  1. Show that this distribution belongs to the exponential family and identify \(\theta\), \(\phi\), \(b(\theta)\), \(a(\phi)\) and \(c(y, \phi)\).

  2. Use general results about the exponential family to find the expectation and variance function for the distribution.

  3. Find the canonical link for the Bernoulli distribution.

  4. Assume now that \(Y_1, \ldots, Y_n\) are independent variables from a GLM with the Bernoulli distribution as response distribution. Derive the deviance in this case.

GLM. Exponential family. Bernoulli. Solutions

  1. \[\pi^y (1-\pi)^{1-y} = \exp\left(y \log \pi + (1-y)\log(1-\pi) \right) = \exp\left(y (\log \pi - \log(1-\pi)) + \log(1-\pi) \right)=\] \[= \exp\left(y \log\left(\frac{\pi}{1-\pi}\right) + \log(1-\pi) \right) = \exp\left(\frac{y \log\left(\frac{\pi}{1-\pi}\right) - (- \log(1-\pi))}{1} + 0 \right)\]

The canonical parameter is \(\theta = \log \left(\frac{\pi}{1-\pi}\right)\). The dispersion parameter is \(\phi = 1\), with \(a(\phi)=\phi=1\). Since \(1-\pi = 1/(1+\exp(\theta))\), \(b(\theta) = -\log(1-\pi) = \log(1+\exp(\theta))\). Finally, \(c(y,\phi) = 0\).
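
As a quick sanity check of this identification, the sketch below compares the Bernoulli probability \(\pi^y(1-\pi)^{1-y}\) with \(\exp(y\theta - b(\theta))\) for a few values of \(\pi\). It uses only the Python standard library; the function names and the chosen values of \(\pi\) are illustrative assumptions.

```python
import math

def bernoulli_pmf(y, pi):
    """Bernoulli probability pi^y (1 - pi)^(1 - y) for y in {0, 1}."""
    return pi**y * (1.0 - pi)**(1 - y)

def b(theta):
    """Cumulant function b(theta) = log(1 + exp(theta))."""
    return math.log1p(math.exp(theta))

def bernoulli_expfam_pmf(y, pi):
    """Same probability written as exp((y*theta - b(theta)) / 1 + 0)."""
    theta = math.log(pi / (1.0 - pi))
    return math.exp(y * theta - b(theta))

# Illustrative probabilities (assumed, not from the exercise).
for pi in (0.1, 0.5, 0.9):
    theta = math.log(pi / (1.0 - pi))
    # b(theta) agrees with -log(1 - pi)
    assert math.isclose(b(theta), -math.log(1.0 - pi), rel_tol=1e-10)
    for y in (0, 1):
        assert math.isclose(bernoulli_pmf(y, pi),
                            bernoulli_expfam_pmf(y, pi), rel_tol=1e-10)
```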

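  2. \(E[Y] = b'(\theta) = \frac{\exp(\theta)}{1+\exp(\theta)} = \pi\). \(Var[Y] = b''(\theta)a(\phi) = \frac{\exp(\theta)}{(1+\exp(\theta))^2} = \pi (1-\pi)\). Since \(\mu = \pi\), the variance function is \(V(\mu) = \mu(1-\mu)\).

  3. The canonical link sets the linear predictor equal to the canonical parameter, \(g(\mu) = \theta\). Here \(g(\mu) = \log\left(\frac{\mu}{1-\mu}\right)\), which is the logit function.

  4. The deviance is \[D = 2 \left(l(\hat{\boldsymbol{\beta}}_{sat}) - l(\hat{\boldsymbol{\beta}}) \right) \phi,\] with \(\phi = 1\) here. The Bernoulli log-likelihood is \[l = \sum_{i=1}^n \left( y_i \log \pi_i + (1- y_i) \log(1-\pi_i) \right).\] In the saturated model \(\hat \pi_i = y_i\), so, with the convention \(0 \log 0 = 0\), the deviance is \[D = 2 \sum_{i=1}^n \left(y_i \log\left(\frac{y_i}{\hat \pi_i}\right) + (1- y_i) \log\left(\frac{1-y_i}{1-\hat \pi_i}\right) \right)\]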
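
The results for the mean, the canonical link and the deviance can be illustrated with a short sketch in plain Python. The data `y` and fitted probabilities `pi_hat` below are made-up values for illustration only (no model is actually fitted), and the function names are assumptions. With binary responses and the convention \(0 \log 0 = 0\), the saturated log-likelihood is zero, so the deviance reduces to \(-2\, l(\hat{\boldsymbol{\beta}})\), which the code verifies.

```python
import math

def logit(mu):
    """Canonical link g(mu) = log(mu / (1 - mu))."""
    return math.log(mu / (1.0 - mu))

def inv_logit(eta):
    """Inverse link; equals b'(theta) for b(theta) = log(1 + exp(theta))."""
    return 1.0 / (1.0 + math.exp(-eta))

def log_likelihood(y, pi_hat):
    """Bernoulli log-likelihood; requires 0 < pi_hat_i < 1."""
    return sum(yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
               for yi, p in zip(y, pi_hat))

def deviance(y, pi_hat):
    """Bernoulli deviance for binary y, using the convention 0*log(0) = 0."""
    total = 0.0
    for yi, p in zip(y, pi_hat):
        if yi == 1:
            total += math.log(1.0 / p)          # y*log(y/p) with y = 1
        else:
            total += math.log(1.0 / (1.0 - p))  # (1-y)*log((1-y)/(1-p)) with y = 0
    return 2.0 * total

# Hypothetical responses and fitted probabilities (not from the exercise).
y = [1, 0, 1, 1, 0]
pi_hat = [0.8, 0.3, 0.6, 0.9, 0.2]

D = deviance(y, pi_hat)
assert math.isclose(D, -2.0 * log_likelihood(y, pi_hat), rel_tol=1e-12)

# The logit link and its inverse undo each other.
assert all(math.isclose(inv_logit(logit(p)), p, rel_tol=1e-12) for p in pi_hat)
print(D)
```

Because each observation contributes only \(-2\log \hat\pi_i\) or \(-2\log(1-\hat\pi_i)\), the deviance for ungrouped binary data depends on the data only through the fitted log-likelihood.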