Deviance of Binomial generalized linear model

  1. Sep 18, 2010 #1
    The formula for the deviance of a binomial generalized linear model is:
    [tex]D = 2\sum_i \left[ y_i \log\left(\frac{y_i}{\hat{y}_i}\right) + (n_i-y_i)\log\left(\frac{n_i-y_i}{n_i-\hat{y}_i}\right) \right][/tex]

    where the responses [tex]y_i[/tex] are distributed as [tex]Binomial(n_i, p_i)[/tex], and [tex]\hat{y}_i = n_i\hat{p}_i[/tex].

    The second log in that equation is undefined when [tex]n_i=y_i[/tex], because its argument is then zero, and that can of course happen with non-zero probability. (The first log has the same problem when [tex]y_i=0[/tex].)

    Evaluating that formula by hand reproduces the deviance that R reports. So what happens to the deviance when a binomial GLM has a data point where [tex]n_i=y_i[/tex]? Somehow R still returns a finite deviance in this situation, even though the formula fails.
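
    For concreteness, here is a minimal Python sketch of the deviance formula above under one assumption: each term of the form [tex]a\log(a/b)[/tex] is taken to be 0 when [tex]a=0[/tex], which is its limit as [tex]a \to 0^+[/tex]. A finite deviance at [tex]n_i=y_i[/tex] is consistent with this convention, but the sketch does not claim to be R's implementation. The helper names and the data are hypothetical.

```python
import math


def xlog_ratio(a, b):
    """Return a * log(a / b), using the limiting value 0 when a == 0 (assumed convention)."""
    if a == 0:
        return 0.0
    return a * math.log(a / b)


def binomial_deviance(y, n, p_hat):
    """D = 2 * sum_i [ y_i log(y_i / yhat_i) + (n_i - y_i) log((n_i - y_i) / (n_i - yhat_i)) ]."""
    total = 0.0
    for y_i, n_i, p_i in zip(y, n, p_hat):
        y_hat = n_i * p_i  # fitted mean for observation i
        total += xlog_ratio(y_i, y_hat) + xlog_ratio(n_i - y_i, n_i - y_hat)
    return 2.0 * total


# Hypothetical data: the second observation has y_i == n_i, the third has y_i == 0.
y = [3, 5, 0]
n = [5, 5, 4]
p_hat = [0.5, 0.8, 0.2]  # hypothetical fitted probabilities, strictly between 0 and 1

D = binomial_deviance(y, n, p_hat)
print(D)
```

    With this convention the boundary observations simply drop the offending term and contribute only the other one, so the sum stays finite as long as every fitted probability lies strictly between 0 and 1.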

    Also (this is a separate question): in order to calculate the deviance, you need the likelihood function of the saturated model. The likelihood function is [tex]L(\boldsymbol{\mu},\mathbf{y})[/tex] (where [tex]\boldsymbol{\mu}[/tex] is the vector of expected values of the y's), and the likelihood function for the saturated model is found by replacing [tex]\boldsymbol{\mu}[/tex] with [tex]\mathbf{y}[/tex] (i.e. all variation explained, a perfect fit, but why?). Since the saturated model is defined as the model whose number of parameters equals the number of observations, how does this fact follow from the definition? And what does the definition even mean? Where in the model equation would you put the extra parameters, and what are their associated covariates, so that the number of parameters equals the number of observations?
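
    One way to picture the saturated model, offered as a sketch and not as a definitive answer: give every observation its own free probability [tex]p_i[/tex] (for example, one indicator covariate per observation), so the likelihood separates into one term per observation and each [tex]p_i[/tex] can be maximized on its own. The snippet below does this numerically for a single hypothetical observation; the grid search and the function names are assumptions for illustration only.

```python
import math


def binom_loglik(y, n, p):
    """Log-likelihood of one Binomial(n, p) observation, dropping the constant log C(n, y)."""
    return y * math.log(p) + (n - y) * math.log(1.0 - p)


def argmax_p(y, n, steps=999):
    """Crude grid search for the p in (0, 1) that maximizes the single-observation log-likelihood."""
    grid = [k / (steps + 1) for k in range(1, steps + 1)]
    return max(grid, key=lambda p: binom_loglik(y, n, p))


# Hypothetical single observation: 3 successes out of 5 trials.
y, n = 3, 5
p_best = argmax_p(y, n)

print(p_best, y / n)   # maximizing p versus the observed proportion
print(n * p_best, y)   # fitted mean versus the observed count
```

    The maximizing value sits at the observed proportion [tex]y_i/n_i[/tex], so the fitted mean is [tex]\hat{\mu}_i = n_i\hat{p}_i = y_i[/tex], which is the sense in which [tex]\boldsymbol{\mu}[/tex] gets replaced by [tex]\mathbf{y}[/tex].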
     