Posterior predictive checks

class: center, middle, inverse, title-slide

# Posterior predictive checks
### Math 315: Bayesian Statistics

---

# Posterior predictive checks

.large[
- Verifying a model your adequate is a fundamental step in the modeling cycle

- The usual residual checks from Math 245 are appropriate here: Q-Q plots; residual plots; added variable plots; etc.

- A uniquely Bayesian diagnostic is the posterior predictive
check

- If the model fits, then data generated by the model should look similar to the observed data
]

---

# Review: Posterior predictive distributions

.large[
- Plug-in approach suppresses uncertainty in `$\theta$`

+ `$Y_{\rm new} \sim f(y | \widehat{\theta})$`
    
- PPD propagates the uncertainty in `$\theta$` through the predictions

+ `$f(Y_{\rm new} | \boldsymbol{Y}) = \int f(Y_{\rm new}, \theta | \boldsymbol{y}) d\theta = \int f(Y_{\rm new}| \theta) f(\theta | \boldsymbol{y}) d\theta$`
    
    + To make `$S$` draws from the PPD, for each of the `$S$` MCMC
draws of `$\theta$` we draw a `$Y_{\rm new}$`
]

---

# Posterior predictive checks

.large[
-  Sample many data sets from the
PPD with the identical design (same `$n$`, same `$\boldsymbol{X}$`) as the
original data set

- Define a statistic describing the data set

+ e.g. `$D(\boldsymbol{Y}) = \max\{Y_1, \ldots, Y_n\}$`
    
- `$D_0 =$` statistic for the original sample; <br>
`$D_s=$` the statistic from simulated data set number `$s$`

- If the model is appropriate, `$D_0$` should fall in the middle of `$D_1, \ldots, D_s$`
]

---

# Posterior predictive checks

.large[
- We need a measure of how extreme the observed data is relative to
this sampling distribution

- **Bayesian p-value**
`$$p=\frac{1}{S} \sum_{s=1}^S I(D_s > D_0)$$`

- If `$p$` is near 0 or 1 the model doesn't fit well

+ Major failures of the model = extreme tail areas <br> 
    (less that 0.01 or more than 0.99)
    + 0.05 to 0.95 is "a reasonable range" (Gelman et al, 2013)

- Check several test statistics to evaluate the model fit
]

.footnotesize[
Gelman,  et al (2013). *Bayesian Data Analysis*. 3rd ed.
]

---
class: clear

.large[
.bold[Possible test statistics]:

- min, max, SD of `$Y_1, \ldots, Y_n$`

- ratio if SDs between groups

- Let plots of the data guide the exploration...
]

---

```r
ms <- "model{
  for(i in 1:n){
    Y[i] ~ dnorm(mu[i], tau)
    mu[i] <- inprod(X[i,], beta[])
  }
  
  for(j in 1:4) {beta[j] ~ dnorm(0,0.001)}
  tau ~ dgamma(0.1, 0.1)
  
* #PPD checks
* for(i in 1:n) {
*   Yp[i]  ~ dnorm(mu[i],tau)
* }
* D[1]  <- sd(Yp[])
* D[2]  <- min(Yp[])
* D[3]  <- max(Yp[])
}"

dat <- list(Y = Y, X = X, n = nrow(X))

model <- jags.model(textConnection(ms), data = dat, quiet = TRUE)

update(model, 1000, progress.bar="none")

draws <- coda.samples(model, variable.names = c("D", "Yp"), 
                      n.iter = 1000, progress.bar = "none")
```

---

# Posterior predictive checks

```r
d1_obs <- sd(Y)
d1_ppd <- draws[[1]][,"D[1]"]

# p-value
mean(d1_ppd > d1_obs)
```

```
## [1] 0.669
```

---

# Posterior predictive checks

```r
d2_obs <- min(Y)
d2_ppd <- draws[[1]][,"D[2]"]

# p-value
mean(d2_ppd > d2_obs)
```

```
## [1] 0.043
```

---

# Posterior predictive checks

```r
d3_obs <- max(Y)
d3_ppd <- draws[[1]][,"D[3]"]

# p-value
mean(d3_ppd > d3_obs)
```

```
## [1] 0.313
```

---

# Posterior predictive checks

.large[
`$D_4(\boldsymbol{Y}) = s_{\rm smoke} / s_{\rm nonsmoke}$`

It's easier to calculate this in R than in JAGS
]

```r
# Group IDs
smoke.id <- unname(which(X[,"smoke"] == 1))
nonsmoke.id <- unname(which(X[,"smoke"] == 0))

# Extract PPD for each observation
y_ppd_draws <- draws[[1]][,-c(1:3)]

# Calculate observed D4
d4_obs <- sd(Y[smoke.id]) / sd(Y[nonsmoke.id])

# Calculate D4
d4_ppd <- numeric(nrow(y_ppd_draws))
for(i in 1:length(d4_ppd)) {
  y_sim <- y_ppd_draws[i,]
  d4_ppd[i] <- sd(y_sim[smoke.id]) / sd(y_sim[nonsmoke.id])
}
```

---

# Posterior predictive checks

`$D_4(\boldsymbol{Y}) = s_{\rm smoke} / s_{\rm nonsmoke}$`

```r
# p-value
mean(d4_ppd > d4_obs)
```

```
## [1] 0.125
```

---
class: inverse, middle, center

# Should you be Bayesian or a Frequentist?

---
class: inverse

# Your turn

.Large[
- Break into 4 groups of 5

- Two groups will devise an argument for the Frequentist paradigm

- Two groups will devise an argument for the Bayesian paradigm

- You'll have 10 minutes to develop your arguments
]