Bayesian inference using INLA

Author

Saavedra, P.

Mostrar/Ocultar código R
library(readxl)
library(INLA)
library(quantreg)
library(lme4)
library(tidyverse)
library(flextable)
library(plot3D)
library(epibasix)
library(janitor)
library(lubridate)
library(knitr)
library(kableExtra)
inla.setOption(short.summary=TRUE)

 

Course materials

The data files and R code used in this course can be downloaded from this link.

 

 

1 A simple linear regression model

For a dataset \({(x_i,y_i) : i = 1,...,n}\), we consider the linear regression model: \(y_{i}\sim N\left(\mu_{i},\sigma\right)\), being

\[\mu_i = \beta_0 + \beta_1.x_i\] We estimate this model in the bayesian framework supposing that \(\beta_{0}\sim N\left(0;10^{4}\right)\) and \(\beta_{1}\sim N\left(3,1\right)\)

First, we simulate a dataset \(\left\{ \left(x_{i},y_{i}\right)~:~i=1,...,n=50\right\}\) obeying the linear regression model: \[y_{i}\sim N\left(5+x_{i}~;~\sigma=1\right)\] Note \(\beta_0=5\) and \(\beta_1=1\).

 

Data simulation:

Mostrar/Ocultar código R
### Data simulation
set.seed(19941998)
b <- c(5,1)
sg=1
n=50
x <- seq(0,10,length=50)
mu <- b[1]+b[2]*x
y <- mu+rnorm(n,0,sg)
dt <- data.frame(x,y)
### Figure
ggplot(dt,aes(x,y)) + geom_point()

Results of frequentist inference are:

Mostrar/Ocultar código R
### Frequentist Inference
ml <- lm(y ~x)
summary(ml)

Call:
lm(formula = y ~ x)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.3367 -0.6464 -0.1080  0.6322  2.4538 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  4.66664    0.28812   16.20   <2e-16 ***
x            1.06518    0.04965   21.45   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.034 on 48 degrees of freedom
Multiple R-squared:  0.9056,    Adjusted R-squared:  0.9036 
F-statistic: 460.3 on 1 and 48 DF,  p-value: < 2.2e-16

 

And results of bayesian inference with vague priors:

Mostrar/Ocultar código R
#### Bayesian inference
formula <- y ~ 1 + x
ml <- inla(formula,family="gaussian",data=dt)
ml$summary.fixed[,1:5]
                mean         sd 0.025quant 0.5quant 0.975quant
(Intercept) 4.666657 0.28747239  4.1006151 4.666656   5.232700
x           1.065182 0.04953955  0.9676367 1.065182   1.162727

 

Erratic prior.

If we impose on the Bayesian estimation that we are very sure that the value of the slope is 1.5, we arrive at the following result:

Mostrar/Ocultar código R
#### Erratic priors
mb <- inla(formula,family="gaussian",data=dt,
           control.fixed=list(mean=1.5, prec=10000,
                              mean.intercept=1, prec.intercept=0.0001))
mb$summary.fixed[,1:5]
                mean        sd 0.025quant 0.5quant 0.975quant
(Intercept) 2.528839 0.2358034   2.062863 2.529498   2.991121
x           1.492742 0.0100154   1.473102 1.492742   1.512385

 

Next we show this line as well as the true regression line and the one fitted by frequentist estimation. We can see how our initial prior has had a great effect.

Mostrar/Ocultar código R
### Figure
Bb_0 <- mb$summary.fixed[1,1]
Bb_1 <- mb$summary.fixed[2,1]
ggplot(dt,aes(x,y))+
  geom_point(colour="#99CCFF")+
  geom_abline(intercept = 5, slope = 1,colour="#3300FF",linewidth=1.5)+
  annotate(geom="text", x=2.5, y=16, label="True regression line",colour="#3300FF",size=5)+
  stat_smooth(method = "lm",formula = y ~ x,geom = "smooth",colour="#FF0000")+
  annotate(geom="text", x=2.5, y=15, label="Frequentist estimation",colour="#FF0000",size=5)+
  geom_abline(intercept = Bb_0, slope = Bb_1,colour="#33CC99")+
  annotate(geom="text", x=2.5, y=14, label="Bayesian estimation",colour="#33CC99",size=5)

 

We can also see the posterior density resulting from our insistence that the slope has to be very close to 1.5:

Mostrar/Ocultar código R
#### Densidades a posteriori
dpost.intercept <- data.frame(mb$marginals.fixed$`(Intercept)`)
dpost.x <- data.frame(mb$marginals.fixed$x)

ggplot(dpost.x,aes(x,y))+
  geom_line(colour="#33CC99",linewidth=1.5)+
  geom_vline(xintercept=1,colour="#3300FF")

 

A closer look at the posterior density:

Mostrar/Ocultar código R
ggplot(dpost.x,aes(x,y))+
  geom_line(colour="blue",size=1)+
  theme_bw()