Simulated data with known parameters to recapture

Create simulated data to recapture regression parameters. The code to simulate the data can be found on the website from 1 Nov 2017 and is copied here:

nReps<-10
x<-seq(from=0,to=100,by=1) # x = height
x<-rep(x,nReps)
b0<-1.0; b1<-0.2 # b0 = intercept, b1 = slope
ymean<-b0+b1*x # establish a mean for weight model using height
y<-rnorm(n=length(ymean),mean=ymean,sd=1.5) # simulate random normal dist of weight - from height
N<-length(y)
wtData<-data.frame(weight_lbs=y, height_in=x) # create a dataframe to enter into model using simulated data params above

Fit a linear regression predicting weight from height using your own MCMC sampler.
Use a Normal(0,100) prior for both the intercept and slope and an InverseGamma(0.001,0.001) prior for the standard deviation. (Install the invgamma package to access this distribution.) Generate 10,000 samples for each of the model parameters.

Metropolis Algorithm

  1. Initialize the chain at some initial value of p

  2. Generate a proposed value of p

  3. Calculate acceptance probability r

  4. Accept proposed value of p with probability r

  5. Repeat n times
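
A minimal sketch of these five steps for a single parameter, assuming a toy standard normal target density and a symmetric random-walk proposal (both are placeholders for illustration, not the regression model fit below):

logTarget<-function(p) dnorm(p, mean=0, sd=1, log=TRUE) # placeholder target: standard normal
nSteps<-5000
chain<-vector()
chain[1]<-0 # 1. initialize the chain
for(i in 1:(nSteps-1)){
  pStar<-rnorm(n=1, mean=chain[i], sd=0.5) # 2. propose a value (symmetric random walk)
  r<-min(1, exp(logTarget(pStar)-logTarget(chain[i]))) # 3. acceptance probability
  if(runif(1)<r){ # 4. accept the proposal with probability r...
    chain[i+1]<-pStar
  } else {
    chain[i+1]<-chain[i] # ...otherwise keep the current value
  }
} # 5. repeat n times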

Data Likelihood

Write a log likelihood function.

dataLik<-function(b0,b1,sd,data){
    wt<-data$weight_lbs # observed weights
    ht<-data$height_in  # observed heights
    wtPredicted<-b0+b1*ht # predicted weight from the regression line
    llik<-sum(dnorm(x=wt, mean=wtPredicted, sd=sd, log=TRUE)) # sum because these are logs
    return(llik)
  }
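
A quick spot check (the specific values here are only for illustration): the log likelihood at the parameters used to simulate the data should be far higher than at a clearly wrong slope.

dataLik(b0=1.0, b1=0.2, sd=1.5, data=wtData) # at the values used to simulate the data
dataLik(b0=1.0, b1=2.0, sd=1.5, data=wtData) # deliberately wrong slope; should be much lower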

Normal Prior on Beta

Write a log prior density function for a coefficient beta, which can be called later for either b0 or b1.

normPriorL<-function(beta,priorMean,priorSD){
    lik<-dnorm(x = beta, mean = priorMean, sd = priorSD,log=TRUE)
    return(lik)
  }
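
For example (illustrative values), the Normal(0,100) prior is nearly flat over any plausible coefficient value:

normPriorL(beta=0, priorMean=0, priorSD=100)  # log prior density at 0
normPriorL(beta=10, priorMean=0, priorSD=100) # nearly the same log density at 10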

Inverse Gamma prior

Use an inverse gamma prior for the variance (sd^2). Jim Clark and others note that the inverse gamma plays nicely with the variance of a normal likelihood because it is the conjugate prior.

invGamma<-function(sd,alphaPrior,betaPrior){
    require(invgamma)
    variance<-sd^2 # the prior is placed on the variance
    llik<-dinvgamma(variance, shape = alphaPrior, rate = betaPrior, log = TRUE)
    return(llik)
  }
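
A quick call (illustrative values only) confirms the function returns a finite log prior density for any positive sd:

invGamma(sd=1.5, alphaPrior=0.001, betaPrior=0.001)
invGamma(sd=15, alphaPrior=0.001, betaPrior=0.001)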

Posterior Likelihood

Combine the functions above into a single (log) posterior likelihood. This lets the for loop below call one constantly updated function instead of hardcoding the pieces inside the loop. It is the difference between running three separate for loops, one per Metropolis step (incorrect), and a single “conditional” for loop in which each step conditions on the others (correct).

posteriorLik<-function(b0, b1, sd){
  llik<-dataLik(b0, b1, sd, wtData) +
    normPriorL(b0, priorMean, priorSD) +
    normPriorL(b1, priorMean, priorSD) +
    invGamma(sd, alphaPrior, betaPrior)
  return(llik)
}

Function definitions

This would be needed for a Gibbs step on the variance, which I am not using for now…

#sampNormVar<-function(yPred,alpha,beta,data){
#    require(invgamma)
#    yObs<-wtData$weight_lbs    # call our dataframe
#    n<-length(yPred)
#    alphaPrime<-0.001
#    betaPrime<-0.001
#    rinvgamma(n=1,shape=0.001, rate=0.001)
#  }
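
# A hedged sketch of how sampNormVar could be completed (not used below): with an
# InverseGamma(alphaPrior, betaPrior) prior on the variance and a normal likelihood,
# the full conditional for the variance is
# InverseGamma(alphaPrior + n/2, betaPrior + sum((yObs - yPred)^2)/2),
# so a Gibbs draw would look like:
#sampNormVar<-function(yPred,alphaPrior,betaPrior,data){
#    require(invgamma)
#    yObs<-data$weight_lbs
#    n<-length(yObs)
#    alphaPrime<-alphaPrior + n/2
#    betaPrime<-betaPrior + sum((yObs-yPred)^2)/2
#    rinvgamma(n=1, shape=alphaPrime, rate=betaPrime)
#  }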
nIter<-10000 # 10,000 samples per parameter, as the assignment requires
proposalSD<-5
priorMean<-0; priorSD<-100; alphaPrior<-0.001; betaPrior<-0.001
b0<-b1<-sd<-vector() # shorthand, these params are vectors
b0[1]<-1; b1[1]<-1; sd[1]<-1 # Initial values
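
As a quick sanity check of posteriorLik before running the sampler (the specific values are just for illustration), the log posterior near the simulated truth should exceed the log posterior at a poor guess:

posteriorLik(b0=1.0, b1=0.2, sd=1.5) # near the values used to simulate the data
posteriorLik(b0=1.0, b1=2.0, sd=1.5) # poor guess; should return a much lower value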

The following chunk runs the sampler: each iteration takes one Metropolis step for b0, then one for b1, then one for sd, with each step conditioning on the values already updated in that iteration.

for(i in 1:nIter){ # i indexes iterations in the chain

# first sample b0 using a Metropolis step
b0Star<-rnorm(n=1,mean = b0[i], sd = proposalSD)
r<-min(1, exp( posteriorLik(b0Star, b1[i], sd[i]) - posteriorLik(b0[i], b1[i], sd[i]) ) )
if(runif(1)<r) {
  b0[i+1]<-b0Star
} else {
  b0[i+1]<-b0[i]
} 

# then sample b1 using a Metropolis step
b1Star<-rnorm(n=1,mean = b1[i], sd = proposalSD)
r<-min(1, exp( posteriorLik(b0[i+1], b1Star, sd[i]) - posteriorLik(b0[i+1], b1[i], sd[i]) ) ) # condition on the b0 value updated above
if(runif(1)<r) {
  b1[i+1]<-b1Star
} else {
  b1[i+1]<-b1[i]
} 

# then sample sd^2 using a Gibbs step or a Metropolis step
#### GIBBS
#wtPred<-… #myVar<-sampNormVar(…) #sd[i+1]<-sqrt(myVar)

# Metropolis step for sd: symmetric random-walk proposal, conditioning on the updated b0 and b1
sdStar<-rnorm(n=1, mean=sd[i], sd=proposalSD)
if(sdStar>0) {
  r<-min(1, exp( posteriorLik(b0[i+1], b1[i+1], sdStar) - posteriorLik(b0[i+1], b1[i+1], sd[i]) ) )
} else {
  r<-0 # the posterior density is zero for a non-positive sd, so always reject
}
if(runif(1)<r) {
  sd[i+1]<-sdStar
} else {
  sd[i+1]<-sd[i]
}

#print(sdStar); print(r); print(sd[i]) # debugging output from an earlier draft

}

Examining results using Coda

library(coda)
#burnIn<-...
#dataMCMCburnIn<-as.mcmc(cbind(b0[-c(1:burnIn)],b1[-c(1:burnIn)],sd[-c(1:burnIn)]))
#densplot(dataMCMCburnIn)
#HPDinterval(dataMCMCburnIn)
#autocorr.plot(dataMCMCburnIn)
#plot(dataMCMCburnIn)
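
For example, once the sampler has run, the chains could be examined along these lines (burnIn = 500 is an assumed value, not one given above; pick it by inspecting the trace plots):

burnIn<-500 # assumed burn-in, chosen by eye from the trace plots
dataMCMCburnIn<-as.mcmc(cbind(b0=b0[-(1:burnIn)], b1=b1[-(1:burnIn)], sd=sd[-(1:burnIn)]))
summary(dataMCMCburnIn)
plot(dataMCMCburnIn) # trace and density plots for all three parameters
HPDinterval(dataMCMCburnIn) # 95% HPD intervals
autocorr.plot(dataMCMCburnIn)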

SCRATCH WORK FROM EARLIER IN THE WEEK

write a likelihood function

likNorm<-function(x, mean, sd){
  likeNorm<-dnorm(x=x, mean=mean, sd=sd)
  return(likeNorm)
}

write a normal prior function to be used for b0 and b1. The mean and sd are hardcoded with a large sd so the prior has little influence on the outcome of our model. Plotted, this would give us a broad, flat prior

priorNorm<-function(x){
  prior<-dnorm(x=x, mean=0, sd=100)
  return(prior)
}

write a prior function for sd using the inverse gamma. Not sure yet what shape and rate should be, but we’re hardcoding them for now….

library(invgamma)
priorInvGamma<-function(x){
  sd<-dinvgamma(x=x, shape = 0.001, rate = 0.001)
  return(sd)
}

FROM LAST WEEK’S EXERCISE - do we run three for loops, one for each parameter?

write a proposal distribution

proDist<-function(currentP, sd){
  p<-rnorm(n=1, mean=currentP, sd=sd)
  if(p>0 & p<1) return(p)
  proDist(currentP, sd)
}

assign some parameters for the mcmc model

noIterations<-10000
mySD<-0.2 # copied from last week…

create an empty vector

NEED 3 VECTORS

B0<-vector(); B0[1]<-0.5
B1<-vector(); B1[1]<-0.5
sd<-vector(); sd[1]<-0.5

MCMC sampler for b0 with normal prior

for(i in 1:noIterations){
  ####### COPY THIS, PASTE AND REPEAT FOR B0, B1 AND SD SO WE HAVE ALL PARAMETERS CONDITIONAL ON EACH OTHER ######
  proposedP<-proDist(B0[i], mySD)
  r<-min(1, (likNorm(proposedP, ymean, 1.5) * priorNorm(proposedP)) /
            (likNorm(B0[i], ymean, 1.5) * priorNorm(B0[i])))
  if(runif(1)<r){ # runif is the "spinner" in the island hop example
    B0[i+1]<-proposedP
  } else {
    B0[i+1]<-B0[i]
  }
  #########
}

create an empty vector

p2<-vector(); p2[1]<-0.5

MCMC sampler for b1 with normal prior

for(i in 1:noIterations){
  proposedP<-proDist(p2[i], mySD)
  r<-min(1, (likNorm(proposedP, ymean, 1.5) * priorNorm(proposedP)) /
            (likNorm(p2[i], ymean, 1.5) * priorNorm(p2[i])))
  if(runif(1)<r){ # runif is the "spinner" in the island hop example
    p2[i+1]<-proposedP
  } else {
    p2[i+1]<-p2[i]
  }
}

create an empty vector

p3<-vector(); p3[1]<-0.5

MCMC sampler for sd with inverse gamma prior

for(i in 1:noIterations){
  proposedP<-proDist(p3[i], mySD)
  r<-min(1, (likNorm(proposedP, ymean, 1.5) * priorInvGamma(proposedP)) /
            (likNorm(p3[i], ymean, 1.5) * priorInvGamma(p3[i])))
  if(runif(1)<r){ # runif is the "spinner" in the island hop example
    p3[i+1]<-proposedP
  } else {
    p3[i+1]<-p3[i]
  }
}

library(coda)

Convert our parameter vectors from above into mcmc objects

The function mcmc is used to create a Markov Chain Monte Carlo object. The input data are taken to be a vector, or a matrix with one column per variable.

mcmcB0<-mcmc(B0)
mcmcB1<-mcmc(p2)
mcmcSD<-mcmc(p3)

trace plot

traceplot(mcmcB0)
traceplot(mcmcB1)
traceplot(mcmcSD)

density plot

densplot(mcmcB0)
densplot(mcmcB1)
densplot(mcmcSD)

both trace and density plotted together

plot(mcmcB0)
plot(mcmcB1)
plot(mcmcSD)

autocorrelation

autocorr.plot(mcmcB0)
autocorr.plot(mcmcB1)
autocorr.plot(mcmcSD)

estimate of the 95% HDI

HPDinterval(mcmcB0, prob = 0.95)
HPDinterval(mcmcB1, prob = 0.95)
HPDinterval(mcmcSD, prob = 0.95)

Homework notes

The inverse gamma is a prior commonly used for the variance of a normal distribution because it is conjugate for the normal variance

require(invgamma) or library(invgamma) to load the package

Use a Metropolis step for b0 and another for b1; use a Gibbs step for the variance

create likelihood functions for all parameters?

STEPS

FOR LOOP FOR I ITERATIONS

  1. Metropolis - sample b0
  2. Metropolis - sample b1
  3. Gibbs - sample sigma^2

The Metropolis steps require “r”, which requires a posterior likelihood (function), which in turn requires a data likelihood (function) and three prior likelihoods (functions). These functions feed into the Metropolis step to calculate “r” (accept/reject). Look to last week’s work for “r” and the other code.

METROPOLIS ALGORITHM:

  1. Initialize the chain at some initial value of p
  2. Generate a proposed value of p - e.g. here, b’ ~ Normal(b0[i], sd)
  3. Calculate an acceptance probability r
  4. Accept the proposed value of p with probability r
  5. Repeat n times