So in the throes of textbook writing I came across this bit of Bayesian arcania. Suppose
[tex]\theta | \mu, \sigma^2 \sim N(\mu,\sigma^2)[/tex]
where
[tex]\mu \sim N(b_0, B_0)[/tex]
and
[tex]\sigma^2 \sim \mbox{inverse-Gamma}(\nu/2, \nu\sigma^2_0/2),[/tex]
and we want to know the marginal density of [tex]\theta[/tex],
[tex] p(\theta) = \int_0^\infty \int_{-\infty}^\infty p(\theta | \mu, \sigma^2) p(\mu) p(\sigma^2) d\mu d\sigma^2.[/tex]
This model differs from the usual conjugate set-up in that we have independence between [tex]\mu[/tex] and [tex]\sigma^2[/tex], instead of the usual [tex]p(\mu,\sigma^2) = p(\mu|\sigma^2)p(\sigma^2)[/tex]; when we use the latter version of the prior and we have a normal model for [tex]\theta[/tex] as above, then the marginal density for [tex]\theta[/tex] is a [tex]t[/tex] density.
But what happens when the model is changed subtly, as above? I got as far as seeing that the marginal for [tex]\theta[/tex] would be something like a [tex]t[/tex] density, but stopped when I got as far as
[tex]p(\theta) \propto \displaystyle\int_0^\infty \exp \left[ \displaystyle\frac{-(\theta - b_0)^2}{2 (\sigma^2 + B_0)}\right] p(\sigma^2) d\sigma^2 [/tex]
where the first term is (up to a constant of proportionality) [tex]p(\theta | \sigma^2)[/tex], what you get after marginalizing with respect [tex]\mu[/tex] and is the kernel of a normal density over [tex]\theta[/tex] with mean [tex]b_0[/tex] and variance equal to [tex]\sigma^2 + B_0[/tex] (i.e., this is nothing more than the familiar result from analysis of variance that the total variance is equal to the sum of the between and within variances).
Ok, so now we have to marginalize with respect to [tex]\sigma^2[/tex]. In the conjugate case this last step is easy because [tex]\sigma^2[/tex] usually appears multiplicatively (i.e., [tex]\sigma^2 k[/tex]), rather than in the additive form as in the denominator, above. I’m guessing (a) the result I’m after is in a book in my office that I’ll find on Monday; (b)
is some kind of mixture of a [tex]t[/tex] and a normal?
I simulated draws from the marginal for [tex]\theta[/tex] using a Monte Carlo based method of composition (three lines of R); I seem to be getting back something more leptokurtic than the normal, but actually fitting a [tex]t[/tex] density to the simulated values produced weird results, even with [tex]10^7[/tex] simulated values. See the figure below (clickable thumbail), generated with this R code:
library(pscl)
m <- 1e7
nu <- 14
b0 <- .225
B0 <- (.15/4)^2
omega2.0 <- .005
omega2 <- rigamma(n=m,alpha=nu/2,beta=nu*omega2.0/2)
tau <- rgamma(n=m,nu/2,nu*omega2.0/2)
mu0 <- rnorm(n=m,b0,sd=sqrt(B0))
theta <- rnorm(n=m,mean=mu0,sd=sqrt(omega2))
pdf(file="logHist.pdf",
w=11,
h=10)
par(mfrow=c(1,2))
hist(theta,501,
prob=TRUE,
col=gray(.85),
lwd=.5,
ylab="density",
main=NULL,
xlab=expression(theta))
mynorm2 <- function(x)dnorm(x,mean=mean(theta),sd=sd(theta))
curve(mynorm2,
from=min(theta),to=max(theta),n=1001,
add=TRUE,
col="red",
lwd=1)
axis(1)
legend(x="topleft",
bty="n",
col=c("black","red"),
lwd=c(1,1),
legend=c("histogram estimate",
"normal MLE"))
blah <- hist(theta,501,plot=FALSE)
plot(blah$mids,blah$intensities,type="l",log="y",
xlab=expression(theta),
ylab="density (log scale)",
axes=FALSE)
lines(blah$mids,
dnorm(blah$mids,mean=mean(theta),sd=sd(theta)),
col="red",
lwd=1)
axis(1)
legend(x="topleft",
bty="n",
col=c("black","red"),
lwd=c(1,1),
legend=c("histogram estimate",
"normal MLE"))
dev.off()
