如何使用 JAGS 对来自不同参数族的有限分量的混合进行建模?

问题描述 投票:0回答:2

想象一个底层过程,它从概率为 $ lpha$ 的正态分布和概率为 $1 - lpha$ 的均匀分布中抽取一个数字。 因此,此过程生成的观察到的数字序列遵循分布 $f$,即 2 个分量混合以及 $lpha$ 和 $1 - lpha$ 的混合权重。 当观察到的序列是唯一的输入,但参数族已知时,您将如何使用 JAGS 对这种混合物进行建模?

示例(R 语言):

set.seed(8361299)
N <- 100
alpha <- 0.3
mu <- 5
max <- 50
# Which component to choose from?
latent_class <- rbinom(N, 1, alpha)
Y <- ifelse(latent_class, runif(N, min=mu, max=max), rnorm(N, mean=mu))

生成的(观察到的)Y 如下所示:

通过JAGS,应该可以获得混合重量,以及已知组分的参数?

r jags
2个回答
3
投票

相同参数分布的混合模型在 JAGS/BUGS 中非常简单,但具有不同参数响应的混合模型(如您的)则有点棘手。一种方法是使用“个数技巧”,我们手动计算响应的可能性(选择模型潜在部分指定的两个分布之一)并将其拟合到伯努利试验的(假)响应每个数据点。例如:

# Your data generation:
set.seed(8361299)
N <- 100
alpha <- 0.3
mu <- 5
max <- 50
# Which component to choose from?
latent_class <- rbinom(N, 1, alpha)
Y <- ifelse(latent_class, runif(N, min=mu, max=max), rnorm(N, mean=mu))

# The model:
model <- "model{

    for(i in 1:N){

        # Log density for the normal part:
        ld_norm[i] <- logdensity.norm(Y[i], mu, tau)

        # Log density for the uniform part:
        ld_unif[i] <- logdensity.unif(Y[i], lower, upper)

        # Select one of these two densities:
        density[i] <- exp(ld_norm[i]*norm_chosen[i] + ld_unif[i]*(1-norm_chosen[i]))

        # Generate a likelihood for the MCMC sampler:
        Ones[i] ~ dbern(density[i])

        # The latent class part as usual:
        norm_chosen[i] ~ dbern(prob)

    }

    # Priors:
    lower ~ dnorm(0, 10^-6)
    upper ~ dnorm(0, 10^-6)
    prob ~ dbeta(1,1)
    mu ~ dnorm(0, 10^-6)
    tau ~ dgamma(0.01, 0.01)

    # Specify monitors, data and initial values using runjags:
    #monitor# lower, upper, prob, mu, tau
    #data# N, Y, Ones
    #inits# lower, upper
}"

# Run the model using runjags (or use rjags if you prefer!)
library('runjags')

lower <- min(Y)-10
upper <- max(Y)+10

Ones <- rep(1,N)

results <- run.jags(model, sample=20000, thin=1)
results

plot(results)

这似乎可以很好地恢复你的参数(你的 alpha 是 1-prob),但要注意自相关(和收敛)。

马特


编辑:既然您询问了如何推广到两个以上的发行版,这里是等效的(但更通用)代码:

# The model:
model <- "model{
    for(i in 1:N){
        # Log density for the normal part:
        ld_comp[i, 1] <- logdensity.norm(Y[i], mu, tau)
        # Log density for the uniform part:
        ld_comp[i, 2] <- logdensity.unif(Y[i], lower, upper)
        # Select one of these two densities and normalise with a Constant:
        density[i] <- exp(ld_comp[i, component_chosen[i]] - Constant)
        # Generate a likelihood for the MCMC sampler:
        Ones[i] ~ dbern(density[i])
        # The latent class part using dcat:
        component_chosen[i] ~ dcat(probs)
    }
    # Priors for 2 parameters using a dirichlet distribution:
    probs ~ ddirch(c(1,1))
    lower ~ dnorm(0, 10^-6)
    upper ~ dnorm(0, 10^-6)
    mu ~ dnorm(0, 10^-6)
    tau ~ dgamma(0.01, 0.01)
    # Specify monitors, data and initial values using runjags:
    #monitor# lower, upper, probs, mu, tau
    #data# N, Y, Ones, Constant
    #inits# lower, upper, mu, tau
}"

library('runjags')

# Initial values to get the chains started:
lower <- min(Y)-10
upper <- max(Y)+10
mu <- 0
tau <- 0.01

Ones <- rep(1,N)

# The constant needs to be big enough to avoid any densities >1 but also small enough to calculate probabilities for observations of 1:
Constant <- 10

results <- run.jags(model, sample=10000, thin=1)
results

此代码将适用于您需要的任意多个分布,但预计自相关性随着分布的增加而呈指数级恶化。


0
投票

请谁能帮我解决这个问题

mod1_字符串<- " model { for (i in 1:N) { # first distribution the negative binomial part: y[i, 1] ~ dnegbin(prob[i], phi) prob[i] <- phi / (phi + eps[i] * mu[i]) log(mu[i]) <- b1[P[i]] * x6[i] + b2[P[i]] * x28[i] eps[i] ~ dgamma(f[i], t[V[i]]) f[i] <- 1 + z[i] z[i] ~ dbern(k[V[i]]) P[i] ~ dcat(W[])

# second distribution, poisson:
y[i, 2] ~ dpois(mu2[i])
mu2[i] <- b1[V[i]] * x6[i] + b2[V[i]] * x28[i]

V[i] ~ dcat(W[])



# The latent class part using dcat:
component_chosen[i] ~ dcat(probs)

# Select one of these two densities and normalize with a Constant:
density[i] <- exp(y[i, component_chosen[i]])   }

使用狄利克雷分布的 2 个参数的先验:probs ~

ddirch(c(1, 1))

for (j in 1:2) { # 先验 b1[j] ~ dnorm(0, 0.01) b2[j] ~ dnorm(0, 0.01) k[j] ~ 杜尼夫(0, 1) t[j] <- (1 - k[j]) / k[j] mean.eps[j] <- (t[j] + 2) / (t[j] * (t[j] + 1)) log.mean.eps[j] <- log(mean.eps[j]) }

 W[1] ~ dunif(0, 1)   W[2] <- 1 - W[1]   phi ~ dgamma(0.1, 0.1) }"
© www.soinside.com 2019 - 2024. All rights reserved.