Simulation: using integrate() with user-defined functions

Question

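While trying to build a simple simulation for psycholinguistics, I ran into the following error:

> expected_success_rate(170)
Error in if (height >= threshold) { : the condition has length > 1
Called from: prob_height_given_one(h, thresh)

I think there must be something wrong with the definition of expected_success_rate in this case. The example should be reproducible.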

height_dist <- c(177.5, 6.35)

#TWO CASES

#FIRST : h < θ   -->  null utterance, u0

# only prior distribution can inform degree of belief that referent is tall to a given degree: φ(h|u0,θ) = φ(h)

prob_height_given_null <- function(height) {
  dnorm(height, mean = height_dist[1], sd = height_dist[2])
}

# SECOND: h >= θ  --> positive form, u1

# if h ≥ θ, then listener can update by conditioning on u1, which yields a new distribution

prob_height_given_one <- function(height, threshold) {
  if (height >= threshold) {
    integrand <- function(x) {
      dnorm(x, mean = height_dist[1], sd = height_dist[2])
    }
    normaliser <- integrate(integrand, threshold, Inf)
    return(prob_height_given_null(height)/normaliser$value)
  }
  else {
    return(0)
  }
}

# Expected success rate (still non-normalised)
expected_success_rate <- function(thresh) {
  integrand_LHS <- function(h) {
    prob_height_given_null(h) * prob_height_given_null(h)
    }
  value_LHS <- integrate(integrand_LHS, -Inf, thresh)
  integrand_RHS <- function(h) {
    prob_height_given_null(h) * prob_height_given_one(h, thresh)}

  value_RHS <- integrate(integrand_RHS, thresh, Inf)
  
  return(value_LHS$value + value_RHS$value)
}

expected_success_rate(170)

integrate(expected_success_rate, -Inf, true_height)

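First, one problem seems to stem from the definitions of integrand_LHS and integrand_RHS.

One suggestion from Copilot was to use Vectorize() to define each integrand, but I don't trust it because I can't understand the motivation.

The other issue is that I intend to use Luce's choice rule. Setting normalisation aside for simplicity, the bottom line is that the expected success rate itself needs to be integrated to yield the quantity of interest:

integrate(expected_success_rate, -Inf, true_height)

I tried the Vectorize() suggestion inside expected_success_rate, as follows: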

expected_success_rate <- function(threshold) {
  integrand_LHS <- Vectorize(function(h) prob_height_given_null(h) * prob_height_given_null(h))
  value_LHS <- integrate(integrand_LHS, -Inf, threshold)
  
  integrand_RHS <- Vectorize(function(h) prob_height_given_null(h) * prob_height_given_one(h, threshold))
  value_RHS <- integrate(integrand_RHS, threshold, Inf)
  
  return(value_LHS$value + value_RHS$value)
}

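It seems to work, in that the plot of the expected success rate looks plausible.

But a new error is then generated here. As you can see, it seems to stem from the definition of the function.

> integrate(expected_success_rate, -Inf, true_height)
Error in integrate(integrand_LHS, -Inf, thresh) : 
  length(upper) == 1 is not TRUE

Can anyone help? More broadly, I would be interested in opinions on which languages are best suited to manipulating Bayesian equations.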

Answer 1

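integrate() expects the function to be vectorized: it passes a vector of inputs and expects a vector of outputs of the same length. The if statement is not vectorized, hence the error.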
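A minimal sketch of what "vectorized" means here, using a hypothetical toy integrand f_scalar that is not part of the question. It assumes R 4.2.0 or later, where a condition of length greater than one in if is an error (earlier versions only warned and used the first element).

```r
# Hypothetical toy integrand (not from the question): scalar-only because of `if`
f_scalar <- function(x) {
  if (x >= 0) x^2 else 0
}

# integrate() evaluates the integrand on a whole vector of points at once,
# so the scalar-only version fails (an error in R >= 4.2.0)
res_scalar <- tryCatch(
  integrate(f_scalar, -1, 1)$value,
  error = function(e) conditionMessage(e)
)

# Vectorize() wraps the function so that it is applied element by element
f_vec <- Vectorize(f_scalar)
f_vec(c(-1, 0, 2))                        # one output per input
res_vec <- integrate(f_vec, -1, 1)$value  # analytically 1/3
```

This is the motivation behind the Vectorize() suggestion: it does not change what the function computes for a single value, only how it handles a vector of values.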

The immediate error can be resolved by wrapping the definitions of prob_height_given_one and expected_success_rate in Vectorize().
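As an alternative to Vectorize(), the branch can be written with ifelse(), which is vectorized by construction. The sketch below is an assumption-laden variant rather than the code from the question: the name prob_height_given_one_vec is hypothetical, and the inner integrate() call is replaced by the equivalent upper-tail probability from pnorm().

```r
height_dist <- c(177.5, 6.35)

prob_height_given_null <- function(height) {
  dnorm(height, mean = height_dist[1], sd = height_dist[2])
}

# Hypothetical ifelse()-based variant of prob_height_given_one
prob_height_given_one_vec <- function(height, threshold) {
  # P(h >= threshold) under the prior, i.e. the normaliser of the truncated density
  normaliser <- pnorm(threshold, height_dist[1], height_dist[2], lower.tail = FALSE)
  ifelse(height >= threshold, prob_height_given_null(height) / normaliser, 0)
}

# Works directly as an integrand; the truncated density should integrate to about 1
total <- integrate(function(h) prob_height_given_one_vec(h, 170), 170, Inf)
total$value
```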

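However, there is no good reason to use integrate() here. Squaring a normal PDF gives another normal PDF (with standard deviation sigma/sqrt(2)) scaled by a constant, so pnorm() can be used instead. I also see no reason for the if statement: given the limits of integration, height will always be at least threshold. This allows expected_success_rate to be greatly simplified: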

expected_success_rate2 <- function(thresh) {
  normalizer <- pnorm(thresh, height_dist[1], height_dist[2], FALSE)
  value_LHS <- pnorm(thresh, height_dist[1], s)/height_dist[2]/sqrt(pi)/2
  value_RHS <- pnorm(thresh, height_dist[1], s, FALSE)/C/normalizer
  value_LHS + value_RHS
}

s <- height_dist[2]/sqrt(2)
C <- 2*height_dist[2]*sqrt(pi)
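The closed form can be checked by hand. The derivation below is a sketch in notation introduced here for illustration: \phi(h;\mu,\sigma) is the normal density, \Phi_\sigma and \Phi_s are the normal CDFs with mean \mu and standard deviations \sigma and s, and \theta stands for thresh. The symbols s and C match the variables in the code above.

```latex
\begin{align*}
\phi(h;\mu,\sigma)^2
  &= \frac{1}{2\pi\sigma^2}\exp\!\left(-\frac{(h-\mu)^2}{\sigma^2}\right)
   = \frac{1}{C}\,\phi(h;\mu,s),
  \qquad s = \frac{\sigma}{\sqrt{2}},\quad C = 2\sigma\sqrt{\pi}, \\
\int_{-\infty}^{\theta}\phi(h;\mu,\sigma)^2\,dh
  &= \frac{\Phi_s(\theta)}{C}, \\
\int_{\theta}^{\infty}\phi(h;\mu,\sigma)\,
  \frac{\phi(h;\mu,\sigma)}{1-\Phi_\sigma(\theta)}\,dh
  &= \frac{1-\Phi_s(\theta)}{C\,\bigl(1-\Phi_\sigma(\theta)\bigr)}.
\end{align*}
```

The second line corresponds to value_LHS and the third to value_RHS, with normalizer playing the role of 1 - \Phi_\sigma(\theta).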

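From this we can see that expected_success_rate is integrating a likelihood that is proportional to a normal distribution, with a constant that depends on thresh. For values less than thresh the constant is C, and for values greater than thresh it is C*normalizer, where normalizer is a function of thresh. This gives positive values everywhere, so integrating expected_success_rate over a non-finite interval will give Inf.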
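The divergence can be illustrated numerically. The sketch below repeats the definitions of expected_success_rate2, s and C so that it runs on its own; the evaluation points and the upper limit of 100 are arbitrary choices made for illustration.

```r
height_dist <- c(177.5, 6.35)
s <- height_dist[2] / sqrt(2)
C <- 2 * height_dist[2] * sqrt(pi)

expected_success_rate2 <- function(thresh) {
  normalizer <- pnorm(thresh, height_dist[1], height_dist[2], lower.tail = FALSE)
  value_LHS <- pnorm(thresh, height_dist[1], s)
  value_RHS <- pnorm(thresh, height_dist[1], s, lower.tail = FALSE) / normalizer
  (value_LHS + value_RHS) / C
}

# Far below the mean the function levels off at 1/C instead of decaying to 0
far_left <- expected_success_rate2(c(-1e2, -1e4, -1e6))

# So the integral over [a, 100] grows like (100 - a)/C as a decreases
lower <- c(0, -1e3, -1e4)
areas <- sapply(lower, function(a) integrate(expected_success_rate2, a, 100)$value)
cbind(lower, areas, approx = (100 - lower) / C)
```

Because the area keeps growing in proportion to the length of the interval, a lower limit of -Inf cannot give a finite result; a finite lower limit (or a normalised quantity) is needed.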

Demonstrating that pnorm() can be used to integrate the square of a normal PDF:

height_dist <- c(177.5, 6.35)
f <- function(x) dnorm(x, height_dist[1], height_dist[2])
f1 <- function(x) f(x)*f(x)
integrate(f1, -Inf, 180)
#> 0.03159284 with absolute error < 1.5e-06
pnorm(180, height_dist[1], height_dist[2]/sqrt(2))/height_dist[2]/sqrt(pi)/2
#> [1] 0.03159284

Demonstrating the (mathematical) equivalence of the alternative version of expected_success_rate:

prob_height_given_one <- Vectorize(function(height, threshold) {
  if (height >= threshold) {
    integrand <- function(x) {
      dnorm(x, mean = height_dist[1], sd = height_dist[2])
    }
    normaliser <- integrate(integrand, threshold, Inf)
    return(prob_height_given_null(height)/normaliser$value)
  }
  else {
    return(0)
  }
})

expected_success_rate <- Vectorize(function(thresh) {
  integrand_LHS <- function(h) {
    prob_height_given_null(h) * prob_height_given_null(h)
  }
  value_LHS <- integrate(integrand_LHS, -Inf, thresh)
  integrand_RHS <- function(h) {
    prob_height_given_null(h) * prob_height_given_one(h, thresh)}
  
  value_RHS <- integrate(integrand_RHS, thresh, Inf)
  
  return(value_LHS$value + value_RHS$value)
})

expected_success_rate2 <- function(thresh) {
  normalizer <- pnorm(thresh, height_dist[1], height_dist[2], FALSE)
  value_LHS <- pnorm(thresh, height_dist[1], s)
  value_RHS <- pnorm(thresh, height_dist[1], s, FALSE)/normalizer
  (value_LHS + value_RHS)/C
}

s <- height_dist[2]/sqrt(2)
C <- 2*height_dist[2]*sqrt(pi)
rbind(
  expected_success_rate(170:180),
  expected_success_rate2(170:180)
)
#>            [,1]       [,2]       [,3]       [,4]       [,5]       [,6]
#> [1,] 0.05012842 0.05185676 0.05388938 0.05618745 0.05866877 0.06120528
#> [2,] 0.05012842 0.05185677 0.05388939 0.05618745 0.05866877 0.06120528
#>           [,7]       [,8]       [,9]      [,10]      [,11]
#> [1,] 0.0636293 0.06574995 0.06737815 0.06835568 0.06858188
#> [2,] 0.0636293 0.06574995 0.06737815 0.06835568 0.06858188

Benchmarking:

microbenchmark::microbenchmark(
  esr1 = expected_success_rate(170:180),
  esr2 = expected_success_rate2(170:180)
)
#> Unit: microseconds
#>  expr     min       lq      mean  median       uq      max neval
#>  esr1 69075.9 70143.35 72992.418 71919.7 74143.90 123038.1   100
#>  esr2     5.5     6.20    45.632    10.8    18.85   3329.4   100

A final note. Integrating probability densities in the tails can be tricky. expected_success_rate2 maintains precision, while expected_success_rate does not.

expected_success_rate(120)
#> [1] 1.716383e-05
expected_success_rate2(120)
#> [1] 0.04442438
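One likely contributor to the difference is that pnorm() can return a tail probability directly instead of relying on numerical quadrature or on subtraction from 1. The sketch below is a generic illustration on the standard normal, not taken from the question; the final line evaluates 1/(2*sigma*sqrt(pi)), the value that the closed form approaches far below the mean.

```r
# Upper-tail probability 10 standard deviations above the mean
naive  <- 1 - pnorm(10)                   # the subtraction cancels to exactly 0
direct <- pnorm(10, lower.tail = FALSE)   # computed directly: tiny but non-zero

c(naive = naive, direct = direct)

# Limit of the closed form far below the mean, with sigma = 6.35 as in height_dist
sigma <- 6.35
limit_value <- 1 / (2 * sigma * sqrt(pi))
limit_value  # approximately 0.0444, in line with expected_success_rate2(120)
```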
