R 中的快速 binom.test 以获得多个置信度?

问题描述 投票:0回答:1

我有一个数据集

df
,其中包含两个变量:
successes
(x)和
trials
(n)。我希望获得该数据集中每一行不同置信水平的
stats::binom.test()
结果(est、lower 和 upper ci)。它应该能够处理 n =
NA
0

我编写了下面的代码,但我希望能够更快地实现,因为我的真实数据集非常大(3000 万行)。任何可以提高速度的建议都将受到赞赏(矢量化、并行化......?),最好是“tidyapproach”,但不是必需的。

library(tidyverse)
#function: 
#uses lapply to loop over the conf.levels for the same x&n pair:
binom_test <- function(x, n, conf.levels = c(0.50, 0.75, 0.90, 0.95)) {
  if (is.na(n) | n < 1) {
    return(data.frame(
      conf_level = conf.levels,
      est = rep(NA, length(conf.levels)),
      x = rep(NA, length(conf.levels)),
      n = rep(NA, length(conf.levels)),
      lower_ci = rep(NA, length(conf.levels)),
      upper_ci = rep(NA, length(conf.levels))
    ))
  } else {
    results <- lapply(conf.levels, function(conf.level) {
      test <- binom.test(x = x,
                         n = n,
                         conf.level = conf.level)
      c(
        conf_level = conf.level,
        x = x,
        n = n,
        est = test$estimate[[1]],
        lower_ci = test$conf.int[1],
        upper_ci = test$conf.int[2]
      )
    })
    
    results_df <- as.data.frame(do.call(rbind, results))
    return(results_df)
  }
}

# data
df <- tibble(
  successes = round(runif(1000, 0, 100)),
  trials = round(runif(1000, 0, 100)) + 100
)

# requires rowwise (slower)
df %>%
  rowwise() %>%
  mutate(beta = list(binom_test(x = successes, n = trials))) %>%
  unnest(beta, names_sep = "_")
#> # A tibble: 4,000 × 8
#>    successes trials beta_conf_level beta_x beta_n beta_est beta_lower_ci
#>        <dbl>  <dbl>           <dbl>  <dbl>  <dbl>    <dbl>         <dbl>
#>  1        78    180            0.5      78    180    0.433         0.406
#>  2        78    180            0.75     78    180    0.433         0.389
#>  3        78    180            0.9      78    180    0.433         0.371
#>  4        78    180            0.95     78    180    0.433         0.360
#>  5        87    140            0.5      87    140    0.621         0.590
#>  6        87    140            0.75     87    140    0.621         0.570
#>  7        87    140            0.9      87    140    0.621         0.549
#>  8        87    140            0.95     87    140    0.621         0.536
#>  9        89    129            0.5      89    129    0.690         0.658
#> 10        89    129            0.75     89    129    0.690         0.637
#> # ℹ 3,990 more rows
#> # ℹ 1 more variable: beta_upper_ci <dbl>

# avoids rowwise but still slow
df %>%
  mutate(beta = pmap(list(successes, trials), binom_test)) %>%
  unnest(beta, names_sep = "_")
#> # A tibble: 4,000 × 8
#>    successes trials beta_conf_level beta_x beta_n beta_est beta_lower_ci
#>        <dbl>  <dbl>           <dbl>  <dbl>  <dbl>    <dbl>         <dbl>
#>  1        78    180            0.5      78    180    0.433         0.406
#>  2        78    180            0.75     78    180    0.433         0.389
#>  3        78    180            0.9      78    180    0.433         0.371
#>  4        78    180            0.95     78    180    0.433         0.360
#>  5        87    140            0.5      87    140    0.621         0.590
#>  6        87    140            0.75     87    140    0.621         0.570
#>  7        87    140            0.9      87    140    0.621         0.549
#>  8        87    140            0.95     87    140    0.621         0.536
#>  9        89    129            0.5      89    129    0.690         0.658
#> 10        89    129            0.75     89    129    0.690         0.637
#> # ℹ 3,990 more rows
#> # ℹ 1 more variable: beta_upper_ci <dbl>

#parallel?
# library(furrr)
# plan(multisession, workers = 5)
# df %>%
#   mutate(beta = furrr::future_pmap(list(successes, trials), binom_test)) %>%
#   unnest(beta, names_sep = "_")

创建于 2023-11-24,使用 reprex v2.0.2

r statistics vectorization lapply
1个回答
0
投票

估价不过是

x/n
。当
qbeta(0.025, x, n-x+1)
时,95%-CI 的下限为
x > 0
,当
qbeta(0.975, x+1, n-x)
时,95%-CI 的上限为
x < n
Clopper-Pearson 置信区间)。另一个置信水平类似。
qbeta
函数已向量化,因此请使用此函数代替
binom.test
。您必须使用上面链接中给出的公式来处理特殊情况
x=0
x=n

© www.soinside.com 2019 - 2024. All rights reserved.