How do I obtain bootstrap BCa confidence intervals for a Spearman correlation matrix in R?

Question

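Good evening, R experts,

I am producing a Spearman correlation matrix in R from all the variables in my current research project. To stay consistent with the other analyses I am running, I would like to construct BCa bootstrap confidence intervals around each Spearman coefficient using 10,000 iterations. However, I have not yet found a valid and reliable way to do this.

I used the code below to create the Spearman correlation matrix with coefficients and p-values. Note that the data are randomly generated for the sake of this example.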

library(psych)
library(rstatix)
library(ggcorrplot)  # needed for the ggcorrplot() call below

df_test <- data.frame(
    recall=c(45, 1, 32, 17, 79, 15, 75, 100, 43, 80, 74, 91, 60, 54, 67, 26, 97, 53, 51, 30),
    recog=c(90, 29, 93, 73, 34, 68, 78, 56, 92, 85, 35, 81, 7, 58, 4, 52, 82, 31, 6, 23),
    hits=c(77, 89, 44, 8, 70, 96, 76, 62, 95, 27, 49, 12, 16, 28, 83, 2, 36, 10, 61, 86),
    misses=c(59, 78, 14, 44, 86, 61, 80, 72, 25, 93, 5, 42, 64, 95, 73, 54, 7, 67, 11, 53),
    false_alarms=c(32, 34, 55, 41, 77, 76, 89, 36, 12, 100, 70, 62, 47, 81, 90, 63, 13, 83, 79, 38)
)

df_test.cor <- cor(df_test, y = NULL, use = "pairwise.complete.obs",
    method = c("spearman"))

#Get coefficients
df_test.mat <- cor_mat(
  df_test,
  vars = NULL,
  method = "spearman",
  alternative = "two.sided",
  conf.level = 0.95
)

df_test.mat

#Get sig values
df_test.pmat <- cor_pmat(
  df_test,
  vars = NULL,
  method = "spearman",
  alternative = "two.sided",
  conf.level = 0.95
)

df_test.pmat

ggcorrplot(df_test.mat,
           type = "lower", method = "circle", colors = c("turquoise3", "khaki1", "violetred3"), p.mat = df_test.pmat)


cor.ci(df_test, keys = NULL, n.iter = 10000,  p = 0.050, poly = FALSE, method = "spearman")

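The results are as follows:

1. Correlation coefficients and p-values

2. Visualization

3. Bootstrapped 95% CIs from the psych package

There are a number of issues with these confidence intervals, including:

The values do not correspond to those generated by the cor_pmat() command from the rstatix package.

Therefore, my question is:

How can I construct bootstrap confidence intervals around the Spearman coefficients across the whole correlation matrix using the BCa method, reported to 4 decimal places?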

Answer 1

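I arrived at a solution very similar to Mikko Marttila's, although I still think there is no need to focus so much on the BCa interval. I ran a simulation on a standard bivariate normal distribution with a correlation of 0.5 (for which the Spearman rank correlation is also known) and determined the coverage properties of the most common bootstrap confidence intervals: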

## Using bivariate normal sampling distribution
rho <- .5
sigma <- matrix(rho, ncol=2, nrow=2)
diag(sigma) <- 1

## True Spearman rank correlation + coverage function
GROUND_TRUTH <- (6/pi)*asin(rho/2)
covers <- function(ci) (ci[1] <= GROUND_TRUTH) & (GROUND_TRUTH <= ci[2])

## Function to bootstrap a single Spearman correlation
bootfun <- function(data, i) {
   r <- cor(data[i,], method = "spearman")
   r[lower.tri(r)]
}

## Generate one sample & bootstrap it
## Returns whether the obtained intervals cover the true parameter
one_boot <- function(seed, n=100) {
   set.seed(seed)
   data <- mvtnorm::rmvnorm(n, sigma = sigma)
   b <- boot::boot(data, bootfun, R=1E4)
   ci <- boot::boot.ci(b, type = c("norm", "basic", "perc", "bca"))
   c("norm" = covers(ci$normal[2:3]),  "basic"= covers(ci$basic[4:5]),
     "perc" = covers(ci$percent[4:5]), "bca"  = covers(ci$bca[4:5]))
}

## Generate many samples & bootstrap each of them
## Returns coverage properties of each interval
future::plan("multisession", workers=4)
many_boots <- future.apply::future_vapply(seq_len(1E4), one_boot,
                                          logical(4), future.seed = TRUE) |>
   matrixStats::rowMeans2()
#>   norm  basic   perc    bca 
#> 0.9399 0.9297 0.9535 0.9544

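As you can see, the plain percentile interval has coverage properties at least as good as those of the BCa interval, and this is under a theoretically ideal standard normal (where the acceleration factor is also exact). If your sampling distribution has a less well-defined skewness or is not smooth, the BCa interval will perform worse than the alternatives.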
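For reference, the GROUND_TRUTH value in the simulation comes from the closed-form relation between the Pearson and Spearman correlations under bivariate normality, rho_s = (6/pi) * asin(rho/2). Below is a quick sanity check of that value. The sample size, the seed and the hand-rolled bivariate normal draw are my own illustrative choices, not part of the simulation above:

```r
## Closed-form Spearman correlation for a bivariate normal with Pearson rho
rho <- 0.5
ground_truth <- (6 / pi) * asin(rho / 2)   # approximately 0.4826

## Compare against a large simulated sample (illustrative settings)
set.seed(123)
n  <- 1e5
z1 <- rnorm(n)
z2 <- rnorm(n)
x  <- z1
y  <- rho * z1 + sqrt(1 - rho^2) * z2      # cor(x, y) equals rho in expectation
estimate <- cor(x, y, method = "spearman")

c(closed_form = ground_truth, simulated = estimate)
```

The two numbers should agree to about two decimal places at this sample size.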

As a bonus, here is how to obtain a single two-sided p-value for a Spearman correlation by resampling (under the null!):

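permute_spearman <- function(x, y, iters=1E4) {
   ## Two-sided test: compare absolute values on both sides,
   ## otherwise a negative observed correlation always gives p = 1
   t0 <- abs(cor(x, y, method="spearman"))
   mean(vapply(seq_len(iters), \(d) {
      abs(cor(x, y[sample(seq_along(y))], method="spearman")) >= t0
   }, logical(1)))
}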
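Here is a minimal usage sketch. The function is restated so the block runs on its own, with the observed statistic taken in absolute value so that negative correlations are handled too. The data are simulated and purely illustrative, and the reduced iteration count is only there to keep it quick:

```r
## Permutation p-value for a Spearman correlation (two-sided)
permute_spearman <- function(x, y, iters = 1E4) {
  t0 <- abs(cor(x, y, method = "spearman"))
  mean(vapply(seq_len(iters), function(d) {
    # Shuffling y breaks any association with x, i.e. simulates the null
    abs(cor(x, y[sample(seq_along(y))], method = "spearman")) >= t0
  }, logical(1)))
}

## Illustrative data: a strong monotone relationship plus noise
set.seed(1)
x <- 1:30
y <- x + rnorm(30, sd = 5)

p_pos <- permute_spearman(x,  y, iters = 2000)  # positive association
p_neg <- permute_spearman(x, -y, iters = 2000)  # same strength, negative sign
c(positive = p_pos, negative = p_neg)
```

Both p-values should come out small here, because the test depends only on the strength of the association and not on its sign.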

Answer 2

You can compute BCa intervals with the boot package.

First, create a statistic function that returns the Spearman correlations for a subset i of the data:

spearman <- function(d, i) {
  rho <- cor(d[i, ], method = "spearman")
  rho[lower.tri(rho)]
}

Then use boot() to draw the bootstrap samples:

set.seed(42)
boot_out <- boot::boot(df_test, spearman, R = 10000)
boot_out
#> 
#> ORDINARY NONPARAMETRIC BOOTSTRAP
#> 
#> 
#> Call:
#> boot::boot(data = df_test, statistic = spearman, R = 10000)
#> 
#> 
#> Bootstrap Statistics :
#>         original       bias    std. error
#> t1*   0.09172932 -0.004161858   0.2181451
#> t2*  -0.22105263  0.008094763   0.2623494
#> t3*   0.13684211 -0.009041776   0.2462778
#> t4*   0.22406015 -0.004986228   0.2420079
#> t5*  -0.05413534  0.003039217   0.2235030
#> t6*  -0.18345865  0.006896525   0.2341026
#> t7*  -0.29022556  0.006665581   0.2469688
#> t8*   0.08872180 -0.004816326   0.1924127
#> t9*  -0.20902256  0.011034138   0.2315252
#> t10*  0.48120301 -0.021141040   0.2029307
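The rows t1* to t10* follow the order in which lower.tri() walks the matrix, that is, column by column. Below is a small sketch for labelling them, assuming the column order of df_test from the question. The labels object and the " ~ " separator are just illustrative:

```r
## Column names in the same order as in df_test
vars <- c("recall", "recog", "hits", "misses", "false_alarms")

## Row/column positions of the lower triangle, in column-major order,
## which is the order used by rho[lower.tri(rho)]
pos <- which(lower.tri(diag(length(vars))), arr.ind = TRUE)

labels <- paste(vars[pos[, "row"]], vars[pos[, "col"]], sep = " ~ ")
names(labels) <- paste0("t", seq_along(labels))
labels
```

So t1 is the recog/recall correlation and t10 is the false_alarms/misses correlation.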

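Finally, compute the confidence intervals with boot.ci(). The index argument specifies which element of the output statistic you want the interval for: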

boot::boot.ci(boot_out, type = "bca", index = 10)
#> BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
#> Based on 10000 bootstrap replicates
#> 
#> CALL : 
#> boot::boot.ci(boot.out = boot_out, type = "bca", index = 10)
#> 
#> Intervals : 
#> Level       BCa          
#> 95%   (-0.0322,  0.7649 )  
#> Calculations and Intervals on Original Scale

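A vector-valued index has a different meaning (the second element is interpreted as giving the variance of the first), so to get all the CIs you need to iterate explicitly, for example with lapply():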

lapply(seq_along(boot_out$t0), function(i) {
  boot::boot.ci(boot_out, type = "bca", index = i)
})

If you only need the intervals from the output, you can extract them from the $bca element:

sapply(seq_along(boot_out$t0), function(i) {
  boot::boot.ci(boot_out, type = "bca", index = i)$bca[, 4:5]
})
#>        [,1]       [,2]       [,3]       [,4]       [,5]       [,6]       [,7]
#>  -0.3439443 -0.6648910 -0.3743482 -0.3385006 -0.4651028 -0.6057627 -0.7137479
#>   0.5040802  0.3668189  0.5936317  0.6228635  0.4083453  0.3079268  0.2545514
#>        [,8]       [,9]      [,10]
#>  -0.2991324 -0.6217852 -0.0321938
#>   0.4506211  0.2709281  0.7648544
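To get back to the matrix layout the question asks for, one option is to write the lower and upper limits into two symmetric matrices and round them to 4 decimal places. This is only a sketch: it repeats the data and the statistic function so that it runs on its own, and ci_lower and ci_upper are names I made up for illustration:

```r
library(boot)

df_test <- data.frame(
  recall = c(45, 1, 32, 17, 79, 15, 75, 100, 43, 80, 74, 91, 60, 54, 67, 26, 97, 53, 51, 30),
  recog = c(90, 29, 93, 73, 34, 68, 78, 56, 92, 85, 35, 81, 7, 58, 4, 52, 82, 31, 6, 23),
  hits = c(77, 89, 44, 8, 70, 96, 76, 62, 95, 27, 49, 12, 16, 28, 83, 2, 36, 10, 61, 86),
  misses = c(59, 78, 14, 44, 86, 61, 80, 72, 25, 93, 5, 42, 64, 95, 73, 54, 7, 67, 11, 53),
  false_alarms = c(32, 34, 55, 41, 77, 76, 89, 36, 12, 100, 70, 62, 47, 81, 90, 63, 13, 83, 79, 38)
)

## Statistic: lower triangle of the Spearman matrix for resampled rows i
spearman <- function(d, i) {
  rho <- cor(d[i, ], method = "spearman")
  rho[lower.tri(rho)]
}

set.seed(42)
boot_out <- boot::boot(df_test, spearman, R = 10000)

## 2 x 10 matrix: row 1 = lower limits, row 2 = upper limits
ci <- sapply(seq_along(boot_out$t0), function(i) {
  boot::boot.ci(boot_out, type = "bca", index = i)$bca[, 4:5]
})

## Empty p x p matrices labelled with the variable names
p <- ncol(df_test)
ci_lower <- ci_upper <- matrix(NA_real_, p, p,
                               dimnames = list(names(df_test), names(df_test)))

## Fill the lower triangle (same column-major order as the statistic) ...
ci_lower[lower.tri(ci_lower)] <- ci[1, ]
ci_upper[lower.tri(ci_upper)] <- ci[2, ]

## ... and mirror it into the upper triangle; the diagonal stays NA
ci_lower[upper.tri(ci_lower)] <- t(ci_lower)[upper.tri(ci_lower)]
ci_upper[upper.tri(ci_upper)] <- t(ci_upper)[upper.tri(ci_upper)]

round(ci_lower, 4)
round(ci_upper, 4)
```

The diagonal is left as NA because the correlation of a variable with itself is always 1, so it has no interval.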
