How can I use parameter term names in the avg_predictions() or hypotheses() functions?

Question

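I'm trying to use the avg_predictions() function and its hypothesis argument, from the powerful marginaleffects package, to estimate differences in predicted probabilities between groups. However, I'm having trouble with the term names when the by argument contains more than one variable. I would like to identify each parameter by its term name rather than by position (b1, b2, and so on), because the positions are not consistent when I run avg_predictions() for different outcomes with the same list of covariates. The code below is an example of what I'm trying to do.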

When I run this:

library(marginaleffects)

# HealthInsurance ships with the AER package
data("HealthInsurance", package = "AER")
mod <- glm(insurance ~ ethnicity * health + age + married + family + selfemp + region,
  family = binomial(link = "logit"),
  data = HealthInsurance)

avg_predictions(mod, by = c("ethnicity", "health"), type = "response")

I get the following:

ethnicity health Estimate Std. Error     z Pr(>|z|)     S 2.5 % 97.5 %
 cauc     yes    0.816    0.00448 181.9   <0.001   Inf 0.807  0.824
 afam     yes    0.764    0.01292  59.2   <0.001   Inf 0.739  0.790
 cauc     no     0.725    0.01896  38.2   <0.001   Inf 0.688  0.762
 afam     no     0.727    0.04298  16.9   <0.001 210.9 0.643  0.812
 other    yes    0.752    0.02252  33.4   <0.001 808.6 0.707  0.796
 other    no     0.771    0.06839  11.3   <0.001  95.6 0.637  0.905
Columns: ethnicity, health, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high  
Type:  response

For example, I would like to estimate the following differences in predicted probabilities:

avg_predictions(mod, by = c("ethnicity", "health"), type = "response",
            hypothesis = c("b1-b2=0", "b1-b5=0", "b3-b4=0", "b3-b6=0"))

which produces this:

    Term Estimate Std. Error      z Pr(>|z|)    S   2.5 % 97.5 %
 b1-b2=0  0.05134     0.0137  3.754  < 0.001 12.5  0.0245 0.0781
 b1-b5=0  0.06406     0.0230  2.789  0.00528  7.6  0.0190 0.1091
 b3-b4=0 -0.00202     0.0470 -0.043  0.96570  0.1 -0.0941 0.0901
 b3-b6=0 -0.04618     0.0710 -0.651  0.51526  1.0 -0.1853 0.0929
Columns: term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
Type:  response

However, instead of using parameter positions, I would like to do something like this (which does not work):

avg_predictions(mod, by = c("ethnicity", "health"), type = "response",
            hypothesis = c("`cauc yes` - `afam yes` = 0"))

I got the term names by running:

avg_predictions(mod, by = c("ethnicity", "health"), type = "response") |> tidy()

which produces this:

# A tibble: 6 × 10
  ethnicity health estimate std.error statistic   p.value s.value conf.low conf.high term     
  <fct>     <fct>     <dbl>     <dbl>     <dbl>     <dbl>   <dbl>    <dbl>     <dbl> <chr>    
1 cauc      yes       0.816   0.00448     182.  0           Inf      0.807     0.824 cauc yes 
2 afam      yes       0.764   0.0129       59.2 0           Inf      0.739     0.790 afam yes 
3 cauc      no        0.725   0.0190       38.2 0           Inf      0.688     0.762 cauc no  
4 afam      no        0.727   0.0430       16.9 3.17e- 64   211.     0.643     0.812 afam no  
5 other     yes       0.752   0.0225       33.4 3.74e-244   809.     0.707     0.796 other yes
6 other     no        0.771   0.0684       11.3 1.64e- 29    95.6    0.637     0.905 other no

I'm not sure what I might be doing wrong. Any help would be greatly appreciated!

Answer 1

There is currently no built-in way to do what you want.

One way to be fully explicit is to define a custom contrast function. See the example pasted below and the documentation here: https://marginaleffects.com/bonus/hypothesis.html#functions

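Alternatively, I encourage you to look at the avg_comparisons() function. It produces a different quantity, but that quantity may (or may not) be what you are actually looking for.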

library(marginaleffects)

data("HealthInsurance", package = "AER")
mod <- glm(insurance ~ ethnicity * health + age + married + family + selfemp + region,
  family = binomial(link = "logit"),
  data = HealthInsurance)

hyp <- function(x) {
  x$term <- paste(x$ethnicity, x$health)
  val <- c(
    x$estimate[x$term == "other yes"] - x$estimate[x$term == "other no"],
    x$estimate[x$term == "afam yes"] - x$estimate[x$term == "afam no"]
  )
  lab <- c("(other yes) - (other no)", "(afam yes) - (afam no)")
  out <- data.frame(term = lab, estimate = val)
  return(out)
}

avg_predictions(mod, by = c("ethnicity", "health"), hypothesis = hyp)
#> 
#>                      Term Estimate Std. Error      z Pr(>|z|)   S  2.5 % 97.5 %
#>  (other yes) - (other no)  -0.0199     0.0720 -0.277    0.782 0.4 -0.161  0.121
#>  (afam yes) - (afam no)     0.0370     0.0449  0.823    0.410 1.3 -0.051  0.125
#> 
#> Type:  response 
#> Columns: term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high
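
The custom function above hard-codes each comparison. As a minimal sketch, the same idea can be generalized with a small factory, here called make_hyp() (a hypothetical name, not part of marginaleffects), that builds such a function from pairs of row labels. It assumes, as in the example above, that the data frame passed to the hypothesis function contains the by columns and an estimate column. The mock data frame reuses the rounded estimates from the question so that the lookup can be illustrated without fitting a model.

```r
# Hypothetical helper (not part of marginaleffects): build a hypothesis
# function that compares rows identified by label instead of by position.
make_hyp <- function(pairs, by = c("ethnicity", "health")) {
  function(x) {
    # Label each row by pasting the `by` columns together, e.g. "cauc yes"
    key <- do.call(paste, unname(lapply(by, function(v) as.character(x[[v]]))))

    # Find the row for a label; fail loudly if the label does not exist
    idx <- function(label) {
      i <- match(label, key)
      if (is.na(i)) stop("No row labelled '", label, "'")
      i
    }

    # One difference per pair: first label minus second label
    est <- vapply(pairs,
      function(p) x$estimate[idx(p[1])] - x$estimate[idx(p[2])],
      numeric(1), USE.NAMES = FALSE)
    lab <- vapply(pairs,
      function(p) sprintf("(%s) - (%s)", p[1], p[2]),
      character(1), USE.NAMES = FALSE)

    data.frame(term = lab, estimate = est)
  }
}

# Mock input with the same columns the answer's function relies on
# (assumption: rounded estimates copied from the question's output)
mock <- data.frame(
  ethnicity = c("cauc", "afam", "cauc", "afam"),
  health    = c("yes", "yes", "no", "no"),
  estimate  = c(0.816, 0.764, 0.725, 0.727)
)

hyp2 <- make_hyp(list(
  c("cauc yes", "afam yes"),
  c("cauc no", "afam no")
))

hyp2(mock)
```

The resulting function could then be passed as hypothesis = hyp2 in the avg_predictions() call shown above. Because rows are matched by label, the comparison does not depend on row order, and an unknown label raises an error instead of silently returning an empty result.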
