Obsolete data mask: too late to resolve


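Hi all, I have a question about non-standard evaluation. I fit several models with different outcome variables and compute marginal effects like this.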

library(palmerpenguins)
library(marginaleffects)
library(sandwich)
library(tidyr)
library(dplyr)


long_pengs = penguins |>
  pivot_longer(cols = c(body_mass_g, flipper_length_mm),
               names_to = 'outcome',
              values_to = 'vals') |>
  drop_na(sex) |>
  summarise(mods = list(lm(vals ~ sex * bill_length_mm, data = pick(everything()))), .by = outcome)


comps = long_pengs |>
  rowwise(outcome) |>
  reframe(avg_comparisons(mods,
                          variables = 'sex',
                          subset(sex == 'female')))

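However, when I try to compute cluster-bootstrapped standard errors, I get the error messages below, and I would like to know how to resolve them. I am not deliberately doing anything with non-standard evaluation here.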


# works 
long_pengs |> 
  rowwise(outcome) |>
  reframe(avg_comparisons(mods,
    variables = 'sex',
    subset(sex == 'female')) |>
      inferences(method = 'rsample')) 

long_pengs |>
  rowwise(outcome) |>
  reframe(avg_comparisons(mods,
                          variables = 'sex',
                          subset(sex == 'female'),
                          vcov = vcovBS(mods, cluster = ~species)))
#> Error in `reframe()`:
#> ℹ In argument: `avg_comparisons(...)`.
#> ℹ In row 1.
#> Caused by error:
#> ! Obsolete data mask.
#> ✖ Too late to resolve `species` after the end of `dplyr::summarise()`.
#> ℹ Did you save an object that uses `species` lazily in a column in the
#>   `dplyr::summarise()` expression ?

long_pengs |> 
  rowwise(outcome) |>
  reframe(avg_comparisons(mods,
    variables = 'sex',
    subset(sex == 'female')) |>
      inferences(method = 'rsample', strata = species))
#> Error in `reframe()`:
#> ℹ In argument: `inferences(...)`.
#> ℹ In row 1.
#> Caused by error:
#> ! object 'species' not found

The desired output looks like this:

## desired output 

m1 = lm(body_mass_g ~ sex * bill_length_mm, data = penguins)

c1 = avg_comparisons(m1, variables = 'sex',
                     subset(sex == 'female'),
                    vcov = vcovBS(m1, cluster = ~species))

m2 = lm(flipper_length_mm ~ sex * bill_length_mm, data = penguins)

c2 = avg_comparisons(m2, variables = 'sex',
                     subset(sex == 'female'),
                     vcov = vcovBS(m2, cluster = ~species))

rbind(c1, c2)
#> 
#>  Estimate Std. Error     z Pr(>|z|)   S   2.5 % 97.5 %
#>   420.487     293.10 1.435    0.151 2.7 -153.97  994.9
#>     0.392       5.16 0.076    0.939 0.1   -9.73   10.5
#> 
#> Term: sex
#> Type:  response 
#> Comparison: mean(male) - mean(female)
#> Columns: term, contrast, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted

Created on 2024-11-13 with reprex v2.1.1

r dplyr non-standard-evaluation r-marginaleffects
1 Answer

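You can build it like this, which avoids capturing the problematic environment. Hopefully someone will post a more elegant solution in due course. Note that I use library(purrr) to make the iteration convenient.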

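library(purrr)

# Fit one model per outcome outside of summarise(), so each model's stored
# call refers to an ordinary data frame `x` rather than to a dplyr data mask
long_pengs0 = penguins |>
  pivot_longer(cols = c(body_mass_g, flipper_length_mm),
               names_to = 'outcome',
               values_to = 'vals') |>
  drop_na(sex) |>
  split(~outcome) |>
  map(\(x) lm(vals ~ sex * bill_length_mm, data = x))

# Rebuild the outcome / mods table, one row per fitted model
long_pengs <- imap_dfr(long_pengs0, \(x, y) {
  tibble(
    outcome = y,
    mods = list(x)
  )
})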
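
To illustrate why fitting the models in a plain function may help, here is a minimal base-R sketch. It rests on an assumption I have not checked against the sandwich source: that resolving cluster = ~species goes back to the model's stored call and the environment of its formula, much as stats::expand.model.frame() does. The helper fit_one and the use of mtcars are purely illustrative and are not part of the question.

```r
# Illustrative helper (hypothetical name): fit a model inside a plain function
fit_one <- function(x) lm(mpg ~ wt, data = x)

m <- fit_one(mtcars)

# lm() stores the *expression* passed as `data`, not the data itself
print(m$call$data)

# The formula remembers the environment it was created in; here that is the
# execution environment of fit_one(), where `x` is still available
env <- environment(formula(m))
print(exists("x", envir = env, inherits = FALSE))

# expand.model.frame() re-evaluates the stored `data` expression in that
# environment to fetch a variable that was not in the original formula
mf <- expand.model.frame(m, ~cyl)
print(names(mf))
```

In the question, the stored expression is pick(everything()) and the formula's environment is the data mask of summarise(), which no longer exists once summarise() has returned. That would be consistent with the "Obsolete data mask" message. With long_pengs rebuilt as above, it still has the outcome and mods columns, so the rowwise() |> reframe() call with vcov = vcovBS(mods, cluster = ~species) from the question can be retried against it unchanged; that call has not been re-run here.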