（零膨胀）具有康利标准误差和权重的负二项式回归

Question

我的目标是使用康利标准误差和权重进行负二项式回归。下面我分别询问零膨胀负二项式模型。

据我所知，可以使用 glm.nb() 和 fenegbin() 运行负二项式回归。在第一个中，我可以使用权重，在第二个中我可以实现康利标准误差。反之亦然，我目前无法应用相应的选项。

（也许相关：权重是实际应用中的熵平衡权重。）

代表：

library(MASS)
library(modelsummary)
library(fixest)

# Step 2: Create a dummy dataframe
set.seed(123)  # For reproducibility
n <- 100  # Number of observations

# Dummy data
data <- data.frame(
  y = rnbinom(n, size = 1, mu = 10),  # Negative binomial response variable
  x1 = rnorm(n),  # Continuous predictor
  x2 = rbinom(n, 1, 0.5),  # Binary predictor
  weight = runif(n, 0.5, 2),  # Weights
  lat = runif(n, -90, 90),  # Latitude
  lon = runif(n, -180, 180)  # Longitude
)

我有：

m1 <- glm.nb(y ~ x1 + x2, data = data, weights = weight)

...我找不到如何应用康利标准错误。

或者，我有：

m2 <- fenegbin(y ~ x1 + x2, data=data,
                vcov=vcov_conley(cutoff=100, lat="lat",lon="lon"))

...权重似乎不可用。

我也尝试过

m3a <- fenegbin(y ~ x1 + x2, data=data,
             vcov=vcov_conley(cutoff=100, lat="lat",lon="lon"),
             only.env=TRUE)
m3 <- est_env(m3a, weights = data$weight)

...贯穿但不改变结果。出了点问题。

我在模型摘要中报告（并随后导出）结果。

modelsummary(list(m1,m2,m3),stars=T)

...我知道“vcov”选项非常通用。然而，我找不到康利标准错误的选项。

我还需要这个用于零膨胀负二项式（zinb）模型。该模型看起来像这样，缺少康利标准误差：

library(pscl)
m4 <- zeroinfl(y ~ x1 + x2 , data=data, dist="negbin", weights=weight)

非常感谢任何帮助！

Answer 1

据我所知，我认为使用基础 R 或我所知道的任何软件包不可能实现你想要的。当然，这并不意味着我没有忽略一些事情。正如您所发现的：

```
MASS::glm.nb()
```
支持权重，但不支持康利标准误。
```
fixest::fenegbin()
```
支持康利标准误，但不支持权重。
```
pscl::zeroinfl()
```
支持权重但不支持康利标准误。

...没有一个能满足您的要求。其中一种可能性是使用模仿 Conley SE 属性的标准错误，例如使用

clubSandwich

包。另一种可能性是为 Conley SE 实现您自己的计算。我摆弄着尝试做到这一点，但我并没有走得太远，而且目前我没有时间做更多的事情。这是我所拥有的。也许你或其他人可以更进一步。一般做法是：

根据空间坐标计算观测值之间的成对距离，以获得距离矩阵
使用截止距离来确定哪对观测值应相互影响。
使用空间权重调整经验估计函数。

我犹豫是否要发布此代码，因为它无法按预期工作，但无论如何我都会继续并希望有人发现它有用：

数据应如下所示：

> head(data)
   y         x1 x2    weight        lat         lon
1  3  0.8793641  0 1.4801417   8.645812   -7.786999
2 22  0.9231292  0 1.8739795 -64.683686  101.508809
3 10  0.6931340  1 1.7196697 -59.441571 -163.430443
4  2 -0.1395003  0 1.5138350  47.157358  115.144080
5 10 -3.3106787  0 1.7112487   4.931089  -83.012959
6 18  0.5384658  0 0.6920966  64.978084  -78.173851

# Function to compute Conley standard errors
conley_se <- function(model, lat, lon, cutoff, data) {
    #   model  : A fitted model object (glm or glm.nb object)
    #   lat    : The name of the latitude variable 
    #   lon    : The name of the longitude variable 
    #   cutoff : The distance cutoff for determining spatial correlation
    #   data   : A data frame containing the model variables and coordinates 

  coordinates <- cbind(data[[lat]], data[[lon]])
  
  # Calculate pairwise distances between observations
  dist_matrix <- sp::spDists(coordinates, longlat = TRUE)
  
  # Apply cutoff distance to create weight matrix
  weight_matrix <- (dist_matrix <= cutoff) * 1
  
  # Compute the empirical estimation function (scores)
  estfun <- sandwich::estfun(model)
  
  # Compute the 'meat' of the sandwich estimator
  meat <- crossprod(estfun) / nrow(estfun)
  
  # Compute the 'bread' of the sandwich estimator
  bread <- sandwich::bread(model)
  
  # Initialize the HAC covariance matrix
  vcov_hac <- bread %*% meat %*% bread
  
  # Adjust the covariance matrix for spatial correlation
  # This loop incorporates the spatial weights based on distances
  for (i in 1:nrow(weight_matrix)) {
    for (j in 1:ncol(weight_matrix)) {
      if (i != j && weight_matrix[i, j] != 0) {
        vcov_hac <- vcov_hac + weight_matrix[i, j] * outer(estfun[i, ], estfun[j, ])
      }
    }
  }
  
  # Return the adjusted covariance matrix
  return(vcov_hac)
}

用途：

# Fit negative binomial model with weights
m1 <- glm.nb(y ~ x1 + x2, data = data, weights = weight)

# Compute Conley standard errors
conley_vcov <- conley_se(m1, "lat", "lon", 100, data)

# Summary with custom Conley standard errors
summary1 <- coeftest(m1, vcov = conley_vcov)

（零膨胀）具有康利标准误差和权重的负二项式回归

问题描述投票：0回答：1

1个回答

最新问题

（零膨胀）具有康利标准误差和权重的负二项式回归

问题描述 投票：0回答：1

1个回答

最新问题

问题描述投票：0回答：1