模拟异方差

Question

我正在尝试模拟具有不同程度异方差的简单回归模型。我想做一个简单的模拟，以显示增加的异方差如何影响模型的各个部分（系数与标准误差等）。这是到目前为止的代码：

N <- 1e4
x <- rnorm(N, 10, 1)

results <- data.frame(i = numeric(),
                      beta = numeric(),
                      se = numeric(),
                      var_y = numeric())

for (i in 1:100) {
  y <- rnorm(N,.7*x, i/x)
  
  fit <- lm(y ~ x)
  r <- c(i = i, beta = coef(fit)[[2]], se = summary(fit)$coefficients[, 2][[2]], var_y = var(y))
  results <- rbind(results, r)
}

colnames(results) <- c("i", "beta", "se", "var_y")
plot(results$beta)
plot(results$se)
plot(results$var_y)

问题是我担心我不仅引入了异方差性，而且还增加了 y 的整体方差，这本身就会影响模型结果。

有没有什么方法可以在不改变其他任何东西的情况下增加异方差的量？我是否想得太多并在模型内标准化 y （所以运行

lm(scale(y) ~ x)

）就足够了？

Answer 1

也许像这样？

n <- 1e4

set.seed(42)
x <- runif(n, 0, 100)
sd <- 0.1
y1 <- 10 + 0.1 * x + rnorm(n, sd = sd)

var(y1)
#[1] 8.458331

summary(mod1 <- lm(y1 ~ x))
# Coefficients:
#              Estimate Std. Error t value Pr(>|t|)    
# (Intercept) 1.000e+01  1.996e-03    5010   <2e-16 ***
# x           9.998e-02  3.456e-05    2892   <2e-16 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# Residual standard error: 0.1005 on 9998 degrees of freedom


const <- 0.5
power <- 2

#this is the function from nlme::varConstPower, other functions are possible, e.g., help("varClasses", package = "nlme)
var2 <- (const + abs(x)^power)^2
var2 <- var2/mean(var2) * sd^2
y2 <- 10 + 0.1 * x + rnorm(n, sd = sqrt(var2))

var(y2)
#[1] 8.46444
#Note that this derived from the sample. 
#You should care more about the true variance (or the residual standard error below)


summary(mod20 <- lm(y2 ~ x))
# Coefficients:
#              Estimate Std. Error t value Pr(>|t|)    
# (Intercept) 1.000e+01  2.006e-03    4984   <2e-16 ***
# x           1.000e-01  3.474e-05    2879   <2e-16 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# Residual standard error: 0.101 on 9998 degrees of freedom

library(nlme)
summary(mod21 <- gls(y2 ~ x, weights = varConstPower(form = ~ x)))
#     const     power 
# 0.4123067 1.9807457 
# 
# Coefficients:
#                 Value    Std.Error t-value p-value
# (Intercept) 10.000000 1.750794e-06 5711695       0
# x            0.100003 2.755056e-06   36298       0

模拟异方差

问题描述投票：0回答：1

1个回答

最新问题

模拟异方差

问题描述 投票：0回答：1

1个回答

最新问题

问题描述投票：0回答：1