如何生成数据以显示分位数回归与线性回归之间的差异

问题描述 投票:0回答:1

我想生成数据以显示分位数回归如何有用,特别是当线性回归显示两个变量之间没有(均值)相关性时,则在较高和/或较低的数据分位数中存在相关性。

可以在下图中找到此行为:

changes in cutthroat trout density (y) as a function of the ratio of stream width to depth (X)

但是我想用对我的研究领域有意义的数据来重现这种行为,而这恰好是住房经济学。

r linear-regression simulation data-generation quantreg
1个回答
0
投票

我想我找到了一种方法。我不知道这是否是最好的方法,但是我做到了:使用truncnorm包,可以创建被截断的普通数据,通过它我可以创建此图,这对我来说似乎很好。

set.seed(5)
x = rtruncnorm(500, a = 125, b = 600, mean = 360, sd = 100)
b0 = 2500 # intercept chosen at your choice
b1 = 0 # coef chosen at your choice
h = function(x) 250 - .25*x # h performs heteroscedasticity function (here I used a linear one)
i = function(x) b0 + .25*x # lower bound limit
y = rtruncnorm(500, a = i(x), mean = b0, sd = h(x))
#y = b0 + b1*x + eps - h(x)
dados <- data.frame(VU = y, Area = x)
#X <- X[which(X[, "y"] > 2400), ]
fit <- lm(VU ~ Area, data = dados)
fit1 <- rq(VU ~ Area, tau = .10, data = dados)
fit2 <- rq(VU ~ Area, tau = .25, data = dados)
fit3 <- rq(VU ~ Area, tau = .50, data = dados)
fit4 <- rq(VU ~ Area, tau = .75, data = dados)
fit5 <- rq(VU ~ Area, tau = .90, data = dados)
fit6 <- rq(VU ~ Area, tau = .95, data = dados)
par(mar=c(5,5,2,6), bg = "light blue")
plot(VU ~ Area, data = dados, main = "A importância de analisar todos os quantis",
     xlab = expression("Área"~(m^2)),
     ylab = expression("VU (R$/"~m^2~")"))
abline(fit, lty = 2, lwd = 2)
abline(fit1, col = "yellow")
abline(fit2, col = "red")
abline(fit3, col = "green")
abline(fit4, col = "blue")
abline(fit5, col = "darkred")
abline(fit6, col = "darkblue")
legend('topleft', xpd = TRUE, inset=c(1,0), bty = "n",
       legend=c(".10", ".25", ".50", ".75", ".90", "0.95", "mean"),
       col = c("yellow", "red", "green", "blue", "darkred", "darkblue", "black"), 
       lty = c(rep(1, 6), 2),
       lwd = c(rep(1,6), 2),
       title= expression(~tau), text.font=4, bg='lightblue')
© www.soinside.com 2019 - 2024. All rights reserved.