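Is there an existing function that creates confidence intervals from a svyby object for proportions (in my case, a crosstab of a binary item in the survey package)? I often compare proportions across groups, and it would be very handy to have a function that extracts confidence intervals using the survey function svyciprop rather than confint. The example below shows what I am trying to achieve.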
Load the data
library(survey)
library(weights)
data(api)
apiclus1$both <- dummify(apiclus1$both)[, 1]  # create dummy variable
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)
Create a svyby object that compares the proportion of the variable "both" across stype
b<-svyby(~both, ~stype, dclus1, svymean)
confint(b)    # this works, but svyciprop is preferable in other cases, especially when the proportion is close to 0 or 1
svyciprop(b)  # svyciprop() needs a formula and a design object, so each level has to be specified by hand
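To illustrate why svyciprop can be preferable for a proportion near 0 or 1, here is a minimal sketch on the same api data. The choice of indicator, I(ell == 0), and of method = "logit" are assumptions made for illustration and are not part of the question.

```r
library(survey)
data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Symmetric (Wald-type) interval around the estimated proportions.
# Nothing in this calculation prevents a limit from falling outside [0, 1].
wald <- confint(svymean(~I(ell == 0), dclus1))
wald

# svyciprop() builds the interval on the logit scale and transforms it back,
# so both limits stay inside [0, 1].
logit_ci <- svyciprop(~I(ell == 0), dclus1, method = "logit")
logit_ci
attr(logit_ci, "ci")
```

The other method values accepted by svyciprop (for example "likelihood" or "beta") can be swapped in the same way.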
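Is it possible to write a function (e.g. byCI(b, method="likelihood")) that does the same as confint(b) but uses svyciprop? It would basically have to go through each level of the svyby object and create a confidence interval. My attempts so far have been unsuccessful.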
There may be another way to approach this, but I like using svyby() because it is quick and intuitive.
svyby() has a vartype= argument that specifies how you want the sampling uncertainty reported. Use vartype="ci" to get confidence intervals, e.g.
svyby(~I(ell>0),~stype,design=dclus1, svyciprop,vartype="ci",method="beta")
It is easy to check that this gives the same result as doing each level by hand, e.g.,
confint(svyciprop(~I(ell>0), design=subset(dclus1,stype=="E"),method="beta"))
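The check described above can be extended to every level of stype. This is only a sketch: the object names by_ci and manual_ci are made up for illustration, and the two results are printed for comparison by eye rather than tested for exact equality.

```r
library(survey)
data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# One call: svyby() applies svyciprop() within each level of stype
by_ci <- svyby(~I(ell > 0), ~stype, design = dclus1,
               svyciprop, vartype = "ci", method = "beta")

# By hand: subset the design to each level and call svyciprop() directly
manual_ci <- t(sapply(levels(apiclus1$stype), function(lev) {
  confint(svyciprop(~I(ell > 0),
                    design = subset(dclus1, stype == lev),
                    method = "beta"))
}))

by_ci      # one row per school type, with the interval limits as columns
manual_ci  # one row per school type: lower limit, upper limit
```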
Interesting. These two commands should not give the same result, and the first one should probably throw an error or a warning:
svyby( ~both , ~stype , dclus1 , svyciprop , method = 'likelihood' )
svyby( ~both , ~stype , dclus1 , svymean )
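You might want to alert Dr. Lumley to this issue. The code around line 80 of surveyby.R could probably be modified slightly so that svyciprop also works inside svyby. But I may be overlooking something (and he may already have noted it somewhere in the documentation), so be sure to read everything carefully before contacting him.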
In any case, here is a temporary workaround that might solve your problem
# create a svyby-like function specific for svyciprop
svyciby <-
    function( formula , by , design , method = 'likelihood' , df = degf( design ) ){

        # steal a bunch of code from the survey package's source
        # stored in surveyby.R..
        byfactors <- model.frame( by , model.frame( design ) , na.action = na.pass )
        byfactor <- do.call( "interaction" , byfactors )
        uniquelevels <- sort( unique( byfactor ) )
        uniques <- match( uniquelevels , byfactor )

        # note: this may not work for all types..
        # i only tested it out on your example.

        # run the svyciprop() function on every unique combo
        all.cis <-
            lapply(
                uniques ,
                function( i ){
                    svyciprop(
                        formula ,
                        design[ byfactor %in% byfactor[i] ] ,
                        method = method ,
                        df = df
                    )
                }
            )

        # transpose the svyciprop confidence intervals
        t.cis <- t( sapply( all.cis , attr , "ci" ) )

        # tack on the names
        dimnames( t.cis )[[1]] <- as.character( sort( unique( byfactor ) ) )

        # return the results
        t.cis
    }

# test out the results
svyciby( ~both , ~stype , dclus1 , method = 'likelihood' )

# pretty close to your b, but not exact (as expected)
confint(b)

# and this one does match (as it should)
svyciby( ~both , ~stype , dclus1 , method = 'mean' , df = Inf )
Unfortunately, I could not reproduce the suggested answer.
However, I wrote a custom function that does the job.
# create a svyby-like function specific for svyciprop
svyciprop_by <- function(x, design, by, method) {
  # extract the levels in by
  by_var <- all.vars(by)[1]
  by_data <- model.frame(by, data = design$variables)
  by_levels <- unique(by_data[[by_var]])

  # run the svyciprop() function on each level in by
  calculate_ci <- function(stratum) {
    subset_design <- subset(design,
                            design$variables[[by_var]] == stratum)
    result <- svyciprop(x,
                        design = subset_design,
                        method = method,
                        df = degf(design))
    return(attr(result, "ci"))
  }

  # tabulate the result
  ci_results <- lapply(by_levels, calculate_ci)
  results <- data.frame(subset = by_levels,
                        ci = do.call(rbind, ci_results))
  return(results)
}

# example
svyciprop_by(x = ~both, design = dclus1,
             by = ~stype, method = "xl")