Quantile estimates for subgroups when some subgroups have only one case, using the srvyr and survey R packages

Question

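I am trying to estimate the 25th percentile of a continuous variable for a series of subgroups, where the data come from a survey that uses sampling weights. I am doing this in R with the survey and srvyr packages.

The problem I face is that in a few cases a subgroup has only a single observation, so a 25th percentile is meaningless. That would be fine in itself, but it raises an error that prevents the percentile from being calculated for the subgroups that do have enough observations.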

Error in approxfun(cum.w, xx[oo], method = method, f = f, yleft = min(xx),  : 
  need at least two non-NA values to interpolate

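The code runs when the offending groups are removed, but I had to identify them manually, which is far from ideal.

Is there a way to get the same result, but with the single-observation groups returning NA or just that observation's value instead of an error? Alternatively, is there a neat way to automatically exclude such groups from the calculation?

Below is a reproducible example, using the apistrat dataset from the survey package, that illustrates my problem.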

library(dplyr)
library(survey)
library(srvyr)
data(api)

#25th percentile of api00 by school type and whether school is year round  or not
apistrat %>% 
  as_survey(strata = stype, weights = pw) %>%
  group_by(yr.rnd, stype, .drop=TRUE) %>%
  summarise(survey_quantile(api00, 0.25, na.rm = TRUE))

#Error in approxfun(cum.w, xx[oo], method = method, f = f, yleft = min(xx),  : 
#need at least two non-NA values to interpolate

apistrat %>% group_by(yr.rnd, stype) %>% tally() %>% filter(n==1)
#one group out of 6 has only a single api00 observation and therefore a quantile can't be interpolated

#Removing that one group means the code can now run as intended
apistrat %>% 
  as_survey(strata = stype, weights = pw) %>%
  filter(!(yr.rnd=="Yes"&stype=="H")) %>%
  group_by(yr.rnd, stype, .drop=TRUE) %>%
  summarise(survey_quantile(api00, 0.25, na.rm = TRUE))

#Get the same error if you do it the 'survey' package way
dstrat <- svydesign(id=~1,strata=~stype,data=apistrat, fpc=~fpc)
svyby(~api99, ~stype+yr.rnd, dstrat, svyquantile, quantiles=0.25)

Answer 1

One workaround is to wrap the call to svyquantile() in tryCatch():

> svyq<-function( ...){tryCatch(svyquantile(...), error=function(e) matrix(NA,1,1))}
> svyby(~api99, ~stype+yr.rnd, dstrat, svyq, quantiles=0.25,keep.var=FALSE,na.rm=TRUE)
      stype yr.rnd statistic
E.No      E     No    560.50
H.No      H     No    532.75
M.No      M     No    509.00
E.Yes     E    Yes    456.00
H.Yes     H    Yes        NA
M.Yes     M    Yes    436.00
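
With quantiles and svyby, you need to be explicit about whether you want standard errors; the code above does not request them. If you do want standard errors, the error= branch of the tryCatch() needs to return an actual svyquantile object with NAs in it.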
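
The question also asks whether single-observation groups can be excluded automatically on the srvyr side. A minimal sketch, assuming the same apistrat data: count the non-missing api00 values per group before building the design, then filter on that count. The column name n_obs, the threshold of more than one observation, and the result name q25_by_group are illustrative choices, not part of either package.

```r
library(dplyr)
library(survey)
library(srvyr)
data(api)

q25_by_group <- apistrat %>%
  # Unweighted count of usable api00 values in each subgroup
  group_by(yr.rnd, stype) %>%
  mutate(n_obs = sum(!is.na(api00))) %>%
  ungroup() %>%
  as_survey(strata = stype, weights = pw) %>%
  # Drop subgroups that cannot support an interpolated quantile
  filter(n_obs > 1) %>%
  group_by(yr.rnd, stype) %>%
  summarise(q25 = survey_quantile(api00, 0.25, na.rm = TRUE))

q25_by_group
```

This drops the singleton groups from the result rather than reporting NA for them. If the NA rows are wanted, the excluded groups could be joined back on afterwards.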
