An anticipated variance approach to estimating sampling variances for small area estimation problems


Session:

Accepted presentations cancelled by authors

Author: Carolina Franco (Center for Statistical Research and Methodology (CSRM), U.S. Census Bureau)
Abstract:

Area-level small area estimation models typically use both the point estimates from a survey and their corresponding design-based sampling variances as inputs. While the point estimates themselves are carefully modeled, the estimated sampling variances are often assumed known in small area models. In practice, for areas with small effective sample sizes, the accuracy of design-based estimates of variance is as tenuous as that of the point estimates themselves. Bell (2008) shows how errors in the sampling variances used in small area models can affect the small area predictions and result in biases in the reported mean squared error. To address this, some authors have proposed modifications to the mean squared error estimates to reflect the added uncertainty due to estimation of the sampling variance (Wang and Fuller 2003, Rivest and Vandal 2003). Others have attempted to jointly model the small area estimates and the sampling variances (You and Chapman 2006, Sugasawa et al. 2016). Yet others have tried to obtain smoother estimates of the sampling variance through the use of Generalized Variance Functions (GVFs; e.g., Valliant 1987, Otto and Bell 1995, Hawala and Lahiri 2010, Maples 2016). In this paper, we attempt to improve sampling variance estimates by adopting the anticipated variance approach (Isaki and Fuller 1982). The idea is to take expectations with respect to both the sampling design and the superpopulation model when computing the variance of the design-based estimator. We adopt simple modeling assumptions for the superpopulation model that also incorporate covariates from administrative records, and we derive expressions for the small area variances that can be estimated from the unit-level data and the available area-level covariates. These expressions involve parameters common to all areas that can be estimated from the entire dataset, resulting in synthetic estimates of the sampling variances.
We compare the resulting estimates with the design-based variance estimates and with estimates derived from GVFs, using data from the American Community Survey (ACS) and administrative records to produce sampling variance estimates that could be used for the estimation of school-aged children in poverty by the Small Area Income and Poverty Estimates (SAIPE) Program. We study the impact of using these variance estimates in simple small area models.
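
As a rough illustration of the synthetic-estimation idea (and not the method developed in the paper), the sketch below assumes simple random sampling within each area, a negligible sampling fraction, and a superpopulation model whose unit-level errors are independent with a common variance sigma^2. Under those assumptions the anticipated variance of an area sample mean reduces to sigma^2 / n_i, so a single pooled estimate of sigma^2 gives a smoothed variance for every area. The function name and the simulated data are hypothetical, and covariates are omitted for brevity.

```python
import numpy as np


def direct_and_synthetic_variances(y_by_area):
    """Return (direct, synthetic) variance estimates of each area's sample mean.

    Illustrative assumptions only: simple random sampling within areas,
    negligible sampling fractions, and a common unit-level variance
    sigma^2 shared by all areas.
    """
    n = np.array([len(y) for y in y_by_area], dtype=float)
    # Within-area sample variances (unbiased, ddof=1).
    s2 = np.array([np.var(y, ddof=1) for y in y_by_area])
    # Direct estimate: uses only the area's own sample, so it is noisy when n_i is small.
    direct = s2 / n
    # Pooled estimate of sigma^2 from all areas, weighted by degrees of freedom.
    pooled_sigma2 = np.sum((n - 1.0) * s2) / np.sum(n - 1.0)
    # Synthetic estimate: the common parameter divided by the area sample size.
    synthetic = pooled_sigma2 / n
    return direct, synthetic


# Hypothetical simulated data: four areas with very different sample sizes.
rng = np.random.default_rng(0)
sizes = [5, 8, 50, 200]
samples = [rng.normal(loc=10.0, scale=2.0, size=m) for m in sizes]

direct, synthetic = direct_and_synthetic_variances(samples)
for m, d, s in zip(sizes, direct, synthetic):
    print(f"n={m:4d}  direct={d:.4f}  synthetic={s:.4f}")
```

In the smallest areas the direct estimate rests on only a handful of observations, whereas the synthetic estimate borrows strength from the entire dataset. If the common-variance assumption fails, the synthetic values would be biased for the affected areas.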

References:
Bell, W. R. (2008). Examining sensitivity of small area inferences to uncertainty about sampling error variances. In JSM Proceedings, Survey Research Methods Section. Alexandria, VA: American Statistical Association, 327–334.
Hawala, S. and Lahiri, P. (2010). Variance modeling in the U.S. Small Area Income and Poverty Estimates Program. In JSM Proceedings, Survey Research Methods Section. Alexandria, VA: American Statistical Association.
Isaki, C. T. and Fuller, W. A. (1982). Survey design under the regression superpopulation model. Journal of the American Statistical Association, 77, 377, 89–96.
Maples, J. (2016). Estimating design effects in small areas/domains through aggregation. In JSM Proceedings, Survey Research Methods Section. Alexandria, VA: American Statistical Association.
Otto, M. C. and Bell, W. R. (1995). Sampling error modeling of poverty and income statistics for states. In JSM Proceedings, Government Statistics Section. Alexandria, VA: American Statistical Association, 160–165.
Rivest, L. and Vandal, N. (2003). Mean squared error estimation for small areas when the small area variances are estimated. In Proceedings of the International Conference on Recent Advances in Survey Sampling, ed. J.N.K. Rao.
Sugasawa, S., Tamae, H., and Kubokawa, T. (2016). Bayesian estimators for small area models shrinking both means and variances. Scandinavian Journal of Statistics, 44, 1, 150–167.
Valliant, R. (1987). Generalized variance functions in stratified two-stage sampling. Journal of the American Statistical Association, 82, 499–508.
Wang, J. and Fuller, W. A. (2003). The mean squared error of small area predictors constructed with estimated error variances. Journal of the American Statistical Association, 98, 716–723.
You, Y. and Chapman, B. (2006). Small area estimation using area level models and estimated sampling variances. Survey Methodology, 32, 97–103.