A two-level hybrid calibration technique for small area estimation


Session:

Benchmarking

Authors:
Risto Lehtonen (University of Helsinki) (Speaker)
Ari Veijanen (Statistics Finland)
Abstract:

Calibration is a flexible tool for design-based inference for finite populations. We introduce a new calibration method that we call two-level hybrid calibration. Hybrid calibration (Lehtonen and Veijanen 2015) combines some of the favorable properties of model-free calibration (Deville and Särndal 1992) and model calibration (Wu and Sitter 2001). The benefits of model-free calibration are its ability to reproduce the known (published) official figures of the auxiliary variables (the internal benchmarking property) and the fact that only aggregate-level auxiliary data are needed. The calibrated weights do not depend on any study variable, so the same weights can be used for a set of different study variables. These properties are appreciated in official statistics, where model-free calibration is routinely used to produce reliable direct estimates of totals and means for population subgroups (domains) whose sample sizes are large enough. However, the method tends to fail in small area estimation because estimates become unstable when domain sample sizes are too small (e.g. Hidiroglou and Estevao 2016). From a modeling point of view, model-free calibration is best justified for continuous study variables under an implicit linear model.
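The benchmarking property of model-free calibration can be illustrated with a minimal sketch of linear calibration, the chi-square distance case of Deville and Särndal (1992). The function name `linear_calibration`, the simulated population and the simple random sampling design are assumptions made here for illustration only; they are not taken from the paper.

```python
import numpy as np

def linear_calibration(d, X, totals):
    """Linear (chi-square distance) calibration weights.

    d      : (n,) design weights of the sampled units
    X      : (n, p) auxiliary values of the sampled units
    totals : (p,) known population totals of the auxiliary variables
    """
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    # Weighted cross-product matrix: sum_k d_k x_k x_k'
    M = X.T @ (d[:, None] * X)
    # Gap between the known totals and their design-weighted estimates
    gap = np.asarray(totals, dtype=float) - X.T @ d
    lam = np.linalg.solve(M, gap)
    # Calibrated weights: w_k = d_k (1 + x_k' lambda)
    return d * (1.0 + X @ lam)

# Hypothetical population with one continuous auxiliary variable
rng = np.random.default_rng(0)
N, n = 1000, 50
x_pop = rng.gamma(shape=2.0, scale=10.0, size=N)
X_pop = np.column_stack([np.ones(N), x_pop])    # the intercept reproduces N
known_totals = X_pop.sum(axis=0)                # only aggregates are needed

sample = rng.choice(N, size=n, replace=False)   # simple random sampling
d = np.full(n, N / n)                           # design weights
w = linear_calibration(d, X_pop[sample], known_totals)

# The calibrated weights reproduce the known totals exactly
print(X_pop[sample].T @ w, known_totals)
```

The weights are computed without reference to any study variable, so the same `w` can be applied to every study variable observed in the sample. Linear calibration does not constrain the weights to be positive.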

Model calibration allows flexible modeling of various types of study variables, including binary, polytomous and count variables, and concentrates on improving efficiency through careful model choice. In model calibration, the calibration targets are the population totals of the model predictions, computed for the domains, rather than the population totals of the auxiliary variables as in model-free calibration. Thus, the built-in benchmarking property of model-free calibration is lost in model calibration. In addition, a separate set of weights must be derived for each study variable. Montanari and Ranalli (2009) presented a multiple model calibration method to overcome these restrictions.
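In symbols, the contrast between the two approaches can be sketched as follows. The notation is an assumption made here for illustration and does not appear in the abstract: $s_d$ and $U_d$ denote the sample and the population in domain $d$, $N_d$ the domain size, $w_k$ the calibrated weights, $\mathbf{x}_k$ the auxiliary vector and $\hat{y}_k$ the prediction from the assisting model.

```latex
\begin{align*}
\text{Model-free calibration:} \quad
  & \sum_{k \in s_d} w_k \mathbf{x}_k = \sum_{k \in U_d} \mathbf{x}_k \\
\text{Model calibration:} \quad
  & \sum_{k \in s_d} w_k \begin{pmatrix} 1 \\ \hat{y}_k \end{pmatrix}
    = \begin{pmatrix} N_d \\ \sum_{k \in U_d} \hat{y}_k \end{pmatrix}
\end{align*}
```

The right-hand side of the second equation requires the auxiliary variables for every population unit, so that the predictions can be computed and summed, whereas the first equation needs only the aggregate totals. Because $\hat{y}_k$ comes from a model for one particular study variable, the resulting weights are specific to that variable.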

Lehtonen and Veijanen (2016) developed new variants of model calibration for small area estimation, including semi-direct and semi-indirect model calibration estimators assisted by linear and logistic mixed models. They showed that model calibration is more efficient than model-free calibration, particularly for small domains, where direct model-free calibration estimates become unstable. Hybrid calibration incorporates the benchmarking property of model-free calibration into the calibration procedure while retaining the flexibility in model choice and the efficiency gain of model calibration. In hybrid calibration, the calibration vector contains a set of auxiliary variables (the model-free part) in addition to predictions from the model (the model-assisted part). In practical situations (e.g. Lehtonen and Veijanen 2015), hybrid calibration typically outperforms model-free calibration in efficiency but does not necessarily outperform model calibration. This is the price to be paid for the benchmarking property.

When the model-free part of the calibration procedure causes instability in the domain estimates, which can happen for small domains, the new two-level hybrid calibration method tends to reduce the instability and improve efficiency. In this method, model calibration operates at the original domain level (e.g. NUTS4), but the model-free calibration part is defined at a higher hierarchical level (e.g. NUTS3).
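One way to picture the two-level idea is a calibration vector with a (1, prediction) block for each lower-level domain and a single auxiliary block shared by the whole higher-level region. The sketch below is only an illustration under stated assumptions and is not the derivation given in the paper: it uses linear calibration, a simulated region with three domains, simple random sampling, and predictions from a logistic model whose coefficients are simply assumed rather than fitted. All function and variable names are hypothetical.

```python
import numpy as np

def linear_calibration(d, Z, totals):
    """Linear calibration: w_k = d_k (1 + z_k' lambda)."""
    Z = np.asarray(Z, dtype=float)
    M = Z.T @ (d[:, None] * Z)
    lam = np.linalg.solve(M, totals - Z.T @ d)
    return d * (1.0 + Z @ lam)

def two_level_vector(domain, y_hat, x, n_domains):
    """Per-domain (1, y_hat) blocks plus a region-level auxiliary column."""
    ind = np.eye(n_domains)[domain]            # (n, D) domain indicators
    return np.column_stack([ind, ind * y_hat[:, None], x])

# Hypothetical region (higher level) split into D domains (lower level)
rng = np.random.default_rng(1)
N, n, D = 3000, 150, 3
domain_pop = rng.integers(0, D, size=N)
x_pop = rng.normal(size=N)
# Predictions from an assisting logistic model; coefficients are assumed
eta = -1.5 + 0.8 * x_pop + 0.3 * domain_pop
y_hat_pop = 1.0 / (1.0 + np.exp(-eta))
y_pop = rng.binomial(1, y_hat_pop)             # simulated binary study variable

Z_pop = two_level_vector(domain_pop, y_hat_pop, x_pop, D)
totals = Z_pop.sum(axis=0)   # N_d, domain totals of y_hat, region total of x

sample = rng.choice(N, size=n, replace=False)  # simple random sampling
d = np.full(n, N / n)
w = linear_calibration(d, Z_pop[sample], totals)

# Domain-level rate estimates: weighted sample totals of y divided by N_d
N_d = totals[:D]
rate_hat = np.bincount(domain_pop[sample], weights=w * y_pop[sample],
                       minlength=D) / N_d
print(rate_hat)
```

In this sketch the domain sizes and the totals of the predictions are reproduced within each domain, while the auxiliary total is reproduced only for the region as a whole, so the model-free constraint is supported by the larger regional sample rather than by each small domain sample.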

In the paper, we derive the two-level hybrid calibration equations and compare the statistical properties (bias, accuracy) of the method with those of other methods in design-based simulation experiments using real data from the statistical registers of Statistics Finland. The application is the estimation of poverty rates for regional domains. We use logistic mixed models as assisting models in the model-assisted calibration procedures.

References:
Deville J.-C. and Särndal C.-E. (1992) Calibration estimators in survey sampling. JASA 87, 376–382.
Hidiroglou M.A. and Estevao V.M. (2016) A comparison of small area and calibration estimators via simulation. Joint Issue of Statistics in Transition and Survey Methodology 17, 133–154.
Lehtonen R. and Veijanen A. (2015) Small area estimation by calibration methods. WSC of the ISI, Rio de Janeiro.
Lehtonen R. and Veijanen A. (2016) Model-assisted methods for small area estimation of poverty indicators. In: Pratesi M. (Ed.) Analysis of Poverty Data by Small Area Estimation. Chichester: Wiley, 109–127.
Montanari G.E. and Ranalli M.G. (2009) Multiple and ridge model calibration. Proceedings of Workshop on Calibration and Estimation in Surveys, Statistics Canada.
Wu C. and Sitter R.R. (2001) A model-calibration approach to using complete auxiliary information from survey data. JASA 96, 185–193.

Keywords: Model-free calibration, Model calibration, Assisting model, Poverty, Simulation