Prediction of category proportions under area-level compositional mixed models


Miscellaneous (Nonignorability, Measurement Error, Errors in Sampling Variance, Multiple-category Outcome)

María Dolores Esteban (Universidad Miguel Hernández de Elche, Spain)
María José Lombardía Cortiña (Departamento de Matemática, Universidade da Coruña, Spain)
Esther López Vizcaíno (Instituto Galego de Estatística, Spain)
Domingo Morales (Universidad Miguel Hernández de Elche, Spain) (Speaker)

Compositional data analysis deals with vectors, called compositions, whose nonnegative elements represent proportions or counts of the parts of a partition of a given population and satisfy a size constraint. Many surveys include categorical variables that yield compositional data once the direct weighted estimators of the domain totals or category proportions are calculated. For a classification variable with four categories, a trivariate Fay-Herriot model is fitted to the additive log-ratio transformation of the compositions, and the corresponding EBLUPs are derived. The model-based estimators of the proportions are obtained by applying the inverse additive logistic transformation to the vectors of EBLUPs. Unlike multinomial mixed models, which permit only negative correlations, this approach allows both positive and negative correlations between the direct estimators of the category proportions. The mean squared errors of the proposed predictors of category proportions are estimated by parametric bootstrap. The talk presents the required mathematical derivations and the results of three simulation experiments. Simulation 1 investigates the behaviour of the REML estimators of the model parameters (the regression parameters and six variance components). Simulation 2 studies the performance of the proposed predictors of category proportions and compares them with the predictors derived under multinomial mixed models. Simulation 3 analyses the implemented parametric bootstrap procedure. Finally, an application to data from a survey on health conditions in the region of Galicia (north-west Spain) is given.
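
As an illustration of the two transformations named in the abstract, the following minimal Python sketch maps a four-category composition to a three-dimensional vector with the additive log-ratio transformation and maps it back with the inverse additive logistic transformation. The function names `alr` and `alr_inv`, the use of the last category as the reference part, and the example proportions are assumptions made for illustration only; the abstract does not state which category the authors take as reference, and the sketch does not cover the Fay-Herriot fitting or the bootstrap.

```python
import numpy as np

def alr(p):
    """Additive log-ratio transform of a composition.

    Assumption: the last component is the reference category.
    Maps a D-part composition to a vector in R^(D-1).
    """
    p = np.asarray(p, dtype=float)
    return np.log(p[..., :-1] / p[..., -1:])

def alr_inv(y):
    """Inverse additive logistic transform.

    Maps a vector in R^(D-1) back to a D-part composition summing to 1.
    """
    y = np.asarray(y, dtype=float)
    e = np.exp(y)
    denom = 1.0 + e.sum(axis=-1, keepdims=True)
    return np.concatenate([e / denom, 1.0 / denom], axis=-1)

# Hypothetical direct estimates of four category proportions in one domain.
p = np.array([0.1, 0.2, 0.3, 0.4])
y = alr(p)           # three log-ratios, the scale on which a trivariate model would be fitted
p_back = alr_inv(y)  # back-transformed proportions
```

In the setting described above, the inverse transformation would be applied to the vector of EBLUPs rather than to `y` itself; because the back-transformed components are positive and sum to one by construction, the resulting predictors are always valid proportions.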