In this paper, we compare the accuracy of the Markov Chain Monte Carlo (MCMC) method and the Adjusted Density Method (ADM) in approximating Hierarchical Bayesian (HB) estimates in the context of small area estimation. We apply a three-level hierarchical model to poverty data from the 2005 American Community Survey and experiment with a flat improper prior. Both MCMC and […]

# Archives: Presentations

### Small Area Estimation for Repeated Survey and Positively Skewed Distributions

This paper focuses on small area estimation for repeated survey data with a positively skewed distribution. Demand is increasing for small area statistics that utilize cross-sectional and repeated measures data. For instance, the Indonesian National Socio-Economic Survey is conducted twice a year. In our study, we consider a repeated survey including the same census blocks at […]

### Corrected confidence intervals for a small area mean based on the weighted estimator with fixed weights under the Fay-Herriot model.

There is a growing demand for small area estimates for policy and decision making, local planning and fund distribution. The best known small area estimation model, the Fay-Herriot (FH) model, is considered here. There is a situation in mixed model estimation that the estimates of the variance component of the random effect […]

### Small Area Estimation with Linked Data

Due to advances in computing, government agencies can process administrative records and link them with sample survey and census records for statistical purposes in a fraction of the time and cost required for field data collection. The accessibility of different administrative data from different sources has brought new opportunities for statisticians to develop innovative small area […]

### Discussion

### Estimation of Design Mean Square Error in Small Area Estimation

Traditional MSE estimators in small area estimation account for all sources of variation, including in particular the distribution of the target population parameters (the random effects) under the assumed model. Often, however, users of the small area predictors prefer the MSE estimators to condition on the true unknown population values, such that they only account […]

### Discussion

### The use of the FH Model in entrepreneurial activity assessment

The paper presents attempts to use indirect estimation methodology (area-level time models and administrative data) to estimate basic economic information about small business at a low level of aggregation. Economic policy requires current knowledge of the extent of this phenomenon. Such information is provided mainly through sample surveys conducted by, among others, the Central […]

### A Modified Pearson’s $\chi^{2}$ Test with Application to Generalized Linear Mixed Model Diagnostics

We propose a modified version of Pearson’s $\chi^{2}$ test for goodness-of-fit that is applicable to generalized linear mixed model (GLMM) diagnostics. The proposed test is based on cell frequencies, which is natural for many cases of GLMM. The procedure is simple and does not involve the generalized inverse of a matrix, as was used in a […]

### Survey-weighted Unit-Level SAE

Official statistics surveys often aim to provide reliable estimates for nation-wide population figures. These surveys often make use of complex survey designs to improve efficiency of the national estimates. However, when considering regional estimates, the sample sizes within the regions may be too low to apply classical design-based estimators. In this case, the small area […]

### Small area estimation of regional diabetes prevalence from biased health insurance data

Medical studies suggest that assessing disease distributions on regional levels is very important for planning and providing healthcare programmes. Data from national health surveys are frequently used in this context to obtain regional prevalence estimates. But these surveys often lack sufficient local observations due to limited resources. Consequently, regional prevalence estimates can be unreliable because […]

### Prediction of category proportions under area-level compositional mixed models

Compositional data analysis deals with vectors, called compositions, with nonnegative elements representing proportions or counts of some partition of a given population that fulfil a size constraint. Many surveys have categorical variables that produce compositional data after calculating the direct weighted estimators of the domain totals or proportions of categories. For the case of a […]

### Some local diagnostics for the Fay-Herriot model

The increasing need for information leads to the production of survey estimates for domains (e.g., regions) of various sizes. Thus, within the same survey, domain sample sizes can range from a couple of units to more than a thousand units. As direct estimators (Horvitz-Thompson or calibration estimators) suffer from a lack of precision in small domains, […]

### A validation of 2011 population estimates for local authorities by ethnic groups in England using Generalised Structure Preserving Estimation (GSPREE) with multiple data sources

The Office for National Statistics is looking at using more administrative and survey data to produce typical census outputs, in an Administrative Data Census. However, survey data often have very small or null sample sizes within areas, and administrative data may not cover the entire population. The census will continue to be a detailed source […]

### Small Area specific estimators that borrow strength across areas and across time

In this work a Small Area-specific estimation approach that borrows strength across areas and across time is presented to obtain Labor Force Estimators by economic activity. Several small area model-based estimators are considered, which are derived from additive regression models based on auxiliary information, with and without random effects. Often, for a given area and […]

### Benchmark estimators for a small area mean under a one-fold nested regression model

The authors develop a number of small area estimation procedures using a unit level linear regression model and survey weights: these weights incorporate the auxiliary information at the sample level. In particular, they propose three ways to ensure that the You-Rao (2002), Prasad-Rao (1999) and EBLUP small area estimators add up to estimates over the […]

### Beta regression models for small area estimation of proportions

Linear mixed effects models have been popular in small area estimation problems for modeling survey data when the sample size in one or more areas is too small for reliable inference. However, when the data are restricted to a bounded interval, the linear model may be inappropriate, particularly if the data are near the boundary. […]

### Estimating wealth distributions using Statistics Canada’s small-area estimation tool

In recent years, there has been increased interest in data about the distribution of wealth. To address this need, Statistics Canada is building a series of annual tables integrating macro-level National Accounts data with micro-level survey data on wealth. This product adds a distributional component to Canada’s macroeconomic accounts, thereby giving a more complete picture […]

### Nested Error Regression Models with Missing Values and Non-ignorable Non-response.

In most small area estimation problems there is a small sample size in each small area or segment. When the data have a two-level hierarchical structure, the nested error regression model proposed by Battese et al. (1988) is a powerful tool for building a stable predictor. This model assumes that the sample is drawn at […]

### A two-level hybrid calibration technique for small area estimation

Calibration constitutes a flexible tool for design-based inference for finite populations. We introduce here a new calibration method we call two-level hybrid calibration. Hybrid calibration (Lehtonen and Veijanen 2015) combines some of the favorable properties of model-free calibration (Deville and Särndal 1992) and model calibration (Wu and Sitter 2001). Benefits of model-free calibration are the […]

### Small area estimation based on quantile regression: a Bayesian approach

Quantile and M-quantile regression methods have been applied to small area estimation in several papers (we can quote Chambers and Tzavidis (2006); Chambers et al. (2014) among those). The main idea is that of using a semi-parametric regression model for quantiles, thus avoiding parametric distributional assumptions on regression’s residuals and random effects. Chambers and […]

### Mapping of Arsenic Contamination in Ground Water: A New Hierarchical Bayesian Method

Arsenic (As) is a toxic metal commonly found in groundwater in many countries. Long-term exposure to arsenic in food and water has been cited as a major health hazard. The maximum level of arsenic considered safe, set by the World Health Organization (WHO), is 10 µg/L. This is just a guideline; many countries […]

### A Bayesian semi-parametric approach to small area estimation and forecasting: with application to estimating and forecasting mortality rates by country, age and sex.

Researchers and policymakers are often interested in estimating and forecasting rates cross-classified by several dimensions. We consider the case of simultaneously estimating and forecasting mortality rates, cross-classified by age, sex and country, for 40 countries in the Human Mortality Database. These rates have complicated interactions. For instance, age-sex profiles differ across countries, and are changing […]

### Bayesian Sample Size Determination for Planning Hierarchical Bayes Small Area Estimates

This paper devises a fully Bayesian sample size determination method for hierarchical model-based small area estimation with a decision risk approach. A new loss function specified around a desired maximum posterior variance target implements conventional official statistics criteria of estimator reliability (coefficient of variation of up to 20 per cent). This approach comes with an […]

### Design Based Approach for SAE with Gradient Boosted Models.

Weighted model estimates such as those from pseudo maximum likelihood (PML) are both model consistent and design consistent. Design-based jackknife replication methods are sound in theory and flexible in practice for variance calculation. These pave a path to applying machine learning in SAE, where a model likelihood function may not be available. A major advantage of machine-learner […]

### Applying the small area estimation approach to multi-dimensional child poverty in Nigeria.

In this paper the Small Area Estimation approach (Rao, 2003) is used to estimate and analyse multi-dimensional Child Poverty for Local Government Areas (LGA) in Nigeria. There are 774 LGAs (the political-administrative level below the 36 States comprising the Nigerian Federation). This is the first ever estimate of Child Poverty with such a detailed level of […]

### Investigating stability in confidence in policing: a Bayesian spatiotemporal small area estimation approach.

Improving understanding of public confidence at the local level will better enable the police to conduct proactive confidence interventions to meet the concerns of local communities. Neighbourhood level approaches to modelling public confidence in the police are hampered by the small number problem and the resulting instability in the estimates and uncertainty in the results. Furthermore, […]

### Concomitant Variable Mixture Models for Small Area Estimation: An Application to Estimating Regional ARPRs in Germany

A basic assumption of standard small area models is that the statistic of interest can be modelled through a mixed model with common fixed effects for all areas under study. When modelling poverty through a set of social indicators, it might, however, be more realistic to assume that the exploited relationship between response variable and […]

### Small Area Estimation of the Relative Median Poverty Gap

The reduction of poverty in Europe is a milestone in the Europe 2020 strategy. The most frequently used indicator in Europe, the At Risk of Poverty Rate (ARPR), measures the share of people below the poverty threshold but it has a serious shortcoming: it neglects how the left tail of the income distribution is shaped […]

### Supplemental Poverty Measure: A Comparison of Geographic Adjustments with Regional Price Parities vs. Median Rents from the American Community Survey

Poverty statistics are used in the United States to evaluate national economic well-being as well as to compare economic well-being across states and major urban areas. Poverty estimates using the official poverty measure are used in formulas to distribute millions of dollars of federal anti-poverty funds but are based on thresholds that do not take […]

### Spatially Structured Sparseness in Bayesian Spatial Health Modeling

Often geospatial disease outcomes are characterized by sparseness when viewed as a count distribution. This is sometimes called zero-inflation. Classically the models for zero inflation are of two types: zero class modelled as ‘structural’ or ‘Poisson’ or as hurdle models where the zeroes are treated completely separately from the truncated Poisson positive counts. Often the […]

### Covariate Adjustment and Ranking Methods to Identify Regions with High and Low Mortality Rates

Identifying regions with the highest and lowest mortality rates and producing the corresponding color-coded maps help epidemiologists identify promising areas for analytic etiological studies. Based on a two-stage Poisson–Gamma model with covariates, we use information on known risk factors, such as smoking prevalence, to adjust mortality rates and reveal residual variation in relative risks that […]

### Dirichlet Process Priors for Small Area Estimation and Disease Mapping

We illustrate some applications of Dirichlet Process Priors for modeling random effects in small area models. We also show some of their applications in disease mapping by using a Dirichlet process prior with a baseline CAR model. One advantage of such priors is that there is automatic clustering as well as tracking a multimodal posterior […]

### Estimation of small area means using area-level and unit-level covariates based on multiple surveys

Unit-level models are extensively used in small area estimation. These models incorporate both unit-level and area-level covariates to accurately estimate finite population means of small areas. To borrow information from the unit-level covariates, that are available only from the sampled units, we propose a multivariate adaptation of the nested error regression model. Information on the […]

### Small Area Estimation by Mass Imputation: Combining Information from Two Independent Surveys

Combining information from two independent surveys with similar measurements can be a promising area of research in small area estimation. To incorporate the survey-specific effect, we use a random effect model at the population level. The sampling design can be informative in the sense that the sample distribution can be different from that of […]

### Robust prediction for small area

Influential units occur frequently in surveys, especially in the context of business surveys that collect economic variables whose distributions are highly skewed. A unit is said to be influential when its inclusion or exclusion from the sample has an important impact on the magnitude of survey statistics. Robust small area prediction has received a lot […]

### Estimation for a unit level model with measurement errors

Mixed effects regression models are widely used in small area estimation for linking areas and borrowing strength across domains, but when the auxiliary information used in these models is measured with error, the resulting estimators can be strongly affected if the measurement error is ignored. Measurement error models are, in general, not identifiable and several methods […]

### A new variance component estimation for achieving multiple goals simultaneously

For the last several decades, area level models have played a critical role in the theory and practice of small area estimation. The implementation of an area level model does not require confidential micro data. Aggregate statistics are modeled and thus the chance of disclosing information about a given individual is minimal. Relatively easier accessibility of aggregate statistics […]

### The Data Privacy Problem: Computer Science, Statistics and Future Directions

Given a medical database, how does one allow access by medical researchers while preserving patient privacy? How about a similar dilemma in analysis in an employment discrimination legal case? Data privacy has been an area of active research from the 1980s to the present, in both the statistics and computer science communities. As in the machine learning case, computer scientists […]

### Synthetic Data Generation for Small Area Estimation with Application to Large-Scale Surveys

Small area statistics provide an important source of information used to study local trends related to social, health, and economic phenomena. However, most large-scale sample surveys, for which rigorous measures of these phenomena are collected, are not designed for purposes of producing reliable small area estimates. A further complication is that data disseminators are typically […]

### A unit-level multilevel model with survey weights and design effects for small area estimation of Health Insurance coverage using geocoded American Community Survey (ACS)

National demographic and health surveys have become routinely geocoded in federal statistical agencies in the United States. The geocoded surveys allow us to construct and fit appropriate unit-level multilevel models for small area estimation. We developed a unit-level multilevel regression and poststratification (MRP) approach for small area estimation with geocoded ACS data. The multilevel logistic model […]

### Assessing the quality of small area estimates for poverty rate in Poland using taxonomy analysis

Comprehensive and reliable assessment of the quality of estimates obtained using small area estimation methodology is one of the key challenges facing national statistical institutes. Indirect estimation theory provides many criteria for the statistical assessment of results and model diagnostics. They involve assessing relative estimation errors and relative bias, measures of the goodness of fit, […]

### Outliers in Small Area Estimations for Annual Survey of Public Employment & Payroll: Bayesian Approach

The presence of outliers in small area estimation (SAE) raises concerns in the design-based part of population parameter prediction due to the violation of model-based assumptions. Various techniques have been introduced in the literature to mitigate the effect of outliers in unit-level and area-level small area models. This paper […]

### Demographic Transition Estimating in Constructing New Cities By Using GIS Applications to Support Development Process in Egypt

Small area estimation is any of several statistical techniques involving the estimation of parameters for small sub-populations, generally used when the sub-population of interest is included in a larger survey. The term “small area” in this context generally refers to a small geographical area such as a district, zone or city. It may also refer […]

### Small area models with misclassified covariates

Modern small area estimation methods focus on mixed effects regression models that link the small areas and borrow strength from similar domains. However, when the auxiliary variables used in the models are measured with error, small area estimators that ignore such error may be worse than direct estimators ([1], [2]). In regression models, the presence […]

### Improving Small Area Estimates of Disability: Combining the American Community Survey with the Survey of Income and Program Participation

The Survey of Income and Program Participation (SIPP) is designed to make national level estimates of changes in income, eligibility for and participation in transfer programs, household and family composition, labor force behavior, and other associated events. Used cross-sectionally, the SIPP is the source for commonly accepted estimates of disability prevalence, having been cited in […]

### Bayesian non parametric multiresolution estimation for the American Community Survey

Bayesian hierarchical methods implemented for small area estimation focus on reducing the noise variation in published government official statistics by borrowing information among dependent response values. Even the most flexible models confine parameters defined at the finest scale to link to each data observation in a one-to-one construction. We propose a Bayesian multiresolution formulation that utilizes an ensemble of […]

### Small area estimation methods under cut-off sampling

The OECD defines cut-off sampling as a sampling procedure in which a predetermined threshold is established with all units in the population and all units at or below (above) the threshold are excluded from the possible selection in a sample. This sampling technique is typically used in business surveys, in which small firms are included […]

### Small Area Estimation in Case of Nonresponse: A Cautious Approach

In the context of small area estimation (SAE), nonresponse may seriously reduce the already small sample size. Accordingly, a joint consideration of both problems is especially challenging. For survey practitioners, it has been a common practice to use weighting and imputation to mitigate nonresponse. Both techniques achieve point-identifiability by imposing the assumption of missing at […]

### Small area estimation for grouped data

We are concerned with small area estimation for grouped data or frequency distribution. There are two fundamental models for model-based small area estimation: one is the Fay-Herriot model for area level data, the other is the nested error regression model for unit level data. Because it is difficult to access unit level data in many […]

### New Methods for Small Area Estimation with Linkage Uncertainty

In Official Statistics, interest in data integration has been growing, due to the need to extract information from different sources. However, the effects of these procedures on the validity of the resulting statistical analyses have been disregarded for a long time. In recent years, it has been largely recognized that linkage is not an […]

### Measurement Error in Small Area Estimation: Functional vs. Structural vs. Naive models

Small area estimation using area-level models can sometimes benefit from covariates that are observed subject to random errors. When it is possible to estimate the variances of these errors across small areas, then one can account for the uncertainty in such covariates using measurement error models (e.g., Ybarra and Lohr, 2008). For instance, estimates of […]

### A new bias calibration approach for robust inequality indices

Today the availability of rich sample surveys provides a ground for researchers and policy makers to pursue more ambitious objectives. This information, together with auxiliary data coming through administrative channels, is used for better prediction/estimation of social and economic indices, e.g. inequality or poverty measures, that can help to determine more precisely their […]

### Estimation of quantiles based on Fay-Herriot models

Central banks and politicians often use robust measures like quantiles in order to describe the distribution of income or wealth in a country. However, estimates on a disaggregated level are rarely reported due to small sample sizes and the resulting large variances. Small area estimation is one way to handle this issue but standard approaches are […]

### Small area estimation of poverty indicators using interval censored income data

Extreme poverty rates have been cut by more than half since 1990. While this is a remarkable achievement, it is still one of the main goals defined by the United Nations to eradicate extreme poverty by 2030. To fight poverty, it is essential to have knowledge about its spatial distribution. Small area methods enable the […]

### Domain Estimation of Survey Discontinuities

National Statistical Institutes (NSIs) conduct repeated sample surveys with the aim of analyzing change over time. Although NSIs try to maintain consistent survey design methodologies, modifications and redesigns of long-standing survey processes are sometimes necessary. Redesigning a survey can affect non-sampling errors and therefore can lead to systematic differences in survey estimates over time. These […]

### Constrained Empirical Bayes Estimation in Multiplicative Area-Level Models with Risk Analysis Under an Asymmetric Loss Function

Consider the problem of benchmarking small area estimates under multiplicative models with positive parameters where estimates are constrained to aggregate to direct estimates for the larger geographical areas. Constrained (hierarchical) empirical Bayes estimators of positive small area parameters under the conventional squared error loss function can take negative values. In this paper, we propose a […]

### Social media as a data source for official statistics; the Dutch Consumer Confidence Index

One way to use big data sources in the production of official statistics is to use them as auxiliary information in models for small area estimation procedures. Marchetti et al. (2015) used mobility data to predict poverty in a Fay-Herriot model that improves the effective sample size with sample information from other domains. Most […]

### A Comparison of Alternative Methods for Poverty Estimation in Developing Countries

Small area estimation (SAE) has been widely used as an indirect estimation technique for geographic profiling of poverty indicators. Three unit-level SAE techniques: the method of Elbers, Lanjouw, and Lanjouw (2003) also known as ELL or World Bank method, the Empirical Best Prediction (EBP) method of Molina and Rao (2010), and the M-Quantile (MQ) method […]

### Spatial boundary changes over time: small area estimation approach to maintain compatibility of data

Sample survey data are mainly collected based on fixed boundaries; however, countries such as South Africa experience frequent administrative boundary changes, causing difficulties in producing comparable statistics through time. Rindfuss et al. (2004) stated that unless a consistent geographical approach is taken with time-series data, it cannot be known whether changes in the relationships between […]

### Small area model diagnostics and validation with applications to the U.S. Voting Rights Act Section 203

In this talk we consider the dual problems of choosing between competing small area models and validating model assumptions in an area-level model. Many classes of small area models result in an estimate that is a convex combination of the direct and the synthetic estimate for a given area. Therefore, competing models may share the […]

### An anticipated variance approach to estimating sampling variances for small area estimation problems

Area-level small area estimation models typically use both the point estimates from a survey and their corresponding design-based sampling variances as inputs. While the point estimates themselves are carefully modeled, the estimated sampling variances are often assumed known in small area models. In practice, for areas with small effective sample sizes the accuracy of design-based […]

### Robust Empirical Bayes Small Area Estimation with Density Power Divergence

Empirical Bayes estimators are widely used to provide indirect and model-based estimates of means in small areas. The most common model is a two-stage normal hierarchical model called the Fay-Herriot model. However, due to the normality assumption, it might be highly influenced by the presence of outliers. In this talk, we propose a simple modification of […]

### Maximum likelihood estimation of odds ratios with application to prediction of deduplicated audience under marginal constraints

One of the key audience estimates in marketing and media research is the number of unique persons in a subpopulation that viewed a television network, program, or episode on either of two platforms: traditional television or digital media. The latter platform includes a personal computer (PC), mobile device, or electronic notebook. The viewing may […]

### Small area estimation for functional data applied to mean electricity consumption curves estimation

In the French electricity company EDF, there is a growing need to estimate the mean electricity consumption curves for small geographic areas such as regions, cities or districts. To that aim we use samples of thousands of individual consumption curves recorded by smart meters at a very fine temporal scale and collected according to a […]

### Zero-inflated Spatio-temporal Models for Small Areas

In this talk, our aim is to study geographical and temporal variability of disease incidence when spatio-temporal count data have excess zeros. To that end, we consider random effects in zero-inflated Poisson models to investigate geographical and temporal patterns of disease incidence. Spatio-temporal models that employ conditionally autoregressive smoothing across the spatial dimension and B-spline […]

### Model-based Small Area Estimation for Cancer Screening and Smoking Related Measures

National health surveys, such as the National Health Interview Survey (NHIS), the Behavioral Risk Factor Surveillance System (BRFSS), and the Tobacco Use Supplement to the Current Population Survey (TUS-CPS), have been used to collect data on cancer screening and smoking related measures in the U.S. noninstitutionalized population. These surveys are designed to produce reliable estimates […]

### Some SAE methods in Health and Medical Studies

In many problems of health and medical studies, the interest is primarily in individuals, or small groups of individuals. For example, such problems arise in personalized health service. Statistically, the quantities of interest can be expressed as mixed effects, and the statistical challenges can be associated with the prediction of mixed effects under a mixed-effects model. […]

### Quantifying and Mitigating Spatial Aggregation Error

The modifiable areal unit problem and the ecological fallacy are known problems that occur when modeling multiscale spatial processes. We investigate how these forms of spatial aggregation error can be mitigated and guide a regionalization over a spatial domain of interest. By “regionalization” we mean a specification of geographies that define the spatial support for […]

### Small Area Estimation for High-Dimensional Multivariate Spatio-Temporal Count Data

Small area estimation of count data has become a research topic of widespread interest due to the ever-increasing need to produce more precise estimates for undersampled/unsampled geographies. This problem is exacerbated when one acknowledges that many data sources also report related variables of interest that are referenced at different levels of spatial aggregation and […]

### A spatial conditional approach to modelling unemployment and poverty in the counties of Missouri

Efficient multivariate estimation for small areas requires that both the multivariate and the spatial nature of the dependence be recognised. However, building a dependence model for all possible combinations of two or more variables and their locations in a discretely indexed domain is not easy, since any covariance matrix that is derived from such a […]

### Small Area Estimation by Combining Information from Multiple Data Sources on Correlated Variables at Different Levels of Aggregation

Demand for small area estimates, which are useful for local policy evaluation and implementation, is ever increasing. Increasing concerns about privacy and confidentiality are preventing agencies from providing data at the desired level of geography. This paper develops procedures for combining information from multiple data sources that provide data at different levels of aggregation […]

### Fitting small area models under informative sampling design and nonignorable nonresponse

Two-stage sampling is frequently used in small areas. When the selection probabilities are related to the values of the response variable, even after conditioning on concomitant variables included in the population model, the sample design is defined as informative. This may result in selection bias. In addition to the effect of applying an informative sampling […]