Author: Christopher Wikle (Department of Statistics, University of Missouri)
The modifiable areal unit problem and the ecological fallacy are known problems that occur when modeling multiscale spatial processes. We investigate how these forms of spatial aggregation error can be mitigated and how they can guide a regionalization over a spatial domain of interest. By “regionalization” we mean a specification of geographies that define the spatial support for areal data. This topic has been studied vigorously by geographers, but has been given less attention by spatial statisticians. We therefore propose a criterion for spatial aggregation error (CAGE), which we minimize to obtain an optimal regionalization. To define CAGE we draw a connection between spatial aggregation error and a new multiscale representation of the truncated Karhunen-Loève (K-L) expansion. This relationship between CAGE and the multiscale truncated K-L expansion leads to illuminating theoretical developments, including connections between spatial aggregation error and squared prediction error, and a novel extension of so-called Obled-Creutin eigenfunctions. The effectiveness of our approach is demonstrated through the analysis of applications involving federal survey data.
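
For readers unfamiliar with the truncated K-L expansion mentioned above, the following is a minimal sketch of the standard, single-scale version on a discretized domain. It is not the multiscale representation or the CAGE criterion proposed here. The grid, the exponential covariance, its range parameter, the truncation level r, and the function name truncated_kl are all assumptions chosen for illustration.

```python
import numpy as np

def truncated_kl(cov, r):
    """Return the r largest eigenvalues of a symmetric covariance matrix
    and their eigenvectors (hypothetical helper, for illustration only)."""
    vals, vecs = np.linalg.eigh(cov)        # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1][:r]      # indices of the r largest
    return vals[order], vecs[:, order]

# Assumed setup: 50 locations on [0, 1] with an exponential covariance
# function and range parameter 0.2.
s = np.linspace(0.0, 1.0, 50)
cov = np.exp(-np.abs(s[:, None] - s[None, :]) / 0.2)

r = 10                                      # assumed truncation level
lam, phi = truncated_kl(cov, r)

# Rank-r approximation of the covariance implied by the truncated expansion.
cov_r = phi @ np.diag(lam) @ phi.T

# Total variance dropped by truncation: the sum of the discarded eigenvalues.
lost = np.trace(cov) - lam.sum()
```

In this discretized setting the variance lost to truncation equals the trace of cov - cov_r, which is one way to see why a truncated expansion carries an approximation error that can be quantified.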