Finding Spatio-Temporal Patterns in Earth Science Data

This paper presents preliminary work on using data mining techniques to find interesting spatio-temporal patterns in Earth science data. The data consists of time series measurements for various Earth science and climate variables (e.g., soil moisture, temperature, and precipitation), along with additional data from existing ecosystem models (e.g., Net Primary Production). The ecological patterns of interest include associations, clusters, predictive models, and trends. We discuss some of the challenges involved in preprocessing and analyzing the data, and consider techniques for handling some of the spatio-temporal issues. Earth science data has strong seasonal components that need to be removed prior to pattern analysis, because Earth scientists are primarily interested in patterns that represent deviations from normal seasonal variation, such as anomalous climate events (e.g., El Niño) or trends (e.g., global warming). We compare several alternatives (including singular value decomposition (SVD), discrete Fourier transform (DFT), "monthly" Z score, and moving average) with respect to their effectiveness in removing seasonality. We then describe the different kinds of association analysis that can be performed on such data. Our current technique for finding associations transforms the time series into transactions and then applies existing algorithms traditionally used for market-basket data. Some of these transformations lead to dense columns in the transaction matrices, causing exponential growth in the computing requirements. Furthermore, no single interestingness measure accurately reflects the quality of the derived patterns. Indeed, we argue that existing approaches for mining association rules and sequential patterns may not be able to capture all the interesting patterns, due to the spatio-temporal nature of this data.
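As an illustration of one of the seasonality-removal alternatives named above, the following is a minimal sketch of a "monthly" Z score. It assumes the common reading of the term: each value is standardized against the mean and standard deviation of all values that fall in the same calendar month. The function name `monthly_z_score`, the `period` parameter, and the toy series are hypothetical and are not taken from this work.

```python
from statistics import mean, pstdev


def monthly_z_score(series, period=12):
    """Standardize each value against the other values that share its
    position in the seasonal cycle (for monthly data, the same month).

    Assumption: the series starts at the first step of a cycle and is
    sampled once per step, so index i belongs to step i % period.
    """
    z = [0.0] * len(series)
    for step in range(period):
        idx = range(step, len(series), period)
        values = [series[i] for i in idx]
        if not values:
            continue
        mu = mean(values)
        sigma = pstdev(values)  # population standard deviation
        for i in idx:
            # A constant step carries no anomaly signal, so map it to 0.
            z[i] = (series[i] - mu) / sigma if sigma > 0 else 0.0
    return z


# Toy data: a repeating 4-step "year" observed three times,
# with one injected anomaly in the second cycle.
seasonal_cycle = [10.0, 20.0, 30.0, 20.0]
series = seasonal_cycle * 3
series[5] += 6.0

z = monthly_z_score(series, period=4)
```

In this toy series the regular seasonal swing from 10 to 30 disappears entirely, and only the steps that share a cycle position with the injected anomaly receive nonzero scores. That is the behavior wanted from a seasonality filter whose purpose is to expose deviations from normal seasonal variation.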