
Statistics Colloquium by Alessandra Miranda '20, Wednesday, February 26

Wed, February 26th, 2020
1:10 pm - 1:50 pm


A Successive Random Sampling Approach to Clustering Global Climate Data and Other Large Datasets by Alessandra Miranda ’20, Statistics Colloquium, Wednesday, February 26, 1:10 – 1:50 pm, Stetson Court Classroom 105

Abstract: At present, there is no definitive set of quantitatively defined climate types for Earth. The Köppen-Geiger Classification (KGC) provides qualitative guidelines for grouping the Earth into either five or thirteen distinct climate types, but this system lacks precision and does not lend itself to modeling. Netzel and Stepinski (2015) propose using partition-based clustering algorithms to define climate types quantitatively from localized temperature and precipitation data. This would allow for more robust analyses of climate change, as observed through changes in climate type representation and climate type boundaries over time. Such work, however, requires the analysis of large, high-resolution datasets, which is time-intensive, memory-intensive, and not easily parallelizable. I propose an alternative implementation of the k-means clustering algorithm in which we cluster successive random samples from the dataset and limit the number of clustering iterations. I will outline the steps of this process and present preliminary results from testing the algorithm on the WorldClim global climate dataset, using a range of sample sizes and numbers of samples. I will conclude by assessing the algorithm's performance on simulated data.
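For readers unfamiliar with the idea, the following is a minimal illustrative sketch of k-means run on successive random samples with a cap on iterations. It is not the speaker's implementation: the function names (`lloyd_steps`, `successive_sample_kmeans`, `assign`), the parameters (`sample_size`, `n_samples`, `max_iter`), and the choice to carry centroids forward from one sample to the next are all assumptions made for illustration.

```python
import numpy as np


def lloyd_steps(points, centroids, max_iter):
    """Run at most max_iter standard k-means (Lloyd) iterations."""
    for _ in range(max_iter):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(len(centroids)):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving before the cap was reached
        centroids = new_centroids
    return centroids


def successive_sample_kmeans(data, k, sample_size, n_samples, max_iter=5, seed=0):
    """Cluster n_samples random subsets of data, one after another.

    Assumption: each sample starts from the centroids left by the previous
    sample, so only sample_size rows are clustered at any one time.
    """
    rng = np.random.default_rng(seed)
    # Assumption: initial centroids are k distinct rows chosen at random.
    centroids = data[rng.choice(len(data), size=k, replace=False)].astype(float)
    for _ in range(n_samples):
        idx = rng.choice(len(data), size=sample_size, replace=False)
        centroids = lloyd_steps(data[idx], centroids, max_iter)
    return centroids


def assign(data, centroids):
    """Label every row of data with the index of its nearest centroid."""
    dists = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)
```

In this sketch, each row of `data` would hold the features for one location (for example, temperature and precipitation values), and `assign` would be called once at the end to label the full dataset. Memory use during clustering scales with `sample_size` rather than with the size of the full dataset, which is one plausible reading of the motivation given in the abstract.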
