Murtagh, Fionn (2009) The Remarkable Simplicity of Very High Dimensional Data: Application of Model-Based Clustering. Journal of Classification, 26 (3).
Full text access: Open
An ultrametric topology formalizes the notion of hierarchical structure. An ultrametric embedding, referred to here as ultrametricity, is implied by a hierarchical embedding. Such hierarchical structure can be global in the data set, or local. By quantifying extent or degree of ultrametricity in a data set, we show that ultrametricity becomes pervasive as dimensionality and/or spatial sparsity increases. This leads us to assert that very high dimensional data are of simple structure. We exemplify this finding through a range of simulated data cases. We also discuss application to very high frequency time series segmentation and modeling.
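As a rough illustration of what "quantifying the degree of ultrametricity" can mean, recall that a distance d is ultrametric when d(x, z) <= max(d(x, y), d(y, z)) for every triplet, which is equivalent to every triangle being isosceles with the two largest sides equal. The sketch below estimates the fraction of sampled triplets that approximately satisfy this condition. It is an assumption-laden toy, not the coefficient defined in the paper: the function names `ultrametric_fraction` and `uniform_cloud`, the relative tolerance `rel_tol`, the Euclidean distance, and the uniform test data are all hypothetical choices made here for illustration.

```python
import math
import random


def ultrametric_fraction(points, n_triplets=2000, rel_tol=0.05, seed=0):
    """Estimate the share of point triplets that are approximately ultrametric.

    A triplet counts as ultrametric when its two largest pairwise Euclidean
    distances differ by at most rel_tol times the largest distance.
    (Illustrative criterion chosen for this sketch, not the paper's.)
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_triplets):
        # Draw three distinct points from the data set.
        a, b, c = rng.sample(points, 3)
        d = sorted((math.dist(a, b), math.dist(b, c), math.dist(a, c)))
        # Ultrametric triangle: the two largest sides are (nearly) equal.
        if d[2] - d[1] <= rel_tol * d[2]:
            hits += 1
    return hits / n_triplets


def uniform_cloud(n_points, dim, seed=0):
    """Generate n_points points uniformly at random in the unit hypercube."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(dim)] for _ in range(n_points)]


if __name__ == "__main__":
    # Compare the estimated fraction across increasing dimensionality.
    for dim in (2, 20, 200):
        print(dim, ultrametric_fraction(uniform_cloud(100, dim)))
```

Under these assumptions, the estimated fraction would be expected to grow with the dimension, because pairwise distances between uniformly drawn points concentrate around a common value. The tolerance and sampling scheme strongly affect the numbers, so the output should be read as qualitative only.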
This is a Submitted version. This version's date is: 2009. This item is not peer reviewed.
https://repository.royalholloway.ac.uk/items/19df89d8-6daa-6265-29aa-2189238d8fa5/1/
Deposited by Research Information System (atira) on 24-May-2012 in Royal Holloway Research Online. Last modified on 24-May-2012.
36 pages, 18 figures, 36 references