Murtagh, Fionn (2009) Symmetry in Data Mining and Analysis: A Unifying View based on Hierarchy. Proceedings of the Steklov Institute of Mathematics, 265
Full text access: Open
Data analysis and data mining are concerned with unsupervised pattern finding and structure determination in data sets. The data sets themselves are explicitly linked as a form of representation to an observational or otherwise empirical domain of interest. "Structure" has long been understood as symmetry which can take many forms with respect to any transformation, including point, translational, rotational, and many others. Beginning with the role of number theory in expressing data, we show how we can naturally proceed to hierarchical structures. We show how this both encapsulates traditional paradigms in data analysis, and also opens up new perspectives towards issues that are on the order of the day, including data mining of massive, high dimensional, heterogeneous data sets. Linkages with other fields are also discussed, including computational logic and symbolic dynamics. The structures in data surveyed here are based on hierarchy, represented as p-adic numbers or an ultrametric topology.
This is a Submitted version. This version's date is: 2009. This item is not peer reviewed.
https://repository.royalholloway.ac.uk/items/93fc8607-6803-9afc-07b9-24de5fb66215/2/
Deposited by Research Information System (atira) on 30-May-2012 in Royal Holloway Research Online. Last modified on 30-May-2012
35 pages, 3 figures, 84 references