Confidence and Venn Machines and Their Applications to Proteomics

Devetyarov, Dmitry

(2011)

Devetyarov, Dmitry (2011) Confidence and Venn Machines and Their Applications to Proteomics.

Abstract

When a prediction is made in a classification or regression problem, it is useful to have additional information on how reliable this individual prediction is. Predictions complemented with such information are also expected to be valid, i.e., to come with a guarantee on the outcome. The recently developed frameworks of confidence machines, category-based confidence machines and Venn machines address these problems: confidence machines complement each prediction with its confidence and output region predictions with a guaranteed asymptotic error rate; Venn machines output multiprobability predictions which are valid with respect to observed frequencies. Another advantage of these frameworks is that they rely only on the i.i.d. assumption and do not depend on the probability distribution of examples. This thesis is devoted to further development of these frameworks. Firstly, novel designs and implementations of confidence machines and Venn machines are proposed. These implementations are based on random forest and support vector machine classifiers and inherit their ability to predict with high accuracy on certain types of data. Experimental testing is carried out. Secondly, several algorithms with online validity are designed for proteomic data analysis. These algorithms take into account the nature of mass spectrometry experiments and the special features of the data analysed. They also allow us to address medical problems: to make early diagnoses of diseases and to identify potential biomarkers. An extensive experimental study is performed on the UK Collaborative Trial of Ovarian Cancer Screening data sets. Finally, in theoretical research we extend the class of algorithms which output valid predictions in the online mode: we develop a new method of constructing valid prediction intervals for a statistical model different from the standard i.i.d. assumption used in confidence and Venn machines.
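
As an illustration of how a confidence machine turns scores into region predictions, the following is a minimal generic sketch and not the random forest or support vector machine implementations developed in the thesis. It assumes that nonconformity scores (numbers measuring how unusual an example looks, larger meaning stranger) have already been computed by some underlying method. The function names, the labels and the score values are hypothetical and chosen only for illustration.

```python
def p_value(other_scores, test_score):
    """p-value of a candidate label: the fraction of examples, the test
    example included, that are at least as nonconforming as the test example."""
    at_least = sum(1 for s in other_scores if s >= test_score)
    return (at_least + 1) / (len(other_scores) + 1)


def prediction_region(p_values, epsilon):
    """Region prediction: every label whose p-value exceeds the
    significance level epsilon."""
    return {label for label, p in p_values.items() if p > epsilon}


def confidence(p_values):
    """Confidence in the most plausible label: one minus the second
    largest p-value (1.0 if there is only one candidate label)."""
    ranked = sorted(p_values.values(), reverse=True)
    return 1.0 - ranked[1] if len(ranked) > 1 else 1.0


# Hypothetical nonconformity scores of four previously seen examples.
other_scores = [0.1, 0.4, 0.35, 0.8]

# Hypothetical scores of the new example under each candidate label.
candidate_scores = {"healthy": 0.2, "cancer": 0.9}

p_values = {label: p_value(other_scores, score)
            for label, score in candidate_scores.items()}
region = prediction_region(p_values, epsilon=0.25)

print(p_values)
print(region)
print(confidence(p_values))
```

In this sketch a smaller significance level yields a larger region. Under the i.i.d. assumption, the validity property described in the abstract is what bounds the long-run frequency with which such regions miss the true label by the chosen significance level.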

Information about this Version

This is an approved version
This version's date is: 2011
This item is not peer reviewed

Link to this Version

https://repository.royalholloway.ac.uk/items/b28d84e9-af56-2345-78a1-3d2a3a0dc536/10/

Item Type: Thesis (Doctoral)
Title: Confidence and Venn Machines and Their Applications to Proteomics
Authors: Devetyarov, Dmitry
Departments: Faculty of Science\Computer Science

Deposited by Research Information System (atira) on 18-Nov-2014 in Royal Holloway Research Online. Last modified on 15-Feb-2017

