We will have three talks on Wednesday, June 7th, starting at about 14h30.
:: On the Stability of Feature Selection Algorithms
:: Speaker: Gavin Brown (University of Manchester)
Feature Selection is central to modern data science, from exploratory
data analysis to predictive model-building. The stability of a feature
selection algorithm refers to the robustness of its feature preferences
with respect to small changes in the training data. An algorithm is
‘unstable’ if a small change in data leads to large changes in the
chosen feature subset. We present a rigorous statistical and axiomatic
treatment for this concept, applicable generically from bioinformatics
to business analytics. In particular we address how best to measure
stability – in the literature we find numerous proposals, each with
different motivations. In this work we consolidate the literature and
suggest a new approach to the problem. The result is (1) a deeper
understanding of existing work based on a small set of axioms, and (2) a
clearly justified statistical framework with several novel benefits.
This approach serves to identify a stability measure obeying all
desirable axioms, and (for the first time in the literature) allowing
confidence intervals on its estimates, enabling a more rigorous
comparison of feature selection algorithms.
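To illustrate the idea, here is a minimal sketch of a similarity-based stability measure of the kind discussed in the talk: it scores a binary selection matrix (one row per run of the algorithm) by comparing the per-feature selection variance to its expected value under random selection. The function name and exact normalisation are illustrative assumptions, not necessarily the measure identified in the talk.

```python
import numpy as np

def stability(Z):
    """Stability of feature selection from a binary selection matrix.

    Z : (M, d) array; Z[i, f] = 1 if run i selected feature f.
    Returns a value <= 1, where 1 means identical subsets in every run.
    """
    Z = np.asarray(Z, dtype=float)
    M, d = Z.shape
    p = Z.mean(axis=0)                    # selection frequency per feature
    s2 = M / (M - 1) * p * (1 - p)        # unbiased variance per feature
    kbar = Z.sum(axis=1).mean()           # average selected-subset size
    denom = (kbar / d) * (1 - kbar / d)   # variance under random selection
    return 1 - s2.mean() / denom

# Two runs agreeing on the same 2-of-4 features -> perfectly stable
Z_stable = np.array([[1, 1, 0, 0],
                     [1, 1, 0, 0]])
print(stability(Z_stable))  # 1.0
```

Because the measure is an estimate from finitely many runs, its sampling variability is exactly what motivates the confidence intervals mentioned above.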
:: Distinguishing prognostic and predictive biomarkers: An information
theoretic approach
:: Speaker: Konstantinos Sechidis (University of Manchester)
We present a novel method for data-driven ranking of predictive
biomarkers, using information theoretic methods. A strength of the
approach is in explicitly distinguishing predictive vs prognostic
markers, allowing us to quantify when markers are solely predictive,
solely prognostic, or some mixture of the two. Our information
theoretic formalization of the problem enables us to derive biomarker
rankings that capture predictive strength by estimating several
high-dimensional conditional mutual information terms. To estimate
these terms, we suggest efficient low-dimensional approximations, and we
derive an empirical Bayes ranking procedure, which is suitable for
"small n, large p" scenarios. Our approach turns out to be an asset in
small sample scenarios, when noise factors may dominate and markers get
mistakenly identified as predictive, when in fact they are just strongly
prognostic. We propose that the information theoretic view is a natural
and flexible mathematical framework for data-driven biomarker discovery,
providing a natural algebra to discuss and quantify the `predictiveness'
and `prognosticness' of candidate biomarkers.
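The distinction can be sketched with simple plug-in estimators: a marker's prognostic strength relates to its mutual information with the outcome, I(X;Y), while its predictive strength relates to how that dependence changes once we condition on treatment, e.g. I(X;Y|T) - I(X;Y). The sketch below uses naive plug-in estimates on a toy example; the variable names and scores are illustrative assumptions, not the empirical Bayes procedure described in the talk.

```python
import numpy as np
from collections import Counter

def mi(x, y):
    """Plug-in estimate of the mutual information I(X;Y) in bits."""
    n = len(x)
    pxy, px, py = Counter(zip(x, y)), Counter(x), Counter(y)
    return sum(c / n * np.log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def cmi(x, y, z):
    """Plug-in estimate of the conditional mutual information I(X;Y|Z)."""
    n = len(x)
    total = 0.0
    for zv in set(z):
        idx = [i for i in range(n) if z[i] == zv]
        total += len(idx) / n * mi([x[i] for i in idx], [y[i] for i in idx])
    return total

# Toy data (hypothetical): the outcome y depends on the marker x only
# through its interaction with treatment t, so x is purely predictive:
# I(X;Y) = 0 while I(X;Y|T) = 1 bit.
x = [0, 0, 1, 1]
t = [0, 1, 0, 1]
y = [0, 1, 1, 0]
prognostic_score = mi(x, y)                   # ~0: no marginal effect
predictive_score = cmi(x, y, t) - mi(x, y)    # ~1: strong interaction
```

In practice such plug-in estimates are exactly where the "small n, large p" difficulties arise, which is what the low-dimensional approximations and empirical Bayes ranking in the abstract are designed to address.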
:: Hierarchical Multinomial-Dirichlet model for the estimation of
conditional probabilities and mutual information
:: Speaker: Laura Azzimonti (IDSIA)
We present a novel approach for estimating conditional probability
tables based on a hierarchical Multinomial-Dirichlet model, which
relaxes the traditional local independence assumption. We derive exact
analytical expressions for the estimators and we analyse their
properties both analytically and via simulation. We then apply this
method to the estimation of parameters in a Bayesian network. Given the
structure of the network, the proposed approach better estimates the
joint distribution and significantly improves the classification
performance with respect to traditional parameter estimation approaches.
Finally, we apply the hierarchical Multinomial-Dirichlet model to the
estimation of mutual information. The proposed mutual information
estimator is then compared to other traditional estimators via simulations.
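To give a flavour of the idea, the sketch below estimates a conditional probability table by centring each row's Dirichlet prior on the pooled child marginal, so that sparsely observed parent configurations borrow strength from the others. This is a deliberately simplified stand-in for the hierarchical Multinomial-Dirichlet model of the talk; the function name and the scalar concentration `s` are illustrative assumptions.

```python
import numpy as np

def cpt_estimate(counts, s=1.0):
    """Estimate P(child | parent config) from a co-occurrence matrix.

    counts : (n_parent_configs, n_child_states) array of counts.
    Each row's Dirichlet prior is centred on the pooled child marginal,
    a crude form of the information sharing across parent configurations
    that the hierarchical Multinomial-Dirichlet model makes principled.
    """
    counts = np.asarray(counts, dtype=float)
    marginal = counts.sum(axis=0) / counts.sum()   # pooled child distribution
    return (counts + s * marginal) / (counts.sum(axis=1, keepdims=True) + s)

counts = np.array([[8, 2],    # parent configuration with plenty of data
                   [0, 0]])   # unseen parent configuration
print(cpt_estimate(counts))   # unseen row falls back to the [0.8, 0.2] marginal
```

A purely local estimator would return a uniform (or undefined) distribution for the unseen parent configuration; shrinking toward the shared marginal is the kind of behaviour that improves joint-distribution estimates in sparse-data regimes.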