Separability Index in Supervised Learning
Djamel A. Zighed, Stéphane Lallich, and Fabrice Muhlenbach
ERIC Laboratory
Lumière University -- Lyon II
5 avenue Pierre Mendès-France
F-69676 BRON Cedex -- FRANCE
{zighed,lallich,fmuhlenb}@univ-lyon2.fr
Abstract
We propose a new statistical approach for characterizing the degree of class
separability in R^p. This approach is based on a nonparametric statistic called
the Cut Edge Weight. In this paper we present the principle of this statistic
and its experimental applications. First, we build a connected geometrical
graph, such as Toussaint's Relative Neighborhood Graph, on all examples of
the learning set. Second, we cut every edge that joins two examples of
different classes. Third, we calculate the relative weight of these cut edges.
If this relative weight falls within the interval expected when the labels are
distributed at random over the vertices of the neighborhood graph, then no
neighborhood-based method will yield a reliable prediction model. We then say
that the classes to predict are non-separable.
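
The three steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation used in the paper: every edge is given unit weight, so the relative weight reduces to the proportion of cut edges, and the expected interval under random labeling is approximated by a permutation test, which is a substitute chosen here for simplicity. The function names, the toy data, and the parameters n_perm and alpha are all hypothetical.

```python
import math
import random
from itertools import combinations


def relative_neighborhood_graph(points):
    """Return the edges (i, j) of the Relative Neighborhood Graph.

    Two points are joined unless some third point is strictly closer
    to both of them than they are to each other.
    """
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    edges = []
    for i, j in combinations(range(n), 2):
        blocked = any(
            max(d[i][k], d[j][k]) < d[i][j]
            for k in range(n)
            if k != i and k != j
        )
        if not blocked:
            edges.append((i, j))
    return edges


def relative_cut_edge_weight(edges, labels):
    """Proportion of edges joining two different classes (assumption: unit weights)."""
    cut = sum(1 for i, j in edges if labels[i] != labels[j])
    return cut / len(edges)


def permutation_interval(edges, labels, n_perm=1000, alpha=0.05, seed=0):
    """Approximate central interval of the statistic under random labeling.

    Assumption: labels are shuffled over the vertices while the graph is
    kept fixed; this stands in for an analytic expected interval.
    """
    rng = random.Random(seed)
    shuffled = list(labels)
    stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        stats.append(relative_cut_edge_weight(edges, shuffled))
    stats.sort()
    lower = stats[int(n_perm * alpha / 2)]
    upper = stats[int(n_perm * (1 - alpha / 2)) - 1]
    return lower, upper


# Toy learning set (hypothetical): two well-separated clusters in R^2.
rng = random.Random(0)
class_a = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(15)]
class_b = [(rng.gauss(10, 1), rng.gauss(0, 1)) for _ in range(15)]
points = class_a + class_b
labels = [0] * 15 + [1] * 15

edges = relative_neighborhood_graph(points)
j_obs = relative_cut_edge_weight(edges, labels)
lo, hi = permutation_interval(edges, labels)

print(f"relative cut edge weight: {j_obs:.3f}")
print(f"interval under random labels: [{lo:.3f}, {hi:.3f}]")
print("non-separable" if lo <= j_obs <= hi else "outside the random-label interval")
```

In this sketch, an observed value well below the lower bound suggests that far fewer edges are cut than random labeling would produce, which is the situation in which the classes would be called separable. The brute-force graph construction is cubic in the number of examples and is meant only for small illustrations.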