Separability Index in Supervised Learning


Djamel A. Zighed, Stéphane Lallich, and Fabrice Muhlenbach

ERIC Laboratory
Lumière University -- Lyon II
5 avenue Pierre Mendès-France
F-69676 BRON Cedex -- FRANCE
{zighed,lallich,fmuhlenb}@univ-lyon2.fr


Abstract

We propose a new statistical approach for characterizing the degree of class separability in R^p. The approach is based on a nonparametric statistic called the Cut Edge Weight. In this paper we present the principle of this statistic and its experimental applications. First, we build a connected geometrical graph, such as Toussaint's Relative Neighborhood Graph, on all examples of the learning set. Second, we cut every edge that joins two examples of different classes. Third, we calculate the relative weight of these cut edges. If the relative weight of the cut edges falls within the interval expected under a random distribution of the labels over the vertices of the neighborhood graph, then no neighborhood-based method will give a reliable prediction model. We then say that the classes to predict are non-separable.
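
The three steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation and makes several assumptions: every edge has unit weight, so the relative cut edge weight reduces to the proportion of cut edges; the Relative Neighborhood Graph is built by brute force with Euclidean distances; and the interval under random labels is estimated by label permutation, a stand-in for whatever null distribution is actually used for the statistic. The function names (relative_neighborhood_graph, relative_cut_edge_weight, random_label_interval) and the toy data are hypothetical.

```python
import numpy as np


def relative_neighborhood_graph(X):
    """Return the RNG edges (i, j) with i < j, by brute force in O(n^3)."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances between all examples.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # i and j are joined unless some third point k is strictly
            # closer to both of them than they are to each other.
            blocked = np.maximum(D[i], D[j]) < D[i, j]
            if not blocked.any():
                edges.append((i, j))
    return edges


def relative_cut_edge_weight(edges, y):
    """Proportion of edges joining examples of different classes.

    Assumption: unit edge weights (a weighted variant would sum weights).
    """
    y = np.asarray(y)
    cut = sum(1 for i, j in edges if y[i] != y[j])
    return cut / len(edges)


def random_label_interval(edges, y, n_perm=1000, alpha=0.05, seed=0):
    """Permutation estimate of the interval expected under random labels.

    Assumption: this Monte Carlo procedure replaces an analytical null
    distribution; it keeps the graph fixed and shuffles the labels.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    stats = [relative_cut_edge_weight(edges, rng.permutation(y))
             for _ in range(n_perm)]
    low, high = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(low), float(high)


# Toy learning set: two well-separated groups on a line (hypothetical data).
X = [[0, 0], [1, 0], [2, 0], [10, 0], [11, 0], [12, 0]]
y = [0, 0, 0, 1, 1, 1]

edges = relative_neighborhood_graph(X)
ratio = relative_cut_edge_weight(edges, y)
low, high = random_label_interval(edges, y)
print(edges, ratio, (low, high))
```

On this toy set the graph is the chain of consecutive points, and only the edge bridging the two groups is cut. A ratio that falls below the lower bound of the permutation interval suggests that the classes are separable in the sense described above, while a ratio inside the interval suggests they are not.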