Optimal N-ary ECOC Matrices for Ensemble Classification
A new recursive construction of N-ary error-correcting output code (ECOC) matrices for ensemble classification methods is presented, generalizing the classic doubling construction for binary Hadamard matrices. Given any prime integer N, this deterministic construction generates base-N symmetric square matrices M of prime-power dimension with optimal minimum Hamming distance between any two of its rows and between any two of its columns. Experimental results on six datasets demonstrate that using these deterministic coding matrices for N-ary ECOC classification yields accuracy comparable to, and in many cases higher than, that obtained with randomly generated coding matrices. This is particularly true when N is chosen adaptively so that the dimension of M closely matches the number of classes in a dataset, which reduces the loss in minimum Hamming distance when M is truncated to fit the dataset. This is verified through a distance formula for M, which shows that these adaptive matrices have significantly higher minimum Hamming distance than randomly generated ones.
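To make the idea of a base-N doubling step concrete, the following Python sketch builds one natural N-ary analog of the binary doubling construction and measures its minimum row distance. This is an illustrative assumption, not necessarily the construction used in the paper: the function names `nary_doubling`, `hamming`, and `min_row_distance` are hypothetical, and the recursion (each step replaces M by an N x N grid of blocks, where block (a, b) is M + a*b mod N) is simply the most direct generalization of the binary case.

```python
from itertools import combinations


def nary_doubling(n, k):
    """Build an n^k x n^k matrix over {0, ..., n-1} by k block-expansion steps.

    Assumed recursion (illustrative only): start from [[0]]; at each step,
    block (a, b) of the new matrix is the previous matrix plus a*b, mod n.
    For n = 2 this is the classic doubling step written with 0/1 entries.
    """
    m = [[0]]
    for _ in range(k):
        m = [
            [(x + a * b) % n for b in range(n) for x in row]
            for a in range(n)
            for row in m
        ]
    return m


def hamming(u, v):
    """Number of positions in which two equal-length rows differ."""
    return sum(x != y for x, y in zip(u, v))


def min_row_distance(m):
    """Minimum Hamming distance over all pairs of distinct rows."""
    return min(hamming(u, v) for u, v in combinations(m, 2))


if __name__ == "__main__":
    M = nary_doubling(3, 2)          # 9 x 9 ternary matrix
    for row in M:
        print(row)
    print(min_row_distance(M))       # 6
```

For prime n, any two distinct rows of this particular matrix differ in exactly n^k - n^(k-1) positions (6 of 9 for n = 3, k = 2), and because the matrix is symmetric the same holds for its columns. Whether this coincides with the distance formula derived in the paper is not something the abstract settles.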