A Machine-learning Based Ensemble Method For Anti-patterns Detection
Anti-patterns are poor solutions to recurring design problems. Several empirical studies have highlighted their negative impact on program comprehension, maintainability, and fault-proneness. A variety of detection approaches have been proposed to identify their occurrences in source code. However, these approaches identify only a subset of the occurrences and report large numbers of false positives. Furthermore, agreement among different approaches is generally low. Recent studies have shown the potential of machine-learning models to improve this situation. However, such models require large sets of manually produced training data, which often limits their application in practice. In this paper, we present SMAD (SMart Aggregation of Anti-patterns Detectors), a machine-learning based ensemble method that aggregates various anti-pattern detection approaches on the basis of their internal detection rules. We evaluate SMAD on two well-known anti-patterns, God Class and Feature Envy, and assess its performance on three open-source Java systems. Our results show that SMAD overcomes the limitations described above. First, our method clearly improves on the performance of the aggregated approaches and outperforms competing ensemble methods. Second, we show that such a method can be used to generate reliable training data for machine-learning models from a reasonable number of manually produced examples.