RIn-Close_CVC2: an even more efficient enumerative algorithm for biclustering of numerical datasets

10/17/2018
by   Rosana Veroneze, et al.
0

RIn-Close_CVC is an efficient (take polynomial time per bicluster), complete (find all maximal biclusters), correct (all biclusters attend the user-defined level of consistency) and non-redundant (all the obtained biclusters are maximal and the same bicluster is not enumerated more than once) enumerative algorithm for mining maximal biclusters with constant values on columns in numerical datasets. Despite RIn-Close_CVC has all these outstanding properties, it has a high computational cost in terms of memory usage because it must keep a symbol table in memory to prevent a maximal bicluster to be found more than once. In this paper, we propose a new version of RIn-Close_CVC, named RIn-Close_CVC2, that does not use a symbol table to prevent redundant biclusters, and keeps all these four properties. We also prove that these algorithms actually possess these properties. Experiments are carried out with synthetic and real-world datasets to compare RIn-Close_CVC and RIn-Close_CVC2 in terms of memory usage and runtime. The experimental results show that RIn-Close_CVC2 brings a large reduction in memory usage and, in average, significant runtime gain when compared to its predecessor.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset