The Permuted Striped Block Model and its Factorization – Algorithms with Recovery Guarantees
We introduce a novel class of matrices which are defined by the factorization Y :=AX, where A is an m × n wide sparse binary matrix with a fixed number d nonzeros per column and X is an n × N sparse real matrix whose columns have at most k nonzeros and are dissociated. Matrices defined by this factorization can be expressed as a sum of n rank one sparse matrices, whose nonzero entries, under the appropriate permutations, form striped blocks - we therefore refer to them as Permuted Striped Block (PSB) matrices. We define the PSB data model as a particular distribution over this class of matrices, motivated by its implications for community detection, provable binary dictionary learning with real valued sparse coding, and blind combinatorial compressed sensing. For data matrices drawn from the PSB data model, we provide computationally efficient factorization algorithms which recover the generating factors with high probability from as few as N =O(n/klog^2(n)) data vectors, where k, m and n scale proportionally. Notably, these algorithms achieve optimal sample complexity up to logarithmic factors.
READ FULL TEXT