Factored LT and Factored Raptor Codes for Large-Scale Distributed Matrix Multiplication
We propose two coding schemes for distributed matrix multiplication in the presence of stragglers. These schemes adapt LT codes and Raptor codes to distributed matrix multiplication and are termed factored LT (FLT) codes and factored Raptor (FR) codes. Empirically, we show that FLT codes have near-optimal recovery thresholds when the number of worker nodes is very large, and that FR codes have excellent recovery thresholds when the number of worker nodes is moderately large. FLT and FR codes have better recovery thresholds than Product codes and are expected to have better numerical stability than Polynomial codes, and they can be decoded with a low-complexity decoding algorithm.