Quantization is commonly used to compress and accelerate deep neural
net...
We motivate a method for transparently identifying ineffectual computati...
We show that selecting a fixed precision for all activations in Convolut...
We show that, during inference with Convolutional Neural Networks (CNNs)...
Tartan (TRT), a hardware accelerator for inference with Deep Neural Netw...
Loom (LM), a hardware inference accelerator for Convolutional Neural Net...
Stripes is a Deep Neural Network (DNN) accelerator that uses bit-serial
...