Dynamic computation has emerged as a promising avenue to enhance the inf...
Efficient inference for object detection networks is a major challenge o...
Large-scale language models (LLMs) have demonstrated outstanding perform...
Post-training quantization (PTQ) is a popular method for compressing dee...
As a neural network compression technique, post-training quantization (P...
Denoising diffusion (score-based) generative models have recently achiev...
Spatial-wise dynamic convolution has become a promising approach to impr...
Quantization is one of the most effective methods to compress neural net...
Network quantization is a powerful technique to compress convolutional n...
Tiled spatial architectures have proved to be an effective solution to b...
Dynamic inference is a feasible way to reduce the computational cost of ...
Recently, dynamic inference has emerged as a promising way to reduce the...