Automatic Generation of Vectorized Montgomery Algorithm
Modular arithmetic is widely used in cryptography and symbolic computation. This paper presents a vectorized Montgomery algorithm for modular multiplication, the key to fast modular arithmetic, that fully utilizes SIMD instructions. We further show how the vectorized algorithm can be automatically generated by the system, as part of the effort toward automatic generation of a modular polynomial multiplication library.
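For readers unfamiliar with the technique, the following is a minimal Python sketch of textbook Montgomery multiplication, applied lane by lane to mimic how a SIMD unit would process several independent products at once. It is not the generated code described in the paper. The function names, the choice R = 2^32, and the list-based "lanes" are all assumptions made for illustration, and the sketch requires an odd modulus n < R and Python 3.8 or later.

```python
# Illustrative sketch only: textbook Montgomery multiplication with R = 2**32.
# All names here are hypothetical; this is not the paper's generated code.

R_BITS = 32            # assumed word size; R = 2**R_BITS
R = 1 << R_BITS
MASK = R - 1


def montgomery_setup(n):
    """Return n' such that n * n' == -1 (mod R). Requires odd n < R."""
    if n % 2 == 0 or n >= R:
        raise ValueError("modulus must be odd and smaller than R")
    return (-pow(n, -1, R)) % R


def to_mont(a, n):
    """Map a to its Montgomery form a * R mod n."""
    return (a << R_BITS) % n


def redc(t, n, n_prime):
    """Montgomery reduction: return t * R^-1 mod n for 0 <= t < n * R."""
    m = ((t & MASK) * n_prime) & MASK   # m = (t mod R) * n' mod R
    u = (t + m * n) >> R_BITS           # t + m*n is divisible by R
    return u - n if u >= n else u       # single conditional subtraction


def mont_mul(a_bar, b_bar, n, n_prime):
    """Multiply two values in Montgomery form; result stays in that form."""
    return redc(a_bar * b_bar, n, n_prime)


def mont_mul_lanes(xs, ys, n):
    """Compute x * y mod n for each pair, one pair per simulated SIMD lane."""
    n_prime = montgomery_setup(n)
    xs_bar = [to_mont(x, n) for x in xs]
    ys_bar = [to_mont(y, n) for y in ys]
    prod_bar = [mont_mul(x, y, n, n_prime) for x, y in zip(xs_bar, ys_bar)]
    # A final reduction converts each lane back out of Montgomery form.
    return [redc(p, n, n_prime) for p in prod_bar]
```

Every lane performs the same sequence of multiplications, masks, shifts, and one conditional subtraction, with no division by n. That uniformity is what makes the reduction step a natural fit for SIMD execution, where the conditional subtraction would typically be expressed as a masked or blended operation.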