Domain-specific accelerators are used in various computing systems rangi...
Many state-of-the-art deep learning models for computer vision tasks are...
The architecture of a coarse-grained reconfigurable array (CGRA) interco...
While coarse-grained reconfigurable arrays (CGRAs) have emerged as promi...
We propose the Sparse Abstract Machine (SAM), an intermediate representa...
High-level hardware generators have significantly increased the producti...
Achieving high code reuse in physical design flows is challenging but in...
The increasing complexity of modern configurable systems makes it critic...
Image processing and machine learning applications benefit tremendously ...
The architecture of a coarse-grained reconfigurable array (CGRA) process...
Using digital standard cells and digital place-and-route (PnR) tools, we...
While hardware generators have drastically improved design productivity,...
In this paper, we propose an architecture for FPGA emulation of mixed-si...
Real-time CNN-based object detection models for applications like survei...
We propose an end-to-end framework for training domain-specific models (...
Many DNN accelerators have been proposed and built using different micro...
FPMax implements four FPUs optimized for latency or throughput workloads...
Convolutional Neural Networks (CNNs) are the state-of-the-art solution f...
We propose an explanatory and computational theory of transformative dis...