CAEMO - A Flexible and scalable high performance matrix algebra coprocessor for embedded reconfigurable computing systems
In Microprocessors and Microsystems, Elsevier, volume not specified, pages 47-63, Feb/2018.
Many applications in mobile and embedded systems, such as signal processing, machine
learning, kinematics, dynamics, and control, depend on computationally
expensive matrix operations. However, such systems are subject to tight constraints
regarding power consumption and physical space, which prohibit the use
of powerful multicore systems. In this paper, we propose a novel scalable and
power-efficient architecture for matrix algebra in FPGA-based Systems-on-Chip.
The architecture is based on a linear systolic array and has been developed with
a focus on flexibility in order to be adapted to different applications. We evaluate
the performance, resource utilization, and power consumption of different
configurations and show that it provides significant speed-ups over a mobile
processor and is significantly more power efficient than a standard PC.
Keywords: Matrix algebra, Hardware acceleration, Embedded systems, FPGA