Chipmunk: A Systolically Scalable 0.9 mm${}^2$, 3.08 Gop/s/mW @ 1.2 mW Accelerator for Near-Sensor Recurrent Neural Network Inference.

Authors

Francesco Conti, Lukas Cavigelli, Gianna Paulin, Igor Susmelj, Luca Benini

Recurrent neural networks (RNNs) are state-of-the-art in voice awareness/understanding and speech recognition. On-device computation of RNNs on low-power mobile and wearable devices would be key to applications such as zero-latency voice-based human-machine interfaces. Here we present Chipmunk, a small (<1 mm${}^2$) hardware accelerator for Long Short-Term Memory RNNs in UMC 65 nm technology, capable of operating at a measured peak efficiency of up to 3.08 Gop/s/mW at 1.24 mW peak power. To implement large RNN models without incurring a huge memory transfer overhead, multiple Chipmunk engines can cooperate to form a single systolic array. In this way, the Chipmunk architecture in a 75-tile configuration can achieve real-time phoneme extraction on a demanding RNN topology proposed by Graves et al., consuming less than 13 mW of average power.
