Related to the Energy-, Latency- And Resilience-aware Networking (e.LARN) project
Published in Proceedings of the 1st International Workshop on Benchmarking Machine Learning Workloads on Emerging Hardware (Challenge'20), 2020
Recent advances in hardware-based accelerators for machine learning, in particular neural networks, have attracted the attention of embedded-systems designers and engineers. Since embedded systems usually operate under strict resource constraints, knowledge of the resource demand (i.e., time and power) of executing machine-learning workloads is key. This paper presents Precious, an approach and practical implementation for estimating the execution time and power draw of convolutional and fully connected neural networks executing on commercially available, off-the-shelf embedded accelerator hardware for neural networks (i.e., the Google Coral Edge TPU).