Publication

SRAM optimized porting and execution of machine learning classifiers on MCU-based IoT devices: Demo abstract

Sudharsan, Bharath
Patel, Pankesh
Breslin, John G.
Ali, Muhammad Intizar
Citation
Sudharsan, Bharath, Patel, Pankesh, Breslin, John G., & Ali, Muhammad Intizar. (2021). SRAM optimized porting and execution of machine learning classifiers on MCU-based IoT devices: Demo abstract. Paper presented at the Proceedings of the ACM/IEEE 12th International Conference on Cyber-Physical Systems, Nashville, Tennessee, 19-21 May, https://doi.org/10.1145/3450267.3451999
Abstract
With the introduction of edge analytics, IoT devices are becoming smarter and ready for AI applications. However, any increase in the training data results in a linear increase in the space complexity of the trained Machine Learning (ML) models, which means they cannot be deployed on IoT devices that have limited memory. To alleviate such memory issues, we recently proposed an SRAM-optimized classifier porting, stitching, and efficient deployment method in [3]. This is currently the most resource-friendly approach that enables large classifiers to be comfortably executed on microcontroller unit (MCU) based IoT devices and to perform ultra-fast classifications (1-4x faster than state-of-the-art libraries) while consuming 0 bytes of SRAM. In this demo, realizing our recent SRAM-optimized approach, we port and execute 7 dataset-trained classifiers on 7 popular MCUs and report their inference performance. The demo results show that our approach makes even the slowest ATmega328P MCU perform unit inference faster than an NVIDIA Jetson Nano GPU and a Raspberry Pi 4 CPU.
Publisher
Association for Computing Machinery (ACM)
Publisher DOI
10.1145/3450267.3451999
Rights
Attribution 4.0 International (CC BY 4.0)