The proliferation of edge devices has unlocked unprecedented opportunities for deploying deep learning models in computer vision applications. However, these complex models require considerable power, memory, and compute resources that are typically unavailable on edge platforms. Ultra low-bit quantization offers an attractive solution to this problem by reducing model weights and activations from 32-bit floating point to less than 8 bits. DeepliteRT is an end-to-end solution for the compilation, tuning, and inference of ultra low-bit models on ARM devices. We implement highly optimized ultra low-bit convolution operators for ARM-based targets that outperform existing methods by up to 4.34x. DeepliteRT was accepted at the BMVC 2023 conference, and you can read the full paper on arXiv. Check out our poster presentation video here, where our very own Saad Ashfaq walks you through DeepliteRT!
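
To give a flavor of what sub-8-bit quantization does, here is a minimal NumPy sketch of uniform affine quantization down to 2 bits. This is an illustrative example only, not DeepliteRT's actual quantization scheme or API; the function names and the per-tensor min/max calibration are our own assumptions.

```python
import numpy as np

def quantize(w: np.ndarray, bits: int = 2):
    """Uniformly quantize float32 values onto a `bits`-bit integer grid (illustrative sketch)."""
    qmin, qmax = 0, 2**bits - 1                       # e.g. {0, 1, 2, 3} for 2 bits
    w_min, w_max = float(w.min()), float(w.max())
    scale = max(w_max - w_min, 1e-8) / (qmax - qmin)  # real-valued step per integer level
    zero_point = int(round(qmin - w_min / scale))     # integer code that maps to real 0.0
    q = np.clip(np.round(w / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Map integer codes back to an approximation of the original floats."""
    return scale * (q.astype(np.float32) - zero_point)

# Each weight now needs 2 bits of storage instead of 32: a 16x reduction.
w = np.random.randn(4, 4).astype(np.float32)
q, s, z = quantize(w, bits=2)
print(q)                    # 2-bit integer codes
print(dequantize(q, s, z))  # coarse float reconstruction
```

In an actual deployment, the low-bit codes are typically packed many to a byte and consumed directly by specialized low-bit convolution kernels rather than dequantized back to float, which is exactly the operator-level territory where DeepliteRT's ARM optimizations apply.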