This blog post summarizes the research paper Sah, Sudakar, et al., "Token Pruning using a Lightweight Background Aware Vision Transformer", accepted at the NeurIPS 2024 FITML Workshop.

In computer vision, Vision Transformers (ViTs) have emerged as a powerful tool, outperforming traditional convolutional neural networks (CNNs) in tasks such as object detection, segmentation, and classification. However, their high computational requirements pose significant challenges, especially for deployment on edge devices with limited memory and processing power. To address this issue, we introduce the Background Aware Vision Transformer (BAViT).