Object detection and recognition are among the most important and challenging problems in computer vision, and advances in deep learning have accelerated progress on both in recent years. Scene text detection and recognition is a closely related task that has attracted growing attention because of its wide range of applications. This work focuses on detecting and recognizing multiple retail products in grocery stores, both on and off the shelves, by identifying their label text. We propose a new framework composed of three modules: (a) retail product detection, (b) product-text detection, and (c) product-text recognition. In the first module, on-the-shelf and off-the-shelf retail products are detected using the YOLOv5 object detection algorithm. In the second module, we improve the performance of the state-of-the-art text detection algorithm “TextSnake” by replacing its backbone network with ResNet50 + FPN, and we propose a post-processing technique, WHBBR (Width Height based Bounding Box Reconstruction), to detect both regular and irregular text. In the final module, we use the text recognition network “SCATTER” to recognize the text on the retail products. The YOLOv5 algorithm accurately detects both on-the-shelf and off-the-shelf grocery products in video frames and static images. The experimental results show that the proposed text reconstruction approach, WHBBR, improves on the performance of state-of-the-art techniques for both regular and irregular text. Together, the enhanced text detection and the incorporated text recognition enable the proposed framework to recognize on-the-shelf retail products by extracting product information such as product name, brand name, price, and expiry date. The recognized text around each retail product can then be used as an identifier to distinguish that product.
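The data flow between the three modules can be illustrated with a minimal sketch. It is not the implementation used in this work: the names `Box`, `ProductResult`, `crop`, and `recognize_products` are hypothetical, the three stages are passed in as plain callables standing in for YOLOv5, the modified TextSnake with WHBBR, and SCATTER, and axis-aligned boxes over a toy list-of-rows image are a simplifying assumption (irregular text would normally be described by polygons).

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Toy stand-in for a frame: a list of rows of pixel values (assumption).
Image = List[List[int]]


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned box, half-open: [x0, x1) by [y0, y1)."""
    x0: int
    y0: int
    x1: int
    y1: int


@dataclass
class ProductResult:
    """One detected product and the text strings read from it."""
    box: Box
    texts: List[str] = field(default_factory=list)


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by `box`."""
    return [row[box.x0:box.x1] for row in image[box.y0:box.y1]]


def recognize_products(
    image: Image,
    detect_products: Callable[[Image], List[Box]],  # stands in for module (a)
    detect_text: Callable[[Image], List[Box]],      # stands in for module (b)
    recognize_text: Callable[[Image], str],         # stands in for module (c)
) -> List[ProductResult]:
    """Chain the three stages: products, then text regions, then strings."""
    results = []
    for product_box in detect_products(image):
        product_crop = crop(image, product_box)
        # Text boxes are expressed relative to the product crop.
        texts = [
            recognize_text(crop(product_crop, text_box))
            for text_box in detect_text(product_crop)
        ]
        results.append(ProductResult(product_box, texts))
    return results
```

Passing the stages as callables keeps the sketch independent of any particular detector or recognizer; each real model would be wrapped in a function with the matching signature, and the text strings collected per product would serve as its identifier.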