Convolutional Neural Networks (CNNs), as a computational framework for deep learning, have gained preference over traditional machine learning techniques and simple fully-connected neural networks because of their remarkable performance in AI applications involving images and videos, that is, computer vision. Recent advances in CNN architectures have paved the way for significant contributions across application domains, including disease diagnosis, object detection, and image classification. This article elaborates on the fundamental workings of CNNs with mathematical insights. It gives a brief overview of the mathematical and computational intricacies that underpin CNNs and presents streamlined derivations of key components, focusing on the pivotal mechanism of backpropagation. Pseudo-code for a generic CNN algorithm is presented in both component form and vectorized form, alongside an exhaustive explanation of the relevant data structures to foster comprehensive understanding. The article includes an application: classifying retinal images to diagnose diabetic retinopathy (DR). Having begun with implementation from scratch, the article then turns to transfer learning with pre-trained models for sustained efficiency, and demonstrates performance gains on DR classification using a state-of-the-art CNN architecture. In this way, the article aims to equip learners, researchers, and practitioners with mathematical insight into how CNNs work, supporting proper comprehension and a step towards efficient model development for sustainable advancements, especially in disease diagnosis. Understanding of and expertise in CNNs would contribute to the development of large-scale, sustainable AI-based solutions in the health sector, supporting the United Nations’ 2030 Agenda for Sustainable Development.
ACM Classification Codes (ccs98): I.2.6, I.5.1, K.3.2. MSC Codes (2020): 68T07, 68T45, 92B20