Design and Analysis of Imaging Chip Using High-Speed AXI-Interface for MPSOC Applications on FPGA Platform

Recent innovations in real-time video and image enhancement are enabling advances in a wide range of diverse applications. These innovations call for new hardware architectures that improve image visualization, increase processing speed, and reduce hardware complexity. This article introduces an imaging chip concept to support multiprocessor system-on-chip (MPSoC) applications in real-time scenarios on a single chip. The imaging chip model is designed around a high-speed interface protocol, in which a set of image enhancement algorithms acts as the master model, the Advanced eXtensible Interface (AXI)-4 serves as the interface model, and a dual-port memory acts as the slave model. The image enhancement algorithms mainly include brightness control, contrast stretching, adaptive median filtering (AMF), edge detection techniques, image thresholding, and the image histogram method. AXI-4 provides a high-speed interface for communication between the master and slave modules. The proposed model operates on modes of operation to produce the enhanced image output in the MPSoC. The design supports multiple master and multiple slave modules and is reconfigurable in nature. The imaging chip is modeled in the Xilinx ISE environment and implemented on an Artix-7 FPGA, and performance metrics such as chip area, time, power, and memory utilization are analyzed, showing improvements. The model offers a low-latency, high-throughput architecture for real-time multimedia applications.


Related Work
This section elaborates a detailed review of different image enhancement algorithms in hardware and software approaches, with findings, and also discusses hardware approaches toward MPSoC for image processing applications. Dutta et al. [15] present a smart frame design using a machine learning approach for training and testing a demographic image dataset, which resides in a reconfigurable MPSoC architecture. This method improves the execution time and throughput on the FPGA platform. The FPGA-based MPSoC of Wang et al. [16] integrates a service-oriented architecture on a single chip to offer solutions for programmable interfaces, software chains, and different modules for diverse applications. These integrated modules offer modularity, flexibility, and reconfigurability, but suffer from scalability issues on the hardware platform. Karim et al. [17] present an MPSoC-based module for ECG applications on FPGA, which includes master and slave processor modules for processing software-based ECG data and displays the ECG data on the FPGA's input-output switches.
Huang et al. [18] explain a reconfigurable hardware architecture for an image contrast enhancement algorithm, which includes statistical computation, weighting probability density, and cumulative distribution functions, followed by gamma correction and luminance transformation. The algorithm offers better image quality but fails to improve computational complexity in hardware. A fast algorithm for image enhancement using parallel architecture is designed by Singh et al. [19] with a polynomial-based fractional-order filter function, and it results in a better-enhanced image for different orders; however, it is not suitable for hardware-based modeling because of its design complexity. Pugazhenthi et al. [20] present automatic multi-histogram equalization as an image enhancement technique for satellite images using Matlab programming, which preserves brightness and improves contrast. The image quality results are not very good and are not suitable for hardware-based approaches. Mahajan et al. [21] present image contrast enhancement using a Gaussian mixture model (GMM) and a genetic algorithm. The GMM includes modeling, the expectation-maximization method, partitioning, and mapping, which is applied to low-contrast images to achieve better-quality images. Image enhancement techniques for denoising and contrast enlargement of low-light images by Li et al. [22], and for a surface roughness detection system by Tian et al. [23], are discussed in detail with performance metric improvements.
Logarithmic image processing models are designed by Zhao et al. [24] for medical image enhancement, which provide a solution to linear enhancement models using the unsharp masking algorithm. The logarithmic image processing (LIP), generalized LIP, and parameterized LIP models are analyzed for different images to improve edge enhancement and reduce noise sensitivity. Khoukhi et al. [25] present a combination of Matlab-based and HDL modeling for image contrast and reshaping. The work analyzes the performance metrics and execution time, but this type of modeling is not suitable for large multimedia applications. A fuzzy rule-based image contrast enhancement model is designed by Liviu et al. [26] for real-time applications on a hardware platform; it reduces computational complexity for low-contrast images, illustrated with heavy traffic conditions, but is not suitable for low-cost FPGA implementations. Medical image enhancement using a hardware-based approach is presented by Chiuchisan [27], which offers edge detection, contrast enhancement, brightness adjustment, and sharpening operations in a single module; real-time reconfiguration is complicated with this approach.
Aranda et al. [28] describe median filter modeling with error detection for image processing applications. This model offers error detection capabilities during fault injection into the system. The pixel-level and image-level average error detection reports are analyzed with redundancy-based approaches, showing noise ratio improvements. Kumar et al. [29] present hardware modeling of a median filter with a window size of 5x5, which offers low-latency execution compared to existing approaches. This model mainly includes line buffers for sliding-window creation and a sorter model for median computation. It is designed using System Generator with a co-simulation approach for image visualization. Cadenas et al. [30] present a pipelined median approach for noise reduction using a slice encoding technique; this method is simple but not suitable for multimedia applications. Tidala et al. [31] present an AXI-protocol interface for network-on-chip design on an FPGA platform, which offers fast on-chip communication for multiple master and slave networks and also resolves real-time routing issues.
Research gaps: The quality of an electronically generated digital image can be improved by enhancement, and numerous mechanisms have been presented for image enhancement on field-programmable gate arrays (FPGAs). Incorporating the Advanced Microcontroller Bus Architecture (AMBA)-based Advanced eXtensible Interface (AXI) would provide a better interface for communication between the memory and the medical image given as input for enhancement, reducing the effect of heterogeneity on the MPSoC. The significant research problems of interface interconnection are as follows: i) most existing interface modules use bus-based architectures with shared bus connections, which face scalability and reliability problems; ii) there is scarce work with AMBA-based protocols such as AHB, APB, and ASB, and even with the AXI-4 protocol very little work has been carried out, with many constraint issues; and iii) a complete AXI-4 interface protocol with a high-speed architecture and optimized constraints is yet to come. The research problems for real-time image processing are as follows: i) most current image processing algorithms and interface protocols are designed individually and cannot process and retrieve images at high speed; ii) most interface protocols for image processing applications are available as application-specific IP cores, which cannot be reused for other image processing applications; and iii) existing systems fail to provide a standalone hardware architecture for image processing applications on a single substrate.

Foundation
This section describes the theoretical foundation of the proposed imaging chip model, which mainly includes the image enhancement algorithms and the AXI interface module.

Image enhancement algorithms
Over the past decade, the demand for low-power, low-cost digital imaging systems such as digital cameras, PC cameras, and video camcorders has grown. The CMOS camera, or CMOS image sensor, is used to capture images, but it provides low-quality images. Enhancement algorithms are crucial in an image processing system, as they give the final definition of an image. Enhancement techniques are used to improve the visual appearance of the image in digital signal processing. The edges in an image are essential, as they carry the exact information for human eye perception; the aim is to detect the pixel points in an image at which the image brightness changes sharply. Processing broad data for edge detection is difficult, which affects the speed of operation, and the location of intricate edges in standard edge detection is not accurate. The Sobel enhancement operation is used to solve this edge-localization issue, but in software it is challenging to meet real-time requirements. Processing high-speed images in real time is a challenging task that needs a high-speed interface protocol interconnecting with the external world. The proposed work considers a few of the image enhancement algorithms for the imaging chip model, as explained below.

A. Brightness control operation
Brightness control adjusts brightness by adding or subtracting a constant value 'C' to each pixel of the input image 'IP(i)' and stores the result in 'OP(i)' after the brightness control operation, where 'i' indexes the input or output image pixels. Equation (1) shows the brightness control operation as follows:

OP(i) = IP(i) ± C (1)

The constant values lie within the range (0 to 255), and if an output pixel value exceeds 255, information in the object image will be lost. Brightness control supports processing dark or low-light images by adjusting the constant value.
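The operation above can be sketched behaviourally in Python (a software model of the hardware datapath, with saturation at the 0 and 255 bounds):

```python
def brightness(pixels, c):
    """Add the constant c to every pixel, clamping the result to 0..255."""
    return [min(255, max(0, p + c)) for p in pixels]
```

With c = 10, a pixel value of 250 saturates to 255 rather than wrapping, which models the comparison against 255 performed in the hardware.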

B. Contrast stretching operation
Contrast stretching is used to strengthen the intensity range of the object image. Before the normalization process, the lower and upper target intensity values 'a' and 'b' are initialized, respectively. The stretching operation uses the lowest 'c' and highest 'd' pixel values of the current image. Equation (2) shows the contrast stretching operation 'OP(i)' as follows:

OP(i) = (IP(i) - c) × ((b - a) / (d - c)) + a (2)
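A minimal behavioural sketch of equation (2), assuming the full 0 to 255 target range for 'a' and 'b' and integer arithmetic as a hardware-friendly approximation:

```python
def contrast_stretch(pixels, a=0, b=255):
    """Map the current intensity range [c, d] onto the target range [a, b]."""
    c, d = min(pixels), max(pixels)
    if d == c:                      # flat image: nothing to stretch
        return list(pixels)
    return [(p - c) * (b - a) // (d - c) + a for p in pixels]
```

An image with pixels in 50..150 is stretched so that 50 maps to 0 and 150 maps to 255.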

C. Negative-Image transformation
The negative-image transformation operates on gray values in the range (0, T-1). It inverts intensities so that bright input pixels become dark output pixels, generating the negative of the original image from black to white and vice versa. The transformation of the negative image is formed using equation (3) as follows:

OP(i) = (T - 1) - IP(i) (3)

where 'T' represents the number of gray values in the image.
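Equation (3) as a one-line behavioural sketch, assuming T = 256 gray levels for an 8-bit image:

```python
def negative(pixels, t=256):
    """Negative-image transform: OP(i) = (T - 1) - IP(i)."""
    return [(t - 1) - p for p in pixels]
```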

D. Gamma correction
Gamma correction is one of the conventional gray-level transformations; it is also known as the power-law transformation, which includes the n-th power and n-th root transformations, expressed in equation (4) as follows:

OP = c × r^γ (4)

where 'c' and 'r' are positive constant and input intensity values, respectively, and the gamma (γ) value is varied to enhance images on different display and monitor systems.
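A behavioural sketch of the power-law transform on normalised 8-bit intensities (the normalisation to the 0..1 range before exponentiation is an assumption; the hardware datapath instead uses a multiply and right shift):

```python
def gamma_correct(pixels, gamma=2.0, c=1.0):
    """Power-law transform OP = c * r^gamma, applied on normalised
    intensities and mapped back to the 0..255 range."""
    return [min(255, round(255 * c * (p / 255) ** gamma)) for p in pixels]
```

With γ = 2 a mid-gray pixel darkens; with γ = 0.5 it would brighten, matching the n-th power and n-th root cases of equation (4).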

E. Mean filter
The mean filter is one of the conventional methods for filtering unwanted impulse noise from images. It replaces the center pixel (cp) value in the window matrix with the average (mean) of all the pixels in the window. Smoothing and noise reduction are achieved by reducing the intensity variation from one pixel to the next. A convolution matrix is used to calculate the mean by sampling the neighborhood pixels. In this design, the 3x3 window matrix is used, and the mean is computed as in (5):

OP(cp) = (1/9) × Σ IP(k), for the nine pixels k of the 3x3 window (5)
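A behavioural sketch of the 3x3 mean filter over interior pixels (border handling is left unchanged here as a simplifying assumption; the hardware obtains the window from the preprocessing module instead):

```python
def mean_filter_3x3(img):
    """Replace each interior pixel with the integer mean of its 3x3 neighbourhood."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]          # borders copied through unchanged
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            s = sum(img[y + dy][x + dx]
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = s // 9             # equation (5): mean of nine pixels
    return out
```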

F. Median Filter
The median filter is widely used for the removal of 'salt and pepper' noise. It is mainly used for image denoising in high-speed image acquisition and video capture with high-quality cameras. Sorting methods such as bubble sort are used to sort the pixels within the window for filtering the image. Conventional median filtering methods are slow in real-time image processing and difficult to use on dedicated hardware platforms. In this work, the median filter takes the adaptive form, and a 3x3 window size is used for processing the noisy input.
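The core median step can be sketched with a plain sort (the hardware uses a comparator network rather than a software sort; this is only a behavioural model):

```python
def median_3x3(window):
    """Median of a flattened 3x3 window: the fifth of the nine sorted values."""
    return sorted(window)[4]
```

For a window containing salt (255) and pepper (0) impulses, the median ignores the outliers and returns a representative neighbourhood value.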

G. Edge Detection
Edge detection is performed using the Sobel and Prewitt operators; these are discrete differentiation operators used to calculate the gradient of the image function. The Sobel and Prewitt operators convolve the neighborhood image pixels with filter coefficient values in the horizontal and vertical directions.
The Sobel operator uses two gradients, the horizontal 'Gx' and vertical 'Gy' derivative approximations, over a 3x3 sliding window, as shown in equation (6):

Gx = [-1 0 1; -2 0 2; -1 0 1], Gy = [-1 -2 -1; 0 0 0; 1 2 1] (6)

Similarly, for Prewitt edge detection, 'Gx' and 'Gy' are represented in equation (7):

Gx = [-1 0 1; -1 0 1; -1 0 1], Gy = [-1 -1 -1; 0 0 0; 1 1 1] (7)

The magnitude of the gradient for both Sobel and Prewitt edge detection is calculated by combining 'Gx' and 'Gy', as represented in equation (8):

|G| = √(Gx² + Gy²) (8)
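Equations (6) through (8) can be sketched behaviourally as follows (kernel sign conventions vary between references; the ones below match the standard Sobel and Prewitt definitions):

```python
# Horizontal and vertical derivative kernels (equations 6 and 7)
SOBEL_GX   = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_GY   = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
PREWITT_GX = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]
PREWITT_GY = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]

def gradient_magnitude(window, kx, ky):
    """|G| = sqrt(Gx^2 + Gy^2) over one 3x3 window (equation 8)."""
    gx = sum(window[r][c] * kx[r][c] for r in range(3) for c in range(3))
    gy = sum(window[r][c] * ky[r][c] for r in range(3) for c in range(3))
    return (gx * gx + gy * gy) ** 0.5
```

A vertical step edge (dark on the left, bright on the right) yields a large Gx and zero Gy, so the magnitude responds to the edge.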

H. Image Thresholding
Thresholding is a segmentation method used to create binary images from gray levels. If an input pixel is below the threshold value (T), the output pixel is replaced with the minimum value '0' (black); if it is above the threshold (T), the output pixel is replaced with the maximum value '255' (white).
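A one-line behavioural sketch of the rule above (the treatment of pixels exactly equal to T is an assumption, as the text leaves that case unspecified):

```python
def threshold(pixels, t):
    """Binarise: pixels below t map to 0 (black), pixels at or above t to 255 (white)."""
    return [255 if p >= t else 0 for p in pixels]
```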

I. Histogram of the image
The histogram is used to increase or decrease the contrast of the image. Histogram equalization is performed using cumulative distribution function (CDF)-based gray levels and stretches the histogram of the input image across the range 0 to 255.
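The CDF-based remapping can be sketched as follows; the particular normalisation (subtracting the smallest non-zero CDF value) is the common textbook formulation and an assumption about this design:

```python
def equalize(pixels, levels=256):
    """Histogram equalisation: remap gray levels through the CDF so the
    histogram stretches across 0..levels-1."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, total = [], 0
    for h in hist:                       # running sum builds the CDF
        total += h
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:                     # constant image: nothing to equalise
        return list(pixels)
    return [round((cdf[p] - cdf_min) * (levels - 1) / (n - cdf_min))
            for p in pixels]
```

A low-contrast image with values clustered in 100..200 is spread across the full 0..255 range.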

Interface Protocols
The performance of an MPSoC is determined by the on-chip buses interconnecting the multiple cores. Traditional buses can connect one master to one slave at a time, which limits the performance of the whole MPSoC system. Many interconnection bus protocols, such as Wishbone, AHB, and AXI, provide interface solutions for MPSoC systems, and FPGA and ASIC prototyping can improve execution time and reduce the complexity of the whole system. In the present work, the high-speed AXI interface bus is used for communication between the master modules and the slave module. The AXI bus supports 16 master and 16 slave transactions at a time.

Imaging Chip Model
The imaging chip model has three sub-modules: the master module, the interface module, and the slave module. The overview of the imaging chip using the AXI-4 interface is represented in fig. 1. The input and output image sizes are 256x256. The detailed operation of the imaging chip sub-modules is elaborated in the sections below.

[Fig. 1: Overview of the imaging chip: input image, preprocessing module, image enhancement algorithms, register module, dual-port memory (slave module), and output image]

Master Module
The master module has an image preprocessing module for neighboring-pixel creation, followed by the image enhancement module (IEM) for the image processing operations, and stores the processed images temporarily in the register module. The imaging chip works on modes of operation for the image enhancement modules, tabulated in table 1; the theoretical operations are explained in section 3.1. The complete hardware module of the image enhancement algorithms is represented in fig. 2.
The first stage of the imaging chip is the preprocessing module. Image preprocessing receives the input image stored in a temporary ROM-1, followed by neighboring-pixel creation stored in ROM-2. The neighboring pixels are extracted using a 3x3 matrix generation operation and stored in ROM-3, whose outputs serve as the input to the image enhancement module. The image enhancement module (IEM) has 10 different enhancement operations [32] that work based on the modes of the imaging chip master module. The IEM operations are explained as follows.
When mode 0000 is active for the brightness control operation, the preprocessed output is added to the constant (C), the result is compared with 255, and it is stored in the register to generate the brightness-controlled image. When mode 0001 is active for the contrast stretching operation, the minimum and maximum values of the preprocessed output image are computed using a counter method, followed by the contrast operation using equation (2), and the results are stored in the register to generate the contrast-stretched image. When mode 0010 is active for the negative-image transformation, the preprocessed output image is subtracted from the number of gray levels 'N', and the results are stored in the register to generate the negative image.
When mode 0011 is active for the gamma correction operation, the preprocessed output is multiplied by itself (p × p), the result is multiplied by the constant (C) and right-shifted, and the pixel values are mapped into the range 0 to 255. The mapped values are stored in the register to generate the gamma-corrected image. When mode 0100 is active for the mean filtering operation, the preprocessed output values are stored in a single 3x3 window mask; the mean of all the 3x3 window pixel values is calculated and written into the center pixel, and the results are stored in the register to generate the mean-filtered image. Mode 0101 activates the adaptive median filter (AMF), one of the best filtering methods for suppressing unwanted impulse noise. The detailed diagram of the AMF module is represented in fig. 3. The AMF mainly has a 3x3 window module, a 3-input sorter, and an adaptive computational block followed by an error detection block. The 3x3 window operation module has 9 data flip-flops (DFFs) to form the 9 window values for median computation. The sorter module finds the median values using multiplexers and comparators. The maximum and minimum values are used in the error detection to find the error value. The adaptive computation block finds the principal median value using comparators and basic AND gates along with the center window pixel, and adopts the suitable valid output value as the filter result. The error detection block finds the erroneous pixel value using the median value along with the maximum and minimum values; this gives information about the corrupted and uncorrupted pixels of the filtered output image.
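The AMF datapath just described (sorter, adaptive selection, error detection) can be summarised with a small behavioural sketch. The strict min/max comparison used as the impulse test is an assumption about the detection rule; the hardware realises the same flow with comparators and AND gates:

```python
def adaptive_median_3x3(window):
    """Behavioural sketch of one AMF step on a flattened 3x3 window.

    The sorter supplies min, median, and max; the adaptive block keeps the
    centre pixel when it lies strictly between min and max (i.e. it is not
    an impulse) and substitutes the median otherwise. The error-detection
    flag marks the centre pixel as corrupted.
    """
    ordered = sorted(window)
    mn, median, mx = ordered[0], ordered[4], ordered[8]
    center = window[4]                        # centre of the 3x3 window
    corrupted = center <= mn or center >= mx  # assumed impulse-detection rule
    return (median if corrupted else center), corrupted
```

A salt impulse (255) at the centre is replaced by the neighbourhood median and flagged; an uncorrupted centre pixel passes through unchanged.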
When mode 0110 or 0111 is active for the Sobel or Prewitt operation, the preprocessed output data are used to calculate the gradients Gx and Gy, the Sobel or Prewitt operation is applied using equations (6) and (7) respectively, and the results are stored in the register to generate the edge-detection image. When mode 1000 is active for the image thresholding operation, the preprocessed output data are compared with the constant 'T': values below 'T' are replaced with zero, values above 'T' are replaced with 255, and the resulting 0 or 255 values are stored in the register to generate the thresholded image. When mode 1001 is active for the image histogram operation, the gray-level block counts the occurrences of the pixel values in the preprocessed output, the CDF block accumulates each value in the array with the previous one, and the multiplier multiplies the CDF array value with the previous gray value. Finally, the new gray values are mapped onto the number of pixels, and the mapped values are stored in the register to generate the histogram-equalized image of the input image.

AXI-Interface Module
The AXI-4 protocol is a high-speed interface bus protocol that corrects the drawbacks of existing bus interface protocols. The overview of the proposed high-speed AXI-4 interface is represented in fig. 4. The model includes the AXI-4 master module, the interconnect module, and the AXI-4 slave module; these are interconnected based on the AMBA AXI-4 inputs, outputs, and specifications. The AXI-4 master and slave modules are designed using FSMs with five channels: write address, write data, write response, read address, and read data. The proposed module provides a high-speed interface between the image enhancement module (main master) and the dual-port memory module (slave). AXI-4 supports quality-of-service signaling, extended burst-length support, write-response and cache-signal updates, and out-of-order requirements, and performs burst-based transactions. The address channel provides the control and address information for every transaction, indicating the type of data being transferred; the write data channel carries the data from the master to the slave module. The AXI-4 master module is initialized by the write address valid (AWVALID) and write address ready (AWREADY) signals: AWVALID is asserted and, when the slave module is ready, AWREADY follows; the master then issues a valid write address (AWADDR) to the slave for each transaction. The design uses the incrementing-address burst type (AWBURST) along with bufferable and cacheable (AWCACHE) transactions for address processing. The write data transaction is initiated in parallel with the write address: when the slave ready signal (WREADY) is asserted, the write valid signal (WVALID) is activated.
The write data (WDATA) channel carries the image data according to the address and performs write data transfers until the last transfer (WLAST). The write response valid (BVALID) from the slave, followed by the response ready signal (BREADY) from the master, is asserted and then deasserted as each data transaction completes. Finally, the write response channel sets the OKAY signal to indicate the completion of the data transaction.
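The write-channel sequence above can be sketched behaviourally. This is a minimal Python model, not the RTL: the dictionary standing in for the dual-port memory and the error response on an empty burst are assumptions.

```python
def axi_write_burst(mem, awaddr, wdata):
    """Behavioural model of an AXI-4 incrementing write burst (AWBURST = INCR).

    Each beat stores one data word at the next address; WLAST marks the
    final beat, after which the slave returns the OKAY write response.
    """
    wlast = False
    for beat, data in enumerate(wdata):
        mem[awaddr + beat] = data            # INCR burst: address advances per beat
        wlast = (beat == len(wdata) - 1)     # WLAST asserted on the final transfer
    return "OKAY" if wlast else "SLVERR"     # response on the write response channel
```

A three-beat burst starting at address 0x10 fills three consecutive memory locations and completes with OKAY.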
The AXI-4 slave module's read process is similar to the write address, data, and response transactions; the only differences are that read signals are used instead of write signals, the master signals act as slave signals and vice versa, and the read response is combined with the read data transaction in the slave module.
The slave module contains a dual-port memory with an 8-bit width; its depth depends on the image size. In the present work, 65536 memory locations are used. The memory module stores the image and connects to the external world.

Results and Performance Analysis

The AMF filter results for different noise levels are represented in fig. 6, and the performance of the AMF is analyzed in terms of PSNR and MSE, with the corresponding numeric values tabulated in table 2. Different noise levels of 1%, 5%, 10%, and 20% salt-and-pepper noise are applied to the input image, represented in fig. 6(a-d), and the corresponding adaptive-median-filtered outputs are represented in fig. 6(e-h), respectively. The filtered outputs are analyzed using the PSNR and MSE ratios, which quantify the effectiveness of the image quality. For input images corrupted by noise levels of 1%, 5%, 10%, and 20%, the corresponding PSNR (dB) obtained for the AMF is 36.73, 35.94, 35.12, and 33.34, respectively. Similarly, the mean square error (MSE) obtained is 13.79, 16.54, 19.97, and 30.12 for the AMF at noise levels of 1%, 5%, 10%, and 20%, respectively. The quality of the image remains good even at 20% noise corruption, with favorable PSNR and MSE values, which confirms the robustness of the AMF module. The input image and its histogram are represented in fig. 7. The imaging chip synthesis results are obtained after the place-and-route operation using the Xilinx ISE software. The imaging chip master module hardware metrics, including slice LUTs (chip area), maximum operating frequency (MHz), total power (W), and execution time (ms), are tabulated in table 3. The Sobel and Prewitt edge detectors consume more area than the other operations because of the edge detection computations. For the first five modes of operation, a maximum frequency is not reported because the preprocessed 3x3 window output serves as the input to these operations. The AMF works at a maximum frequency of 1384.08 MHz with a pipelined architecture. The total power, which includes both static and dynamic power, is obtained using the Xilinx XPower analyzer.
For the different image operations, the master module individually utilizes little power. The execution time is calculated from the simulation results by providing the image as input to the imaging chip test cases. It includes reading the 256x256 image and storing it in a temporary memory location, followed by the master module, interface module, and slave module output up to the last pixel. The imaging chip requires an average execution time of 1.31 ms, which is quite suitable for real-time image processing applications. The hardware metrics of the main imaging chip module, along with the IEM (master) module and AXI module, such as area, frequency, power (W), and memory (KB), are obtained after the place-and-route operation and tabulated in table 4. Table 4 shows the difference in hardware metrics between the image enhancement module with AXI (the imaging chip) and the IEM without the AXI interface. The imaging chip consumes less area (slice LUTs), operates at a higher frequency, and consumes less chip power than the IEM without the AXI interface.
The area utilization of the imaging chip in terms of slice registers and LUTs improves by less than 1%, and LUT-FF pairs improve by 44%, over the IEM without the AXI interface. The maximum frequency of the imaging chip is 3.2% higher than that of the IEM without the AXI interface, and its power consumption improves by 77%. The total memory usage of the imaging chip and the IEM is 1317284 KB and 1309644 KB, respectively. The individual synthesis results of the AXI-4 master-slave module are also incorporated in table 4. The imaging chip offers higher-performance computation than the IEM without the AXI interface, while the AXI interface module reduces the hardware complexity of the IEM. A comparison of real-time image processing applications with different interface protocols is tabulated in table 5. The present design works at a 140 MHz frequency and utilizes fewer DSPs and less BRAM chip area than recent existing image processing applications. The imaging chip operates at an execution time of 1.31 ms and 85.5 frames per second (FPS).

Conclusion
In this manuscript, an imaging chip model is designed using the AXI-4 interface protocol and implemented on an FPGA device. The imaging chip offers a high-speed communication interface for real-time image processing applications. It comprises an image enhancement module (IEM) master, the AXI interface protocol, and a memory (slave) module, and works on modes of operation for quickly selecting the image enhancement algorithm. The imaging chip is synthesized and implemented on an Artix-7 FPGA, and hardware metrics such as chip area, frequency, and power are encouraging for real-time MPSoC usage. The imaging chip provides better resource utilization than the IEM without the AXI interface: it works at 85.58 Mbps, compared with 82.8 Mbps for the IEM without the AXI interface. It also provides better resource utilization and processing time than recent existing real-time image processing applications. The imaging chip model provides scalability, robustness, and reconfigurability, which suit MPSoC applications. In the future, security features will be incorporated into the imaging chip to strengthen it against attacks.