Knowledge vector representation of three-dimensional convex polyhedrons and reconstruction of medical images using knowledge vector

Three-dimensional image construction and reconstruction play an important role in many real-world applications of computer vision. Over the last three decades, researchers have worked continually in this area because construction and reconstruction are important in medical imaging. Reconstruction of a 3D image allows us to find lesion information for a patient, which offers a new and accurate approach to diagnosis and adds clinical value. Considering this, a novel approach is proposed for the construction and reconstruction of images. First, a syntactic pattern recognition-based algorithm is implemented to extract features from the 2D image. The proposed algorithm takes an input image and extracts features that consist of direction codes and lengths; together these form the knowledge vector. In addition, a novel algorithm is developed that can rebuild an image from a knowledge vector. Reconstruction allows us to investigate the interior details of 3D images, such as the object's size, form, and structure. The performance of the proposed algorithms is evaluated on the Kaggle brain MRI dataset and on a Medical MRI dataset collected from Pentagram research institute, Hyderabad. According to the experimental analysis, the proposed method achieves 93.89% accuracy on the Kaggle brain MRI dataset and 97.24% on the Medical MRI dataset, which is better than existing state-of-the-art methods.


Introduction
Pattern recognition comprises two primary tasks: description and classification. A pattern recognition system generates a description of the object present in the image and then classifies the object according to that description. There are two approaches to implementing pattern recognition: statistical and structural. The two approaches use different methods for description and classification. The statistical approach [8,13,20] uses decision-theoretic concepts to discriminate between objects of different classes based on their quantitative features. The structural pattern recognition approach [12,15,26] is sometimes referred to as the syntactic pattern recognition approach because it uses the concept of syntactic grammars to discriminate between objects of different classes based on their shape, size, or structural features. Some researchers have combined the structural and statistical approaches. The statistical approach has been well explored by many researchers, but structural pattern recognition has not yet been explored adequately because it requires domain knowledge for description and classification. Structural pattern recognition techniques are therefore domain-specific [5,29-31].
In this paper, we propose a novel approach based on the syntactic approach, which represents three-dimensional convex polyhedrons in textual form. This textual representation consists of the direction and length codes of the objects present in the image and is referred to as the knowledge vector. The knowledge vector is then taken as input by a second novel algorithm that reconstructs the three-dimensional image from it. To our knowledge, this is the first time in the field of image processing that a three-dimensional image is reconstructed from textual information; this could be a significant contribution to structural pattern recognition and adds new value to the field of image processing. Because structural pattern recognition techniques are domain-specific, as discussed above, we considered only medical images in our research.

Motivation and objectives
The primary objective of this research is to develop a novel algorithm that converts an image into textual form using a syntactic pattern recognition technique, and the secondary objective is to reconstruct a three-dimensional image from the resulting knowledge vector. The motivation is to explore new possibilities of structural pattern recognition; this research can give a new direction for understanding three-dimensional images in textual form.
This paper is divided into 4 sections. The second section covers the motivation and objective of the research. Previous related work is discussed after the introduction.
Third, reconstruction fidelity even if the object consists of a heavily curved and concave structure, and the method can preserve the smoothness and cleanliness of the surface. This method is suitable for 3D printing of the images. Yuanhao Guo et al. [16] proposed a two-phase three-dimensional reconstruction method for light microscopy axis view images. In the first phase, an improved 3D volumetric representation is defined, and in the second phase, 3D reconstruction is achieved by searching for the optimal surface over the confidence mapping. The performance of the method is evaluated on three different datasets, and the approach can represent 3D shape precisely. This method can be used to determine the shape of objects. Lu Ding et al. [6] proposed a method for optoacoustic imaging systems that helps to find the shape and size of the image. The method significantly improves the spatial resolution and the contrast-to-noise ratio, and the quality of the reconstructed image is reported to be better than that of existing techniques. Bin Li et al. [24] proposed the 3D-ReConstnet network, which uses a residual network to extract features from the 2D image. If objects are occluded in the image, a Gaussian probability distribution is used for learning. The performance of the method is evaluated on the ShapeNet and Pix3D datasets, and the results are comparatively better. Volumetric representation is an important approach to three-dimensional reconstruction. The main goal of volumetric representation is to estimate a convex hull in three-dimensional space from each view [7,10,11,22]. In the same way, the space carving algorithm recovers the shape of 3D objects by removing the voxels that are not visible in a particular view [14,21]. However, these methods require proper segmentation, which is not always possible. Many well-designed three-dimensional reconstruction algorithms are available. Some algorithms are designed only for the reconstruction of peculiar and transparent objects [19]. These methods perform well on a macroscopic scale because they set up various lighting conditions while capturing the images in order to collect surface-related information [25,27]. In recent research [33], the authors presented a semantic reconstruction that combines a data term with regularization constraints and reported strong results on public datasets [34]. Nowadays researchers are focusing on deep learning-based methods for improving the matching quality [17,36] and for the reconstruction of 3D images, but these methods require a large amount of training and testing data, which is not available in our case.
After studying various papers on 3D reconstruction, we conclude that volumetric approaches give more promising results and properly address the challenges of 3D reconstruction [28,32]. It is also true that in some cases the shape of the 2D/3D image is not as accurate as expected. We therefore developed a novel approach based on 3D convex polyhedrons. A 3D image consists of voxels, and each voxel represents a cube, so a 3D image is an array of cubes. We consider a structuring element of size 3×3×3, which contains 27 neighborhood positions, and use it to process the 3D image. The whole 3D image is scanned using the structuring element, and an algorithm converts the image into a textual form that consists of direction codes and length codes, called the knowledge vector. This knowledge vector preserves the shape of the objects present in the image. The same knowledge vector is then used as input for the reconstruction of the original 3D image. The details of the proposed method are given in the following section.

Proposed methodology
The proposed method is designed for 2D and 3D images. The input could be a simple image that consists of several objects, or it could be a medical image, which is more complex and contains a lot of hidden information. Noise may be present in the image, so it is first removed using an adaptive median filter (Table 1). Once de-noising has been done, we apply the following strategy to all the input datasets (Fig. 1):
I. Knowledge acquisition
II. Feature extraction
III. Reconstruction of 3D images
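The de-noising step mentioned above can be illustrated with a minimal sketch of a textbook adaptive median filter on a single 2D slice. This is not the authors' implementation: the function name `adaptive_median_filter`, the `max_window` parameter, and the edge-replicating padding are assumptions made for illustration.

```python
import numpy as np

def adaptive_median_filter(image, max_window=7):
    """Textbook adaptive median filter for one 2D grayscale slice.

    Illustrative sketch only; `max_window` (odd, >= 3) is an assumed parameter.
    """
    image = np.asarray(image)
    pad = max_window // 2
    padded = np.pad(image, pad, mode="edge")  # replicate borders (assumption)
    out = image.copy()
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            z_xy = image[r, c]
            # Grow the window 3x3, 5x5, ... until the median is not an extreme.
            for window in range(3, max_window + 1, 2):
                half = window // 2
                region = padded[r + pad - half:r + pad + half + 1,
                                c + pad - half:c + pad + half + 1]
                z_min, z_med, z_max = region.min(), np.median(region), region.max()
                if z_min < z_med < z_max:
                    # Keep the pixel unless it is itself an extreme value.
                    keep = z_min < z_xy < z_max
                    out[r, c] = z_xy if keep else image.dtype.type(z_med)
                    break
            else:
                # Largest window reached: fall back to its median.
                out[r, c] = image.dtype.type(z_med)
    return out

# A flat slice with one impulse ("salt") pixel.
noisy = np.full((5, 5), 50, dtype=np.uint8)
noisy[2, 2] = 255
cleaned = adaptive_median_filter(noisy)

# A smooth ramp, which the filter should leave alone at interior pixels.
ramp = np.arange(25, dtype=np.uint8).reshape(5, 5)
ramp_filtered = adaptive_median_filter(ramp)
```

The window grows only where the local median is itself an extreme value, so isolated impulses are replaced while non-extreme pixels in smooth regions are kept.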

Knowledge acquisition
Knowledge acquisition is the step in which domain knowledge is acquired for extracting the features of a specific application or domain. It is the first step of structural pattern recognition-based methods. Generally, the structural approach uses the concept of syntactic grammars to discriminate between objects of different classes based on their shape, size, or structural features.

a. Syntactic Pattern recognition
Syntactic pattern recognition is a newer and less explored approach to pattern recognition that builds on concepts from formal language theory. The term is used synonymously with linguistic, grammatical, and structural pattern recognition. Formal language theory was originated by Noam Chomsky in the 1950s with the development of a mathematical model of grammar. The concepts are as follows. An alphabet is a finite set of symbols. A word is any finite string of symbols from the alphabet. For example, the valid words over the alphabet {a, b} are {a, b, ab, ba, bb, aa, ...}. A word with no symbols is the empty word, denoted by Λ. A language is a set of words over a finite alphabet. Every natural language follows some grammar rules. The grammar of a formal language is a four-tuple $G = \{P_N, P_T, P_r, R_s\}$, where $P_N$ is the set of variables or non-terminals, $P_T$ is the set of constants or terminals, $P_r$ is the set of production rules, and $R_s$ is the root. $P_N$ and $P_T$ are disjoint sets, and $R_s$ always belongs to either $P_N$ or $P_T$. The set of all words over an alphabet $P$, including the empty word, is denoted by $P^*$ and is also called the free monoid.
The language generated by the grammar $G$ is called $L(G)$. It must satisfy two conditions: each string consists only of terminals, and every string is derived from the root $R_s$ by applying suitable production rules. Production rules are expressions of the form X → Y, meaning that string X can be replaced by string Y.
The root $R_s$ can be changed according to the object position. The terminal symbols of $P_T$ represent the directions in which a neighboring pixel could be present. R, D, L, U, B, and F are elementary symbols, and DR, DL, UL, UR, BR, BDR, BD, BDL, BL, BUL, BU, BUR, FR, FDR, FD, FDL, FL, FUL, FU, FUR are composite symbols of the alphabet $P_T$. The set of production rules is generated through the normal algorithm introduced by A.A. Markov. The normal algorithm is used to recognize the angle of change in a particular direction while tracking the contour. This is done with the help of the look-ahead tracing (LAT) method.
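To make the two conditions on $L(G)$ concrete, the following minimal sketch derives a word from a root by applying productions of the form X → Y. The toy grammar (S → aSb | ab) and the helper names are assumptions chosen for illustration; they are not the grammar used in the proposed method.

```python
# Toy grammar, assumed purely for illustration: S -> a S b | a b
NON_TERMINALS = {"S"}
TERMINALS = {"a", "b"}
PRODUCTIONS = {"S": [["a", "S", "b"], ["a", "b"]]}

def apply_leftmost(form, alternative):
    """Rewrite the leftmost non-terminal in `form` with the chosen alternative."""
    for i, symbol in enumerate(form):
        if symbol in NON_TERMINALS:
            return form[:i] + PRODUCTIONS[symbol][alternative] + form[i + 1:]
    raise ValueError("no non-terminal left to rewrite")

def derive(root, alternatives):
    """Start from the root and apply one production per entry of `alternatives`."""
    form = [root]
    for alternative in alternatives:
        form = apply_leftmost(form, alternative)
    return form

def in_language(form):
    """A derived string belongs to L(G) only if it consists of terminals."""
    return all(symbol in TERMINALS for symbol in form)

# S -> aSb -> aaSbb -> aaabbb
word = derive("S", [0, 0, 1])
print("".join(word), in_language(word))
```

A sentential form that still contains a non-terminal, such as the result of `derive("S", [0])`, fails the first condition and is therefore not a word of the language.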
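The composite symbols can be read as combinations of the six elementary symbols. The sketch below builds a lookup table from each of the 26 direction symbols to a voxel offset. The axis conventions (R as +x, D as +y, B as the k+1 plane, F as the k-1 plane) are assumptions made for illustration, since the figure that fixes them is not reproduced here.

```python
# Elementary directions as (dx, dy, dz) offsets. Axis conventions are assumed.
UNIT = {
    "R": (1, 0, 0), "L": (-1, 0, 0),
    "D": (0, 1, 0), "U": (0, -1, 0),
    "B": (0, 0, 1), "F": (0, 0, -1),
}

# The 26 terminal symbols of P_T.
SYMBOLS = ("R DR D DL L UL U UR "
           "B BR BDR BD BDL BL BUL BU BUR "
           "F FR FDR FD FDL FL FUL FU FUR").split()

def offset(symbol):
    """Sum the unit offsets of the letters that make up a direction symbol."""
    dx = dy = dz = 0
    for letter in symbol:
        ux, uy, uz = UNIT[letter]
        dx, dy, dz = dx + ux, dy + uy, dz + uz
    return (dx, dy, dz)

OFFSETS = {symbol: offset(symbol) for symbol in SYMBOLS}

for symbol in ("R", "DR", "BDR", "FUL"):
    print(symbol, OFFSETS[symbol])
```

Under these conventions the 26 symbols map one-to-one onto the 26 neighbors of the central voxel in a 3×3×3 window.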
b. Picture Description Language (PDL)
Picture description language is an application of the semantic/linguistic concepts of pattern recognition. A pattern can be represented in the form of a string grammar, and this string grammar can be obtained by simple juxtaposition of strings. Juxtaposition means keeping two objects together without losing their identities. This can also be done with concatenation, but concatenation involves some spatial arrangement, and there is a possibility of losing the identity of, or some information about, the objects. Juxtaposition is easy and reliable because it only involves the head and tail positions. Based on this permissible form of juxtaposition, and because each primitive is abstracted as a directed line segment, the structures of PDL are directed graphs, and these structures can be handled by string grammars. Blank primitives must be used for generating disjoint structures. A null point primitive has an identical head and tail. The mechanics of PDL can be used to obtain the contour/wireframe of a pattern. To illustrate this mechanism, consider the following PDL grammar: Where P N {R s , P 1 , P
The result obtained by applying all the productions is
The above production system creates an octagon as represented in Fig. 2. The directions of the pixels are defined with the help of production rules. Given an alphabet A, the set A* can be constructed over A; it consists of all possible words over A, including the null word. Any subset of A* is a formal language. According to this theory, any digital image consists of a set of patterns, and the same digital image can be represented as a language that consists of a finite array of vertices. As mentioned, PDL is a language for representing a digital image as a regular array of vertices.
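Head-to-tail juxtaposition of directed line segments can be sketched as follows. The example chains eight in-plane primitives into a closed octagon; the primitive lengths, the starting point, and the function name `juxtapose` are assumptions, and the chain is an illustration rather than the production system of the paper.

```python
# In-plane direction primitives as (dx, dy) steps; y grows downward (assumed).
STEP = {
    "R": (1, 0), "DR": (1, 1), "D": (0, 1), "DL": (-1, 1),
    "L": (-1, 0), "UL": (-1, -1), "U": (0, -1), "UR": (1, -1),
}

def juxtapose(start, primitives):
    """Attach each directed segment's tail to the previous segment's head.

    `primitives` is a list of (direction symbol, length) pairs.
    Returns the list of vertices visited, starting with `start`.
    """
    x, y = start
    vertices = [start]
    for symbol, length in primitives:
        dx, dy = STEP[symbol]
        x, y = x + dx * length, y + dy * length
        vertices.append((x, y))
    return vertices

# Eight primitives of equal length, one per direction, close into an octagon.
octagon = [(symbol, 2) for symbol in ("R", "DR", "D", "DL", "L", "UL", "U", "UR")]
vertices = juxtapose((0, 0), octagon)
print(vertices)
```

Because only the head and tail of each primitive are involved, the contour is fully determined by the start point and the ordered list of (direction, length) pairs.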

c. Structuring elements of 3D images
The size and shape of a processed 2D/3D image always depend on the structuring elements chosen for processing the image. For a 2D image with a window size/neighborhood structure of 3×3, a maximum of 16 convex polygons can be present. For a 3D image with a window size/neighborhood structure of 3×3×3, a maximum of 256 convex polyhedrons can be present. In the same way, we can find the number of concave polygons and polyhedrons. These polygons and polyhedrons are called structuring elements and can be used for processing 2D and 3D images. The number of structuring elements always depends on the size of the neighborhood. This concept was introduced by Rajan (1990) and Jirawit Lerdsinmongkol (2008). Consider an image with dimensions (m × n × d), where m is the width of the image, n is the height of the image, and d is the depth of the dataset. The size of the structuring element is 3×3×3, so there are a total of 27 neighborhood positions including the central pixel. In the same way, we can have different structuring elements, which are represented in Fig. 3. In our research, we used a neighborhood structure of size 3×3×3, so we have 256 possible structuring elements. A detailed description of 3D convex polyhedrons is given below:

Three-dimensional rectangular convex polyhedrons
The idea of constructing 3D convex polyhedrons is the same as for 2D convex polygons. As discussed earlier, a 3D image consists of voxels, and each voxel represents a cube, so a 3D image is an array of cubes. We consider a structuring element of size 3×3×3, which contains 27 neighborhood positions. The smallest possible convex polyhedron consists of 7 neighborhoods. The labeling of the 3×3×3 array and the corresponding coordinates are shown in Fig. 4. Three planes are considered, where the k-th plane is the central plane and the (k-1) and (k+1) planes are the front and rear planes.
A 3×3×3 array contains a total of 8 corner pixels, and any one corner pixel may be absent, any two corner pixels may be absent, and so on. Therefore, there are 256 different possible ways of constructing convex polyhedrons with respect to the central pixel 14: $\sum_{k=0}^{8} \binom{8}{k} = 2^8 = 256$. These convex polyhedrons are divided into 9 categories:
- When all the corner pixels are present, the number of possible convex polyhedrons is $^8C_0$ = 1.
- When one of the corner pixels is not available, the number of possible convex polyhedrons is $^8C_1$ = 8.
- When two of the corner pixels are not available, the number of possible convex polyhedrons is $^8C_2$ = 28.
- When three of the corner pixels are not available, the number of possible convex polyhedrons is $^8C_3$ = 56.
- When four of the corner pixels are not available, the number of possible convex polyhedrons is $^8C_4$ = 70.
- When five of the corner pixels are not available, the number of possible convex polyhedrons is $^8C_5$ = 56.
- When six of the corner pixels are not available, the number of possible convex polyhedrons is $^8C_6$ = 28.
- When seven of the corner pixels are not available, the number of possible convex polyhedrons is $^8C_7$ = 8.
- When all eight corner pixels are not available, the number of possible convex polyhedrons is $^8C_8$ = 1.
All 256 possible convex polyhedrons are represented in Table 2, where A, B, C, D, E, F, G, H and I represent the possible eliminated pixels, and groups P, Q, R, S, T, U, V, W, and X represent the possible convex polyhedrons after removing the pixels.
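The labeling of the 3×3×3 structuring element can be sketched with a small array. The raster labeling order (plane, then row, then column, starting at 1) is an assumption, since Fig. 4 is not reproduced here; it is chosen because it places the central pixel at label 14.

```python
import numpy as np

# Labels 1..27 in raster order, indexed as labels[plane, row, column].
# Plane 0 is taken as the (k-1) plane, plane 1 as the central k-th plane,
# and plane 2 as the (k+1) plane (assumed ordering).
labels = np.arange(1, 28).reshape(3, 3, 3)

# The central pixel sits in the middle of the central plane.
center = int(labels[1, 1, 1])

# The 8 corner pixels lie at the extreme index along every axis.
corners = [int(labels[p, r, c]) for p in (0, 2) for r in (0, 2) for c in (0, 2)]

print("center:", center)
print("corners:", corners)
```

With this ordering, four corners fall in the (k-1) plane and four in the (k+1) plane, while the central plane contributes no corners.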
The visualization of 256 convex polyhedrons which are listed in the above table is shown in Fig. 5.
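The count of 256 can be checked by enumerating every subset of the 8 corner pixels and grouping the subsets by the number of corners removed. This is a small verification sketch; the corner indices 0..7 and the variable names are placeholders.

```python
from itertools import combinations
from math import comb

CORNERS = range(8)  # placeholder indices for the 8 corner pixels

# groups[k] lists every way of removing exactly k corner pixels.
groups = {k: list(combinations(CORNERS, k)) for k in range(9)}

counts = [len(groups[k]) for k in range(9)]
total = sum(counts)

print(counts)
print(total)
```

Each group size equals the binomial coefficient for removing k of the 8 corners, and the nine group sizes add up to $2^8$.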

Feature extraction
Extracting features from the image is an important step in object recognition. In the proposed method, we use picture description language to represent the features. The image is traced pixel by pixel, starting from pixel position (1,1,1), to find the initial foreground pixel $(x_i, y_i, z_i)$ with intensity value 1. This identifies the initial point of the first component present in the image. The proposed algorithm then finds the next neighbor of the current pixel among all the preferred directions. Preference is generally given to the right side of the current pixel. The 26 directions are represented in Fig. 2, and their naming conventions, beginning with Right (R), follow the terminal symbols of $P_T$. In this method, the previously recognized direction is given priority over the other preferred directions. The tracing of each pixel continues until the algorithm finds no connected neighbor pixel for the current position (x, y), and each pixel that has already been traced is removed. If there is no neighbor pixel, the algorithm outputs the knowledge vector of that component and continues until it has traced all the components present in the image. Once all the components are traced, the final knowledge vector is displayed. This knowledge vector is a feature vector that consists of the direction and length codes of each component, and it indicates which objects are present in the image. The detailed algorithm is explained in Appendix 1.
Fig. 5 Three dimensional visualizations of convex polyhedrons
A GUI is created to provide a better interface to the user. A sample image of the GUI is shown in Fig. 6.
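The tracing procedure described above can be sketched as a greedy walk that prefers the previous direction and merges consecutive equal directions into (direction, length) runs. This is a simplified illustration and not the algorithm of Appendix 1: the axis conventions, the in-memory form of the knowledge vector, and the function name `trace_components` are all assumptions.

```python
import numpy as np

# Direction symbols mapped to (dx, dy, dz) offsets (assumed conventions).
UNIT = {"R": (1, 0, 0), "L": (-1, 0, 0), "D": (0, 1, 0),
        "U": (0, -1, 0), "B": (0, 0, 1), "F": (0, 0, -1)}
SYMBOLS = ("R DR D DL L UL U UR B BR BDR BD BDL BL BUL BU BUR "
           "F FR FDR FD FDL FL FUL FU FUR").split()
OFFSETS = {s: tuple(sum(UNIT[ch][i] for ch in s) for i in range(3))
           for s in SYMBOLS}

def trace_components(volume):
    """Return a list of (start, runs) pairs, one per traced component.

    `volume` is indexed [z, y, x]; `runs` is a list of (symbol, length).
    """
    vol = np.array(volume, dtype=bool)  # working copy; traced voxels are erased
    depth, height, width = vol.shape
    vectors = []
    for z, y, x in np.argwhere(vol):  # raster scan for starting voxels
        if not vol[z, y, x]:
            continue  # already consumed by an earlier trace
        cx, cy, cz = int(x), int(y), int(z)
        start = (cx, cy, cz)
        vol[cz, cy, cx] = False
        runs, previous = [], None
        while True:
            # The previously recognised direction is tried first.
            order = ([previous] if previous else []) + \
                    [s for s in SYMBOLS if s != previous]
            for symbol in order:
                dx, dy, dz = OFFSETS[symbol]
                nx, ny, nz = cx + dx, cy + dy, cz + dz
                inside = 0 <= nx < width and 0 <= ny < height and 0 <= nz < depth
                if inside and vol[nz, ny, nx]:
                    break
            else:
                break  # no connected neighbour left: component finished
            vol[nz, ny, nx] = False  # remove the traced voxel
            if runs and runs[-1][0] == symbol:
                runs[-1] = (symbol, runs[-1][1] + 1)
            else:
                runs.append((symbol, 1))
            cx, cy, cz, previous = nx, ny, nz, symbol
        vectors.append((start, runs))
    return vectors

# An L-shaped stroke in a single plane: three steps right, then two steps down.
demo = np.zeros((1, 3, 4), dtype=np.uint8)
demo[0, 0, :] = 1
demo[0, 1:, 3] = 1
knowledge_vector = trace_components(demo)
print(knowledge_vector)
```

Because the walk is greedy, a branching component may be emitted as several separate traces in this sketch; how the full algorithm handles branches is not stated in this section.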

Reconstruction of original three-dimensional image using knowledge vector
In the previous step, the knowledge vector of the image, which holds the features of the image, is identified. In this step, the feature vector is taken as input, and a novel algorithm converts the knowledge vector back into the image. The reconstruction algorithm is explained in Appendix 2. To make the model easy to use, a graphical user interface (GUI) was built in which the text file can be chosen as input. Once the text file is chosen, the original image is shown on the screen. The same image is shown in Fig. 7.
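The reverse step can be sketched as replaying the direction and length codes from each starting voxel. This is an illustration under assumptions and not the algorithm of Appendix 2: the knowledge vector is taken to be an in-memory list of (start, runs) pairs rather than the text file used by the GUI, and the axis conventions are assumed.

```python
import numpy as np

# Direction symbols mapped to (dx, dy, dz) offsets (assumed conventions).
UNIT = {"R": (1, 0, 0), "L": (-1, 0, 0), "D": (0, 1, 0),
        "U": (0, -1, 0), "B": (0, 0, 1), "F": (0, 0, -1)}
SYMBOLS = ("R DR D DL L UL U UR B BR BDR BD BDL BL BUL BU BUR "
           "F FR FDR FD FDL FL FUL FU FUR").split()
OFFSETS = {s: tuple(sum(UNIT[ch][i] for ch in s) for i in range(3))
           for s in SYMBOLS}

def reconstruct(shape, vectors):
    """Rebuild a binary volume indexed [z, y, x] from (start, runs) pairs."""
    depth, height, width = shape
    vol = np.zeros(shape, dtype=np.uint8)

    def mark(x, y, z):
        if not (0 <= x < width and 0 <= y < height and 0 <= z < depth):
            raise ValueError(f"voxel {(x, y, z)} lies outside the volume")
        vol[z, y, x] = 1

    for (x, y, z), runs in vectors:
        mark(x, y, z)
        for symbol, length in runs:
            dx, dy, dz = OFFSETS[symbol]
            for _ in range(length):  # one voxel per unit of length
                x, y, z = x + dx, y + dy, z + dz
                mark(x, y, z)
    return vol

# Replay "3 steps right, then 2 steps down" from the origin.
vectors = [((0, 0, 0), [("R", 3), ("D", 2)])]
volume = reconstruct((1, 3, 4), vectors)
print(volume[0])
```

The explicit bounds check avoids silent wrap-around from negative indices when a knowledge vector does not fit the requested volume shape.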

Result analysis
Two algorithms are implemented for the proposed work, and both are implemented in Python using the Anaconda distribution. The experiments are run on a computer with 64 GB RAM, a 4 GB NVIDIA graphics card, an i7 processor, and the Windows operating system.
Fig. 6 GUI for creating a knowledge vector from image

Datasets
The proposed approach is evaluated on two datasets. The first dataset consists of five different folders. All five folders consist of medical images, and a detailed description is given in Table 3. This dataset was collected from Pentagram Research Centre Private Limited, Hyderabad. The second dataset is the 'Kaggle Brain MRI dataset', which is publicly available on Kaggle [3].

Evaluation parameters
To evaluate the performance of the proposed methodology, some standard image processing evaluation parameters are considered. The first parameter is accuracy, which measures how accurately the algorithm can find the feature vector of the image and, at the same time, how accurately it can reconstruct the 3D image from the knowledge vector.
The other common parameters for evaluating the quality of image reconstruction are peak signal-to-noise ratio (PSNR), mean squared error (MSE), and structural similarity index (SSIM). These parameters are based on a pixel-wise comparison between the original image and the reconstructed image. PSNR (based on the MSE metric) is the ratio between the maximum possible power of a signal and the power of the corrupting noise that affects the fidelity of its representation. The power of the corrupting noise is measured by MSE, and SSIM is computed from the mean and variance of the image intensities. PSNR is defined as $\mathrm{PSNR} = 10 \log_{10}\left(\mathrm{MAX}_I^2 / \mathrm{MSE}\right)$, where $\mathrm{MAX}_I$ is the maximum possible pixel value of the image.
Fig. 7 GUI for displaying an original image from knowledge vector
The proposed method can extract the features in such a way that the quality of the image is not affected during reconstruction (Table 5). Many researchers are working on the reconstruction of 3D images because minimally invasive procedures are becoming popular (Fig. 8); they provide better accuracy and a shorter recovery time. In such procedures, the surgical navigation system plays an important and critical role. It helps doctors to focus on a particular position, edge, or outline of the image. However, most surgical navigation systems are based on two-dimensional medical images, and there has been very little development for three-dimensional medical images. Therefore, a platform is developed for the reconstruction of three-dimensional images. This system takes a series of 2D/3D medical images as input and first converts them into textual form; in the second stage, this textual information (the knowledge vector) is taken as input and the 3D image is reconstructed. It displays human organs on the computer screen, which is convenient for doctors diagnosing disease. The three-dimensional reconstruction allows doctors to focus on a particular point, find the nature of the lesion and its surrounding tissues, rotate the image, traverse the image from the inside, and zoom in and out. It thus helps doctors to diagnose diseases and their severity, improves accuracy, and decreases the diagnosis time. The result of the proposed method is evaluated on the basis of qualitative and quantitative analysis. PSNR and SSIM are calculated on all five datasets and are reported in Table 5. Most reconstruction algorithms take considerable time to reconstruct 3D images. Compared with the existing methods, the proposed method takes much less time to reconstruct 3D images, and the results are shown in Tables 6 and 7. The other evaluation parameter for the performance of the proposed algorithm is accuracy. According to the experimental analysis, the proposed method achieves 93.89% accuracy on the Kaggle brain MRI dataset and 97.24% on the Medical MRI dataset, which is better than existing state-of-the-art methods. The average execution time on the Medical dataset and the Kaggle brain MRI dataset is 1.12 sec and 4.02 sec, respectively, which is again lower than that of the existing methods.
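The pixel-wise quality measures used in this section can be computed directly with NumPy. The sketch below uses the standard definitions of MSE and PSNR; the function names and the default peak value of 255 (8-bit images) are assumptions, and it is not the evaluation code used for the reported tables.

```python
import numpy as np

def mse(original, reconstructed):
    """Mean squared error between two images of the same shape."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(reconstructed, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    return float(np.mean((a - b) ** 2))

def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(original, reconstructed)
    if error == 0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / error))

# A 4x4 image with one pixel off by 16 grey levels.
reference = np.zeros((4, 4), dtype=np.uint8)
degraded = reference.copy()
degraded[0, 0] = 16

print(mse(reference, degraded))
print(psnr(reference, degraded))
```

Converting to floating point before subtracting avoids the wrap-around that unsigned 8-bit arithmetic would otherwise introduce.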

Conclusion and future work
In this paper, we have presented a syntactic pattern recognition-based approach that represents an image in textual form, and a novel algorithm is proposed for the reconstruction of three-dimensional images. Reconstruction allows us to explore the internal details of 3D images, such as the size, shape, and structure of the object, which could take us one step ahead in the field of medical image processing. A GUI is set up so that the image can be seen correctly; it lets the image be rotated in all directions and zoomed in and out so that its inner details can be seen clearly. The performance of the proposed algorithms is evaluated on medical image datasets, and the algorithms perform well in real time. According to the experimental analysis, the proposed method achieves 93.89% accuracy on the Kaggle brain MRI dataset and 97.24% on the Medical MRI dataset, which is better than existing state-of-the-art methods. The approach also requires very few steps to reconstruct a 3D image, which distinguishes it from other existing approaches. The classification task of a structural pattern recognition system is difficult to implement because syntactic pattern recognition embodies the precise criteria that discriminate among groups, and these criteria are by their very nature domain- and application-specific. In the future, we will try to extend this work by implementing classification techniques for object recognition.

Fig. 1 Flow chart of the proposed methodology

Fig. 2 Pictorial representation of a Center Plane b Back Plane and c Front Plane with corresponding pixel directions

Table 1 Tabulation of medical images using adaptive median filter (columns: Original Image | Adaptive Median filter | Original Image | Adaptive Median filter)
In a production rule X → Y, X is a string in $P^+$ and Y is a string in $P^*$. In the proposed methodology, the set of terminals or constants is denoted by: $P_T = \{R, DR, D, DL, L, UL, U, UR, B, BR, BDR, BD, BDL, BL, BUL, BU, BUR, F, FR, FDR, FD, FDL, FL, FUL, FU, FUR\}$

Table 2 Representation of a possible combination of convex polyhedrons

Table 5 PSNR and SSIM value on Comparative analysis of accuracy and execution time

Table 6 Comparative analysis between proposed work and other state-of-the-art methods in terms of accuracy and execution time on the Kaggle Brain MRI dataset

Table 7 Proposed and current approaches comparison for accuracy and execution time on five medical datasets