A deep learning model was developed and evaluated through a set of experiments designed to test its effectiveness, based on a photographic database of patients with CGL. The convolutional neural network model was built and trained in Python 3 with supporting libraries, including NumPy v1.17.4 and TensorFlow v1.15. All experiments were run on a standard PC with an Intel Core i5-4210U processor and no GPU card.
This study was performed in accordance with the Declaration of Helsinki and was approved by the University Hospital Walter Cantídio Ethics Committee, Fortaleza, Ceara, Brazil (nº 5.364.464). All the patients and their families gave formal consent to participate in the study by signing the free informed consent form prior to their inclusion.
The dataset consists of two main categories (training and testing) and three subcategories containing photos of patients with CGL, individuals with malnutrition, and eutrophic individuals with an athletic build. A total of 337 images of individuals of different ages, both children and adults, were carefully selected from open-access internet databases and from photographic records stored in the medical records of a reference center for inherited lipodystrophies. To identify photographic records published on open-access platforms, a literature review was carried out in the Lilacs, PubMed and SciELO databases. Descriptors and their combinations, in Portuguese and English, were used with Boolean operators: “Congenital Generalized Lipodystrophy” OR “Berardinelli-Seip Syndrome” AND “physiopathological mechanisms” OR “phenotype” OR “clinical characteristics”; “Malnutrition” AND “physiopathological mechanisms” OR “phenotype” OR “clinical characteristics”.
There was no standardization for the acquisition or selection of the patients' photographic images. The clinical history of the 22 patients followed at the outpatient referral clinic, whose images were included in the analysis, was assessed through their medical records.
Several data augmentation methods were employed to artificially increase the size and quality of the dataset. This process helps in solving overfitting problems and enhances the model’s generalization ability during training.
For the data augmentation process, geometric transformation techniques were used: images were rotated and zoomed using angles arbitrarily chosen by the author. In total, eight transformations were applied. Six consisted of rotating the image by 45°, 90°, 180°, −90°, −50° or −45°; the other two consisted of zooming the image and rotating it by 18° or 114°. At the end of the process, a database with a total of 896 images was obtained.
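The eight transformations above can be sketched as follows. This is an assumed implementation (the original code is not published); the rotation angles come from the text, while the zoom factor of 1.3 is a hypothetical value chosen for illustration.

```python
import numpy as np
from scipy import ndimage

ROTATIONS = [45, 90, 180, -90, -50, -45]  # the six pure rotations (degrees)
ZOOM_ROTATIONS = [18, 114]                # the two zoom-then-rotate steps
ZOOM_FACTOR = 1.3                         # hypothetical; not stated in the text

def augment(image: np.ndarray) -> list:
    """Return the eight augmented variants of one grayscale image."""
    # Rotations keep the original frame size (reshape=False).
    variants = [ndimage.rotate(image, a, reshape=False) for a in ROTATIONS]
    # Zoom the image, then apply the two remaining rotations.
    zoomed = ndimage.zoom(image, ZOOM_FACTOR)
    variants += [ndimage.rotate(zoomed, a, reshape=False) for a in ZOOM_ROTATIONS]
    return variants
```

Applied to every original photograph, this pipeline produces eight additional variants per image.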
The architecture of the proposed CNN model consists of two major phases: feature extraction and classification (Fig. 1). Before the images are fed to the network, however, they must be transformed: converted to grayscale, resized and normalized. Each transformed image is then concatenated into a large matrix, which is sent to the first layer of the CNN. In this work, the feature extraction phase uses two convolutional layers with the ReLU function, to increase the non-linearity of the output, and two pooling layers with a 2x2 filter; each layer of the feature extraction phase receives as input the output of the previous layer. At the end of the first phase, with the model already trained, the classification of patients with or without CGL is made by the fully connected network; in this work, three hidden layers with 1024 artificial neurons each were used.
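A minimal tf.keras sketch of this preprocessing and architecture is given below. It is an interpretation, not the authors' code: the resize target (64×64), the convolutional filter counts (32 and 64) and the 3×3 kernel size are assumptions, since the text does not state them; the two 2x2 pooling layers, the three 1024-neuron hidden layers, the ReLU activations and the sigmoid output follow the description above.

```python
import numpy as np
import tensorflow as tf

IMG_SIZE = 64  # assumed resize target; not stated in the paper

def preprocess(images):
    """Grayscale-convert, resize and normalize, then stack into one matrix."""
    batch = []
    for img in images:
        if img.ndim == 3:                  # RGB -> grayscale
            img = img.mean(axis=2)
        resized = tf.image.resize(img[..., None], (IMG_SIZE, IMG_SIZE)).numpy()
        batch.append(resized / 255.0)      # normalize to [0, 1]
    return np.stack(batch)

def build_model():
    """Two conv+pool blocks, then three 1024-neuron hidden layers."""
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu",     # filter count assumed
                               input_shape=(IMG_SIZE, IMG_SIZE, 1)),
        tf.keras.layers.MaxPooling2D(2),                     # 2x2 pooling
        tf.keras.layers.Conv2D(64, 3, activation="relu"),    # filter count assumed
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(1024, activation="relu"),
        tf.keras.layers.Dense(1024, activation="relu"),
        tf.keras.layers.Dense(1024, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),      # CGL vs. non-CGL
    ])
```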
The hyperparameters used to configure the CNN are shown in Table 1. Note that the numbers of convolutional and hidden layers are much smaller than the number of neurons per layer, because the computational cost increases considerably with the number of layers. Another noteworthy hyperparameter is the activation function, which is present in all neurons except those of the output layer, which use the sigmoid function.
Table 1

Number of convolutional layers | 2
Filter size | 2x2
Number of pooling layers | 2
Learning rate | 0.001
Number of neurons per hidden layer | 1024
Dropout rate (fraction of neurons turned off) | 0.2
Activation function | ReLU
Maximum number of epochs | 600
Number of hidden layers | 5
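For reference, Table 1 can be collected into a single configuration mapping. The key names below are chosen for this example (they are not taken from the original code); the values are transcribed from the table.

```python
# Hyperparameters of the CNN, transcribed from Table 1.
HYPERPARAMETERS = {
    "num_conv_layers": 2,
    "filter_size": (2, 2),
    "num_pooling_layers": 2,
    "learning_rate": 0.001,
    "neurons_per_hidden_layer": 1024,
    "dropout_rate": 0.2,
    "activation": "relu",
    "max_epochs": 600,
    "num_hidden_layers": 5,
}
```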
For validation, the dataset was partitioned into four parts, keeping the same proportion of the three subcategories in each part. The 4-fold cross-validation technique was then applied, using 75% of the data (three parts) for training and 25% (one part) for testing. Four runs were performed, rotating the parts used for training and testing until each part had been used exactly once as test data (Fig. 2).
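This splitting scheme can be sketched with scikit-learn's stratified k-fold, which keeps the proportion of the three subcategories in each part as described above. The use of scikit-learn and the per-class counts below are assumptions for illustration; only the fold count (4) and the dataset size (896) come from the text.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

X = np.arange(896)                           # indices of the 896 images
y = np.repeat([0, 1, 2], [300, 300, 296])    # hypothetical subcategory labels

# 4 folds: each run trains on 3 parts (75%) and tests on 1 part (25%),
# and every part serves exactly once as the test set.
skf = StratifiedKFold(n_splits=4, shuffle=True, random_state=42)
for fold, (train_idx, test_idx) in enumerate(skf.split(X, y)):
    print(f"fold {fold}: train={len(train_idx)}, test={len(test_idx)}")
```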