Human Identification with their VOC distribution through CMS – SEN Model

Smell printing, or odor printing, is a novel biometric approach in which an object is characterized by its odor. Human body odor is one such biological trait and yields a comparatively low error rate of 15% among biometrics. Human odor printing is significant worldwide for screening at security checkpoints, searching for survivors under rubble, investigating criminals, and more. The Cogno-monitoring system (CMS) is a prototype that provides two essential processes, odor analysis and odor encoding, through the Sensing-Encoding-Notifying (SEN) model to give sensitivity and specificity scores among individuals. Human body odor can be interpreted as a combination of various volatile organic compounds (VOCs), which are recognized and classified in the encoding process. This article presents a detailed analysis of traditional detection methods, including bio-analysis of human body odor, in an experiment with 6 people. By applying principal component analysis along with a random forest classifier, the VOC distribution of the individuals is measured. This work classifies the VOCs of different individuals with 81.3% accuracy, which forms the basis for the identification of humans.


Introduction
Odor printing, or smell printing, has moved towards the development of olfactory detection systems that recognize the aroma of an individual, which differs from that of others and persists as a unique biological property. Research has been carried out over the last five decades on such systems to make a machine learn to identify human odor. They are widely known as Electronic Noses or E-Noses, and they assist in several applications such as agricultural inspection (Jansen et al. 2010), air quality index measurement (Zampolli et al. 2004; Prathyusha and Chakravarthy 2020), environmental monitoring (Cai et al. 2010), food quality assessment (Peris and Escuder-Gilabert 2009; Zhang et al. 2008), medical diagnosis (Kateb et al. 2009) and many more. E-Noses have extended their application to the detection and recognition of humans by their scent. Wongchoosuk et al. (2011) employed a collection of metal oxide sensors in a networked E-Nose to detect human odor gathered from the armpit, which is a primary source of odorants. To obtain and discriminate patterns, Principal Component Analysis (PCA) is applied together with a proposed scheme to correct the sensor drift. Likewise, an E-Nose (Wongchoosuk et al. 2009) has been designed to measure the sensitivity of each metal oxide sensor by using a voltage divider resistor to classify and identify volatile organic compounds from human armpit odor; it is controlled by in-house developed software that corrects humidity noise, using a compact USB data acquisition card together with PCA.
VOCs analyzed in sweat secreted from the surface of a patient's skin have been used to detect, diagnose and monitor disease through an e-nose (Voss et al. 2013). To distinguish the patient groups, PCA with discriminant function analysis is applied, achieving 87% accuracy. VOC sources also include breath samples, which are used to diagnose illness and to identify pathogens from culture through an e-nose (Gardner et al. 2000) that analyzes the biological oxidation of organic compounds, known as cell metabolism.
The bio-analysis of human odor and the traditional methods used for analyzing and recognizing VOCs are presented in the following sections, followed by the prototype of the Cogno-monitoring system. The odor analysis and encoding procedure is explained in the SEN model along with its algorithm, followed by the experimental setup of CMS and the results. Odor analysis is achieved by applying principal component analysis to draw the correlation between different features of VOCs (Odor Threshold Value, Odor Intensity, Odor Persistency, Hedonic Sensations, Odor Characterization). Odor encoding is carried out with a random forest classifier, which measures the VOC distribution of the individuals and classifies the VOCs of different individuals with 81.3% accuracy; this is used to identify individuals.
Bio-analysis of human body odor
Bio-analysis of human body odor comprises the complete study of the odor and provides a characterization of each chemical compound of the mixture that evaporates at room temperature. According to Pandey and Kim (2011a), the odor sources analyzed are categorized as sweat (skin odor), odor released from human excreta (e.g., urine) and odor released from the oral cavity (e.g., breath). Body odor also depends on various factors: the diet of an individual, pressure, immune status and genetics may affect the discharge of the complex group of non-volatile and volatile molecules by humans. Human body odor is secreted through the apocrine, eccrine and sebaceous glands and produces volatile organic compounds. These volatile organic compounds give human body odor its basic character, which brings out the uniqueness of an individual. Basic human body odor possesses low or moderate vapor pressure and is chemically stable. Epidemiological studies report that the body odor of an individual influences physiological features such as heart rate, blood pressure in certain scenarios, and respiratory rate. It also affects skin conductance and significantly impacts outcomes such as irritant symptoms, mood, and cognition (during task performance). The behavioral information of individuals can be used to infer their health and emotional status, as supported by many studies on human discrimination by odor. The odor is a mixture of the odorants or chemical compounds emitted from different parts of the body such as the armpit, feet, breath, and stomach. Body odor chemistry has revealed these facts, summarized in Table 1 ("Chemistry of Body Odours - Sweat, Halitosis, Flatulence and Cheesy Feet", Compound Interest, 2020; Pandey and Kim 2011b). Each scent is described in two forms: 'smells like' and 'feels like'.
Among the chemical compounds secreted from multiple body parts, those secreted from the armpit form a unique pattern that helps to identify and detect the individual in various scenarios. Of these 12 compounds, 8 are distinct and 4 are predominant, occurring over different parts of the body. Volatile sulphur compounds are detected by the human nose at a threshold of 0.00047 parts per million, whereas human scent is for the most part dominated by volatile organic compounds. The source of these VOCs is observed to be the axillae of individuals. Owing to environmental conditions and internal body conditions, an individual's body odor may change from time to time.
However, the diversification of odorants and their peculiar characteristics remains difficult to explain. According to the previous work of the authors (Kanakam et al. 2017), all odorants that are volatile in nature evaporate at room temperature and have molecular weights from 30 to 300 g/mol, depending on the distribution of molecules in a specific compound. It has been observed (Li 2014) that (E)-3-methyl-2-hexenoic acid is a highly ranked compound found in most individuals, and that the odorants are unsaturated acids, alcohols, aldehydes, ketones, steroids and carbonyls that form either straight or branched chains of C6-C11.

Analyzing and detecting techniques
Analytical techniques can be qualitative (the presence of a compound in the mixture is detected), quantitative (the quantity of each compound is determined) or structural (the distribution of the molecules within the mixture is determined). The fingerprint of human body odor can be determined through different qualitative analytical techniques, which identify chemical signatures of body odor from the unique information carried by its VOCs. The procedure begins with odor sampling to separate the VOCs in a given mixture. The probability of detecting compounds in a mixture relies on the sampling and pre-concentration processes of the particular technology (Li 2014). Even though a single analytical technique is enough to detect the molecules, a combination of techniques is used to cope with the unpredictable nature of VOCs in human body odor.

Solid Phase Micro-Extraction (SPME)
This technology, which prepares the sample without solvent, operates in a quick, efficient and adaptable manner. SPME uses a fiber coated with a liquid polymer, a solid sorbent, or a mix of both. Most studies used divinylbenzene (DVB)/Carboxen (CAR)/polydimethylsiloxane (PDMS) fibers, where DVB and CAR act as solid sorbents and PDMS as a liquid sorbent.
The compounds in each liquid or solid sample are then taken up by the respective coating during fiber extraction. The SPME fiber is then inserted into a chromatograph for further investigation. The technique has been applied in numerous domains, including flavor and scent detection, forensics, toxicology, and environmental and organic matrices.

Gas chromatography (GC)
Gas chromatography is an analytic technique for separating molecules that evaporate at room temperature. Such molecules, for example methanol, acetone and heptane in a mixture, are known as volatile organic compounds. The system consists of a column, a mobile phase and a stationary phase. Before analysis, the extracted sample is passed into the column through an inlet; the column temperature is kept uniform and should be 50°C above room temperature for efficient detection of the chemical compounds. Separation takes place between the mobile and stationary phases: the lighter and heavier molecules interact with the stationary phase and the mobile phase respectively and are carried quickly through the column. Once the separation is completed, a detector attached at the end of the column detects the compounds in the sample.

High-performance liquid chromatography (HPLC)
HPLC is a column-based analytic system operated at a high pressure of 40 MPa (megapascals), which is used to pump the mobile phase into a column whose stationary phase is coated with absorbent material. The column is filled with small absorbent particles, which give a large surface area and high resolution and make it easier for the molecules to interact.

Mass spectrometry
Mass spectrometry is an analytic technique in which molecules are converted to ions and separated according to their mass-to-charge ratio. It deals with the production, acceleration, separation and detection of ions, using an ionization chamber, a vacuum pump and an analyzer to calculate the mass-to-charge ratio. The pump passes the sample to the ionization chamber at a pressure of 1l torr, where the sample molecules are ionized either by a filament or by interaction with ammonia/methane ions and inert gases (argon/helium/xenon). These ions pass through a mass analyzer, where the mass-to-charge ratio is analyzed, and are then detected by a detector.
The entire qualitative analytic task depends primarily on extracting, analyzing and detecting the compounds in the mixture. Table 2 lists the techniques involved in these sub-tasks. For each analytical method, the compounds and their types are defined.
Currently, pattern recognition systems and sensor arrays that mimic the human senses are a potential advancement in both technical and commercial directions. Research is ongoing to develop technologies for detecting and recognizing the primary compounds in numerous odors and flavors. The established receptor systems, human olfaction and canine olfaction, comprise several recognition steps, including identifying, comparing and quantifying, together with data storage and retrieval. Moreover, hedonic perception is a characteristic of the human nose, since it depends on individual opinion. The instruments mentioned above have undergone substantial development but do not yet meet industrial needs. CMS is an artificial sensing system used for chemical fingerprinting or odor printing, in which specific chemicals are identified from a complex compound. These chemicals are treated as monomolecular compounds whose odor is either dominated by a strong principal component or is a mixed aroma produced by the combination of all the odors.
In their previous work (Li 2014; Kanakam et al. 2018), the authors described the traditional receptor systems that imitate the olfaction process in four distinct stages, namely pre-processing, feature extraction, classification and, finally, decision making, which are consolidated into the SEN methodology.
CMS is the base Electronic Nose (E-Nose) prototype, implemented as two essential units, Sensing-Encoding and Pattern Recognition Schemas, as shown in Fig. 1. It is a qualitative analytic system, based on the quantity of the compound, that includes sample extraction, analysis and detection of chemical compounds in human body odor. The VOCs from human body odor are sampled and injected through electric valves, where the analytical sample preparation is carried out. Any extraction technique involves two phases: the sample (a mixture of VOCs from human body odor) and the extractant (the analyte used for separating the compounds). The primary objectives of sample extraction are the isolation of the analyte of interest, the cleaning of the sample, and the pre-concentration of the analyte of interest. The mass flow controller resembles the mobile phase in GC: it holds the inert gas that carries the samples and is connected to the column, where a uniform temperature between 150°C and 300°C is maintained using an oven to isolate the compounds from the sample. The sensor chamber contains an array of sensors so that each isolated compound is distinguished by its respective sensor. The information related to the odor compounds is accumulated in a Data Acquisition (DAQ) card that acts as an analyzer. Because smell printing deals with organic compounds that are highly volatile, their distribution may vary. In the pattern recognition unit, feature extraction, classification and decision making are carried out as sub-tasks. The feature extraction deals with the process of qualitative analysis of compounds in a

Sensing-encoding-notifying (SEN) model
The SEN model furnishes a standardized approach for carrying out, step by step, the tasks described for the CMS. The central idea of the SEN model is to reproduce the overlapping sensitivities of traditional olfaction systems, which contain the olfactory epithelium and olfactory receptors, in terms of the response to chemical compounds after pre-processing, sensing and encoding; this response is acquired from the sensor chamber, and the information is analyzed and stored in the DAQ card. The SEN methodology includes three basic processes: sensing, encoding, and notifying. The SEN algorithm is the script executed step by step to perform these 3 primary functions: Sensing, in which analyte selection, sampling, filtering and preconditioning of the column are executed; Encoding, which deals with the distribution of ions and their recognition; and Notifying, which reports the results obtained. Extracted features are identified from their peak area or peak width, which directly indicates the concentration of the compound. Two parameters need to be measured after completion of the sub-tasks: the sensing phase yields sensitivity as its output, and the encoding phase yields specificity. The detailed procedure is explained in the experimental section.
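The two SEN outputs can be illustrated with the usual confusion-matrix definitions. The sketch below assumes that sensitivity means TP/(TP+FN) and specificity means TN/(TN+FP); the text does not state the formulas explicitly, and the function name and the example counts are hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    Assumed definitions (not given in the text):
      sensitivity = TP / (TP + FN)
      specificity = TN / (TN + FP)
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for one individual's compounds, for illustration only.
sens, spec = sensitivity_specificity(tp=12, fn=3, tn=60, fp=5)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Keeping the two scores separate mirrors the model, in which sensitivity is reported after the sensing phase and specificity after the encoding phase.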

Experimental discussion
Most investigations employ direct or indirect collection of human body odor samples (Cuzuel et al. 2017). Direct collection involves contact of cotton or gauze pads with the skin at a prime source of body odor, the armpit (Sabri and Alfred 2017), after continuous monitoring of volunteers over 5 to 7 consecutive days. During this period, volunteers are instructed to restrict ordinary activities such as the use of cosmetics and sexual activity so that a good-quality sample is obtained. Indirect sample collection gathers VOCs using a flow of air over a sorbent, and depends on the flow rate and on the design of the material used for collection. Because the compounds tend to evaporate at room temperature, either excessive or insufficient pumping leads to inappropriate results. The role of the sorbent is to separate the VOC sample into individual molecules without reacting with them to form new ones. Qualitative and quantitative measures are employed in selecting the sorbent or analyte used in these injection procedures. The Supercritical Fluid Extractor (SFE) was the most commonly used pretreatment process in several studies. The subsequent extraction and identification of compounds can be done through the Solid Phase Micro-Extraction (SPME) method, which uses either DVB/CAR or PDMS fibers (Curran et al. 2007). A Tenax/carbonograph thermal sorbent tube is employed in another study (Sabri and Alfred 2017) to extract VOCs collected on cotton pads by heating the sample up to 90°C. A Nalophan sampling bag is employed for collecting samples for headspace SPME, contact SPME, liquid-liquid extraction and other dynamic sorbent methods such as dynamic headspace sorbent tube sampling (DHS). Of all these approaches, SPME holds a strong position in providing fast and efficient sample extraction. However, lower amounts of VOCs are isolated more efficiently by DHS (Dormont et al. 2013; Zhang et al. 2005).

CMS Setup
The CMS is built on a GC setup with a silicon tube column 30 m long and 0.25 mm in diameter. Helium acts as the carrier gas for the sample mixture (mass flow controller); it serves as the mobile phase that transfers the sample through the column containing the stationary phase. The column is packed with silica gel to give a biosensor prototype for separating and identifying the VOCs in the mixture. The initial room temperature is 25°C, the vapor pressure is taken as 75 mm Hg, and the temperature in the column should be 50°C above room temperature. The collected sample is injected through a 5 V diaphragm pump to isolate the compounds from the mixture. The experiment was conducted with 6 individuals (4 males and 2 females, aged 20-40 years). The SEN model parameters (sensitivity and specificity) are calculated and shown graphically in Fig. 2a, b. The observed and analyzed compounds are categorized into classes depending on their chemical formula. From the synthetic data taken, 80 compounds are observed in total among the 6 individuals, of which 15 are identified as frequent in all the volunteers; the groups into which these compounds are categorized are given in Table 3: 5 aldehydes, 5 ketones, 3 non-carbon mixed compounds, 1 acid and 1 alcohol. It is observed that 14 compounds are group 1 (C2-C10) unsaturated fatty acids and one is a sulphur compound without carbon in its structure; these are common to all the individuals (Kanakam et al. 2017). The sensitivity of male 4 is calculated and shown in Fig. 2a, whereas the specificity ratio between male 1 and female 2 is shown in Fig. 2b, for all the common-class VOCs of the sample. The 80 compounds analyzed from the 6 individuals are listed in Table 4 (Distribution of VOC among individuals).
All the compounds are categorized into separate classes, of which 15 compounds (highlighted) are common to all the individuals. The presence of each compound in the respective body odor is tabulated.
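Finding the compounds shared by every volunteer amounts to a set intersection over a presence table. The following is a minimal sketch under that assumption; the individual labels follow the text, but the compound identifiers (`voc_01` and so on) are placeholders and are not taken from Table 4.

```python
def common_compounds(presence):
    """Return the compounds detected in every individual.

    `presence` maps an individual label to the set of compounds
    detected in that individual's odor sample.
    """
    sets = list(presence.values())
    if not sets:
        return set()
    return set.intersection(*sets)


# Placeholder presence table; identifiers are illustrative only.
presence = {
    "male1": {"voc_01", "voc_02", "voc_03"},
    "male2": {"voc_01", "voc_02", "voc_04"},
    "female1": {"voc_01", "voc_02", "voc_05"},
}

shared = common_compounds(presence)
print(sorted(shared))
```

Applied to the full table, the same intersection would yield the subset of compounds marked as common to all individuals.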

Results
According to the parameters specified by the authors, a synthetic dataset is used that contains a total of 80 compounds observed from 6 individuals (4 male and 2 female) for 5 parameters (Odor Threshold Value, Odor Intensity, Odor Persistency, Hedonic Sensations, Odor Characterization). Machine learning algorithms are employed to demonstrate the dissimilarity in body odor components, with 81.3% accuracy, which is used for the identification of individuals. From the observations made, the peak height of the compound is considered as one of the parameters.
Definition 3 Peak height (Pk. Ht.) is the ratio of vapor pressure to odor threshold. The peak height gives the altitude of the peak of the particular compound analyzed through the analyzer:
$$\mathrm{Pk.Ht.} = \frac{\mathrm{V.P.}}{\mathrm{O.Th.}}$$
where 'V.P.' is the vapor pressure of the compound at 25°C room temperature, at which the compound is in the evaporated state, and 'O.Th.' is the odor threshold or odor concentration of the compound. Generally, under constant temperature and pressure, the concentration of chemicals in the air is measured in milligrams, micrograms, nanograms or picograms, or in parts per million (ppm) or parts per billion (ppb), depending on the molecular weight of the respective compound.
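As a worked illustration of Definition 3, the sketch below computes the peak height as the ratio of vapor pressure to odor threshold and converts a threshold from ppm to ppb. The function names and the example values are hypothetical, and the units of the two inputs are assumed to be chosen consistently by the caller, since the definition does not fix them.

```python
def peak_height(vapor_pressure, odor_threshold):
    """Peak height as vapor pressure divided by odor threshold."""
    if odor_threshold <= 0:
        raise ValueError("odor threshold must be positive")
    return vapor_pressure / odor_threshold


def ppm_to_ppb(ppm):
    """Convert parts per million to parts per billion (1 ppm = 1000 ppb)."""
    return ppm * 1000.0


# Hypothetical values for illustration; not measurements from the study.
pk_ht = peak_height(vapor_pressure=10.0, odor_threshold=0.5)
print(pk_ht)

# The sulphur-compound threshold quoted earlier, 0.00047 ppm, expressed in ppb.
print(ppm_to_ppb(0.00047))
```

A compound with a high vapor pressure and a low odor threshold therefore gives a tall peak, which is why the peak height is useful as one of the parameters.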
Pattern analysis is adopted to identify the odor components of different individuals and to analyze the components common to humans, which serve for recognizing individuals under rubble. The dimensionality reduction of Principal Component Analysis (PCA) plays a critical role by statistically transforming correlated dimensions into a smaller set of uncorrelated dimensions. PCA is a powerful technique that lowers time and space complexity and stores the information in little memory through dimensionality reduction. In addition, PCA visualizes the different classes of compounds and their distribution clearly. The reduction loses almost no information: PC1 accounts for 57.3% of the variance and PC2 for 42.6%, so together they preserve 99.9% of the information. Figure 3 visualizes the PCA data according to the classes to which the compounds belong. The common and non-common compounds are separated and transformed to calculate the percentage of missing information. Both sets preserve 99.9% of the information, and the loss of data is negligible. Figure 4 shows the 2-D PCA scatter for common (a) and non-common (b) compounds.
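The projection onto two principal components and the share of variance each one retains can be sketched with a singular value decomposition. The example below is a minimal illustration on randomly generated data with the same shape as the dataset described (80 compounds, 5 parameters); it does not reproduce the reported 57.3% and 42.6% figures, and the function name `pca_2d` is an assumption of this sketch.

```python
import numpy as np


def pca_2d(X):
    """Project rows of X onto the first two principal components.

    Returns the 2-D scores and the explained-variance ratio of PC1 and PC2.
    """
    Xc = X - X.mean(axis=0)                      # centre each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)              # variance share per component
    scores = Xc @ Vt[:2].T                       # coordinates on PC1 and PC2
    return scores, explained[:2]


# Synthetic stand-in data: 80 compounds x 5 odor parameters driven by
# two latent factors plus a little noise (illustrative only).
rng = np.random.default_rng(0)
latent = rng.normal(size=(80, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.01 * rng.normal(size=(80, 5))

scores, ratio = pca_2d(X)
print(scores.shape, ratio, ratio.sum())
```

The features are only centred here, not scaled; standardizing each parameter first is a common alternative when the parameters have very different units.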
CMS uses a random forest classifier, a meta-estimator that fits several decision tree classifiers on the following parameters: hedonic tone, in which the compounds are categorized by how pleasant the odor feels (pleasant/neutral/unpleasant); organic group (group 1 (C2-C10)/group 2 (C11-C13)/group 3 (C14-C18)); and the class parameter, which sorts the compounds according to their chemical nature (acid/ester/alcohol/aldehyde/aromatic hydrocarbon/halogen/others/ketone/alkane). Based on the rankings made by the classifier, the classifier achieves a score of 81.3% accuracy. It is observed that 18.7% of the compounds present in human body odor match across individuals. These compounds will help to identify humans under rubble.
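A random forest of this kind can be sketched with scikit-learn's `RandomForestClassifier`. The example below is an illustration under stated assumptions, not the authors' implementation: the data are randomly generated, the labels are assumed to be the 6 individuals, and the hyperparameters (100 trees, a 25% held-out split) are choices made for this sketch only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 6 individuals, 5 odor parameters per sample.
rng = np.random.default_rng(42)
n_people, n_per_person, n_features = 6, 40, 5
centres = rng.normal(scale=2.0, size=(n_people, n_features))
X = np.vstack(
    [c + rng.normal(size=(n_per_person, n_features)) for c in centres]
)
y = np.repeat(np.arange(n_people), n_per_person)  # label = individual index

# Hold out a stratified quarter of the samples for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The forest averages the votes of many decision trees.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

The accuracy printed here depends entirely on the synthetic data and should not be compared with the 81.3% reported for the CMS dataset.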

Conclusion
Identification of humans is a major task for smell printing or odor printing, in which an individual is identified by the unique feature of body odor. CMS resembles a gas chromatograph setup for identifying an object that can be defined by its odor. Human odor is a versatile combination of different VOCs, and each VOC gives different values for each of the odor parameters defined by the authors (Odor Threshold Value, Odor Intensity, Odor Persistency, Hedonic Sensations, Odor Characterization). PCA is applied to the synthetic data to analyze the compounds statistically. Although the odor compounds possess different dimensionality, CMS achieves negligible loss of data, with 99.9% of the information preserved when the compounds are scattered over 2D space. The random forest classifier gives a prediction accuracy of 81.3%. CMS finds only an 18.7% match in the odorants of the individuals, which makes it difficult for culprits to forge a smell print, unlike other biometrics.
Author contribution All authors have contributed equally, and all authors have read and agreed to the published version of the manuscript.
Funding The author(s) received no specific funding for this study.