Wavelet-based multifractal analysis and fractal signature of SARS-CoV-2 coronavirus variant genomes.

Summary: In this paper, the genomes of the SARS-CoV-2 coronavirus variants of concern and of interest are analyzed using the wavelet transform modulus maxima lines (WTMM) method. The goal is to track the monofractal behavior of the virus genomes and to investigate the long-range correlation (LRC) character through the estimation of the Hurst exponent. The obtained results demonstrate the multifractal and anti-correlated characters in the variants of concern for the Knucleotidic and GC DNA codings. The fractal signatures of the SARS-CoV-2 variants are investigated through the indicator matrix maps of the genomes. These maps exhibit the same patterns, at shifted positions, within the group (Alpha, Delta) and within the group (Eta, Iota, Kappa), while the Beta, Gamma and Epsilon variants have different indicator matrices. The fractal dimensions of the SARS-CoV-2 variants oscillate around 1.62, except for the Epsilon variant from the USA, whose fractal dimension is 1.70.

city of the virus but have increased its rate of transmission. In mutated viruses, extensive deletions of nucleotide sequences have been observed in some reading frames. Studies have shown that host proteins induce mutagenesis through interaction with viral proteins. The most important mutation in SARS-CoV-2 compared to the original Wuhan virus was the spike D614G mutation, and the B.1.1.7 lineage (20I/501Y.V1) became dominant and exhibited greater virus spread, but was not associated with higher viral loads or morbidity. However, it may affect the effectiveness of the vaccines and the mortality rate.
In this research, eight variants of the SARS-CoV-2 coronavirus are analyzed using the WTMM method, and the indicator matrix and the fractal dimension of these variants are calculated. We begin the paper by describing the wavelet transform modulus maxima lines and the indicator matrix methods.

2-The Wavelet Transform Modulus Maxima Lines
The wavelet transform modulus maxima lines (WTMM) method is a multifractal formalism revisited by the continuous wavelet transform; it was developed by Mallat and Hwang in 1992 and used for image processing. The continuous wavelet transform (CWT) is a decomposition of a given signal $S(t)$ into dilated and translated wavelets $\psi\left(\frac{t-b}{a}\right)$ obtained from a mother wavelet $\psi(t)$ that must have $n$ vanishing moments (Grossmann and Morlet, 1985):

$$CWT(a,b) = \frac{1}{a}\int_{-\infty}^{+\infty} S(t)\,\psi^{*}\left(\frac{t-b}{a}\right)dt$$

where $a \in \mathbb{R}^{+*}$ is the scale and $b \in \mathbb{R}$ is the translation.

The first step of the WTMM method is to calculate the CWT of the given signal and its modulus; the next step is to compute the maxima of the modulus. Local maxima are determined from the first and second derivatives of the wavelet coefficients. The modulus $|CWT(a,b)|$ admits a maximum at a point $b_0$ if it satisfies the following two conditions:

$$\left.\frac{\partial |CWT(a,b)|}{\partial b}\right|_{b=b_0} = 0, \qquad \left.\frac{\partial^{2} |CWT(a,b)|}{\partial b^{2}}\right|_{b=b_0} < 0$$

The partition function $Z(q,a)$ is the sum of the modulus of the CWT over the set $M$ of local maxima, raised to a power $q \in \mathbb{R}$:

$$Z(q,a) = \sum_{b_i \in M} |CWT(a,b_i)|^{q}$$

The spectrum of exponents $\tau(q)$ is related to the partition function $Z(q,a)$ by:

$$Z(q,a) \sim a^{\tau(q)} \quad \text{as } a \to 0$$

The spectrum of exponents $\tau(q)$ is obtained by a simple linear regression of $\log(Z(q,a))$ versus $\log(a)$. The spectrum of singularities $D(h)$ is the Legendre transform of the spectrum of exponents:

$$D(h) = \min_{q}\left(qh - \tau(q)\right)$$

For fractional Brownian motion signals, which have a monofractal character, the spectrum of exponents is linear in $q$ (Arneodo et al, 1998):

$$\tau(q) = qH - 1$$

So the Hurst exponent $H$ is obtained by a simple linear regression of $\log(Z(q,a))$ versus $\log(a)$, whose slope is $\tau(q) = qH - 1$.
The WTMM method can decide whether a given signal is monofractal or multifractal (Arneodo et al, 1998): if the spectrum of exponents is linear in q, the signal is monofractal; otherwise it is multifractal.
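The steps described in this section can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in the paper: the Mexican hat wavelet, the 1/a normalization, the scale range, the synthetic random walk and all function names (`cwt`, `modulus_maxima`, `partition_function`, `exponent_spectrum`) are choices made for the example. It also sums over the local maxima found at each scale and does not chain them into maxima lines, so the estimates for negative q should be read with caution.

```python
import numpy as np

def mexican_hat(t):
    # Assumed analyzing wavelet: second derivative of a Gaussian (2 vanishing moments).
    return (1.0 - t**2) * np.exp(-0.5 * t**2)

def cwt(signal, scales):
    """Return CWT(a, b) as an array of shape (len(scales), len(signal))."""
    n = len(signal)
    out = np.empty((len(scales), n))
    for i, a in enumerate(scales):
        half = int(min(5 * a, (n - 1) // 2))      # truncate the wavelet support
        t = np.arange(-half, half + 1) / a
        # The wavelet is real and symmetric, so convolution equals correlation.
        out[i] = np.convolve(signal, mexican_hat(t) / a, mode="same")
    return out

def modulus_maxima(row):
    """Values of |CWT(a, b)| at its local maxima in b, for one scale."""
    m = np.abs(row)
    is_max = (m[1:-1] > m[:-2]) & (m[1:-1] >= m[2:])
    return m[1:-1][is_max]

def partition_function(coeffs, qs):
    """Z(q, a): sum of |CWT|^q over the local maxima found at each scale."""
    z = np.empty((len(qs), coeffs.shape[0]))
    for j, row in enumerate(coeffs):
        maxima = modulus_maxima(row)
        for i, q in enumerate(qs):
            z[i, j] = np.sum(maxima ** q)
    return z

def exponent_spectrum(z, scales):
    """tau(q): slope of log Z(q, a) versus log a, one fit per q."""
    log_a = np.log(scales)
    return np.array([np.polyfit(log_a, np.log(zq), 1)[0] for zq in z])

# Synthetic stand-in for a genome signal: an uncorrelated +/-1 random walk.
rng = np.random.default_rng(0)
walk = np.cumsum(rng.choice([-1.0, 1.0], size=2048))

scales = 2.0 ** np.arange(2, 7)          # a = 4, 8, 16, 32, 64
qs = np.linspace(-2.0, 4.0, 13)
tau = exponent_spectrum(partition_function(cwt(walk, scales), qs), scales)

hurst = np.polyfit(qs, tau, 1)[0]        # slope of tau(q) = qH - 1 if monofractal
h = np.gradient(tau, qs)                 # h = d tau / d q
d_h = qs * h - tau                       # Legendre transform D(h)
print("estimated H:", hurst)
```

A roughly straight `tau` curve suggests monofractal behavior, in which case `hurst` is meaningful; a clearly curved `tau` suggests a multifractal signal, and `d_h` then gives a first look at the spectrum of singularities.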

3-The indicator matrix and the fractal dimension:
The DNA of each organism of a given species is a long sequence of a specific, large number of base pairs (bp). Each base pair is defined on the four-element alphabet of nucleotides (Cattani, 2010), where A is adenine, T thymine, C cytosine and G guanine:

$$\mathcal{A} = \{A, T, C, G\}$$

A DNA sequence is the finite symbolic sequence defined by:

$$S = \mathbb{N} \times \mathcal{A}$$

S is defined as:

$$S = \{x_h\}_{h=1,\dots,N}$$

where $x_h$ is the value of $x \in \mathcal{A}$ at the position $h$. The 2D indicator function, based on the 1D definition, is the map:

$$u : S \times S \to \{0, 1\}, \qquad u(x_h, x_k) = \begin{cases} 1 & \text{if } x_h = x_k \\ 0 & \text{if } x_h \neq x_k \end{cases}$$

The indicator matrix $A$ is defined as:

$$A = (u_{hk}), \qquad u_{hk} = u(x_h, x_k), \quad h, k = 1, \dots, N$$

$A$ is a square matrix of dimension $N \times N$. Table 1 shows an example of the construction of the indicator matrix.

Table 1: The indicator matrix components

From the indicator matrix we can get an idea of the "fractal-like" distribution of nucleotides.
The fractal dimension for the graphical representation of the indicator matrix plots can be computed as the average of the number $p(n)$ of "1" entries in the randomly taken $n \times n$ minors of the correlation (indicator) matrix $u_{hk}$ (Cattani, 2010; Ouadfeul, 2020):
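As an illustration of the construction above, the following Python sketch builds the indicator matrix of a short toy sequence and computes a fractal-dimension-like quantity from random minors. Several points are assumptions made for the example: the function names are hypothetical, a "minor" is taken to be a contiguous n x n block at a random position, and the estimator is the plain mean of log p(n) / log n over the minor sizes, since the exact normalization is not recoverable from this section and may differ from the expression of Cattani (2010).

```python
import numpy as np

def indicator_matrix(sequence):
    """u[h, k] = 1 when the nucleotides at positions h and k are equal, else 0."""
    x = np.frombuffer(sequence.upper().encode("ascii"), dtype=np.uint8)
    return (x[:, None] == x[None, :]).astype(np.uint8)

def fractal_dimension(u, rng):
    """Assumed estimator: mean of log p(n) / log n over random n x n blocks."""
    size = u.shape[0]
    ratios = []
    for n in range(2, size + 1):
        # Random top-left corner of an n x n block that fits inside u.
        h = rng.integers(0, size - n + 1)
        k = rng.integers(0, size - n + 1)
        p = int(u[h:h + n, k:k + n].sum())   # number of "1" entries in the block
        if p > 0:                            # log p is undefined for empty blocks
            ratios.append(np.log(p) / np.log(n))
    return float(np.mean(ratios)) if ratios else 0.0

u = indicator_matrix("ATGGTCA")   # toy sequence, not taken from any genome
print(u)
print(fractal_dimension(u, np.random.default_rng(0)))
```

The matrix is symmetric with ones on its diagonal by construction. For a full genome of about 30,000 bases the dense N x N array needs roughly 1 GB as `uint8`, so a real analysis would probably work on windows or compute block counts without materializing the whole matrix.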

4.1-Variants of Concern (VOC):
A SARS-CoV-2 variant that has been demonstrated to be associated with one or more of the following changes at a degree of global public health significance:

4.2-Variants of Interest (VOI)
A variant of interest is a SARS-CoV-2 variant:
-With genetic changes that are predicted or known to affect virus characteristics such as transmissibility, disease severity, immune escape, diagnostic or therapeutic escape; AND
-Identified to cause significant community transmission or multiple COVID-19 clusters, in multiple countries with increasing relative prevalence alongside increasing number of cases over time, or other apparent epidemiological impacts to suggest an emerging risk to global public health.
For more information about SARS-CoV-2 variants, we invite readers to visit the SIB Swiss Institute of Bioinformatics web site https://viralzone.expasy.org/9556.

5-Wavelet-based multifractal analysis of SARS-CoV-2 variants:
A sample DNA sequence of each variant of the SARS-CoV-2 virus is analyzed using the WTMM method; the goal is to check the monofractal behavior and to investigate the LRC character.  (2012); MF denotes the multifractal behavior. We observe that the LRC is dominant in the purine and pyrimidine DNA codings of the eight SARS-CoV-2 variants, while the anticorrelated and multifractal behaviors are observed in the variants of concern for the Knucleotidic DNA coding. This character is also observed in the Amino coding of the Gamma and Delta variants and in the GC coding of the Gamma variant. These results are compared with those of the genome analysis using the WTMM method shown in Table 4 of Ouadfeul (2020a).
We can see that the virus genomes lose the long-range correlation character and that their structures become more complex and less self-organized with time.
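To make the notion of a DNA coding concrete, the sketch below turns a nucleotide sequence into a numerical walk that a WTMM analysis could take as its input signal. It is a minimal illustration under assumptions: the +1/-1 step convention, the dictionary `CODINGS` and the function name `dna_walk` are hypothetical, and only the purine (A, G), GC (G, C) and amino (A, C) groupings are shown, because the definition of the Knucleotidic coding is not given in this section.

```python
import numpy as np

# Nucleotides that produce a +1 step under each coding; all others give -1.
CODINGS = {
    "purine": frozenset("AG"),   # purines versus pyrimidines (C, T)
    "gc": frozenset("GC"),       # strong (G, C) versus weak (A, T)
    "amino": frozenset("AC"),    # amino (A, C) versus keto (G, T)
}

def dna_walk(sequence, coding):
    """Cumulative sum of +1/-1 steps assigned to each base by the chosen coding."""
    members = CODINGS[coding]
    steps = np.array([1.0 if base in members else -1.0 for base in sequence.upper()])
    return np.cumsum(steps)

print(dna_walk("AGCT", "purine"))
print(dna_walk("AGCT", "gc"))
```

Under the usual reading of the Hurst exponent, a walk with H above 0.5 is long-range correlated and a walk with H below 0.5 is anticorrelated, which is the distinction drawn in the results above.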

6-Fractal Signature of some SARS-CoV-2 variants:
The same eight sequences of the Alpha, Beta, Gamma, Delta, Epsilon, Eta, Iota and Kappa variants are analyzed using the indicator matrix and the fractal dimension. Figure