Hepatitis B Virus Covalently Closed Circular DNA Interacting HBx and HBc Proteins Global Conservancy Analysis and Related B or T lymphocyte Epitopes

Background: The hepatitis B virus (HBV) HBx and HBc proteins associate with covalently closed circular DNA (cccDNA), which is the main reason for intrahepatic viral persistence and a major reason why a cure for HBV has not yet been achieved. The aims of the present study were to generate HBV genotype-specific consensus sequences of HBx and HBc, to align all ten (A to J) consensus sequences to develop global consensus sequences of HBx and HBc, to analyze variable and conserved motifs, and to predict highly conserved B and T cell binding epitopes in the HBx and HBc proteins. Methods: A total of 237 HBx and 207 HBc sequences, belonging to all ten genotypes reported globally, were aligned in CLC Main Workbench to draw global consensus sequences, and phylogenetic analysis was performed. The locations of possible B cell epitopes were analyzed using the Immune Epitope Database (IEDB), while possible T cell epitopes were analyzed using the ProPred-I and ProPred in-silico prediction tools. Results: could be critical for peptide vaccine.


Introduction
Hepatitis B virus (HBV) is a partially double-stranded DNA virus belonging to the Hepadnaviridae family, which has exclusive tropism for hepatocytes (1). HBV infection causes acute hepatitis, which can lead to chronic hepatitis B (CHB), liver fibrosis, fulminant hepatic failure, cirrhosis, and hepatocellular carcinoma (HCC) (1). Globally, 257 million people are living with CHB, resulting in 887,000 deaths per year (2).
HBV has 10 major genotypes (A, B, C, D, E, F, G, H, I and J) with diverse geographical distribution (3). HBV genotype A is predominant in Northern Europe, Sub-Saharan and Western Africa, and India; genotype B is widespread in East Asia, China, Japan, Indonesia, Vietnam, Taiwan, Philippines, Greenland, Northern Canada and Alaska; genotype C is primarily observed in Southeast Asia, China, South Korea, Australia, Taiwan, Indonesia, Vietnam and Philippines; genotype D is common in Mediterranean countries, Africa, Europe, India, and Indonesia; genotype E is dominant in Western Africa; genotype F is primarily observed in Central and South America; genotype G is widespread in the United States, Germany, and France; genotype H is dominant in Central America; genotype I is prevalent in Laos and Vietnam; and genotype J has been reported as dominant in the Ryukyu Islands of Japan (3).
Despite the availability of anti-HBV vaccines, viral hepatitis remains a major public health problem (2). Reverse transcriptase inhibitors such as tenofovir, lamivudine and entecavir are commercially available for the treatment of HBV patients; however, these have to be taken for life (4).
Currently there is no cure for HBV (5). A true cure requires clearance of intra-hepatic nuclear covalently closed circular DNA (cccDNA), which is the key to virological persistence (5,6). During the HBV life cycle, the cccDNA associates with several cellular histone and non-histone proteins, including HBc and HBx, transcription factors, co-activators, and several epigenetic activators and repressors, which affect HBV transcription and epigenetic control (7-9).
HBx is a 154-amino acid protein that can stimulate viral replication several-fold and is essential for initiating and maintaining the HBV life cycle, host-virus interactions, and the development of HCC (9,10). In the absence of HBx, the cccDNA rapidly attains a silent state (closed conformation) and becomes transcriptionally inactive (6). Short interfering RNA (siRNA) mediated HBx gene silencing inhibited HBV gene expression and replication in HBV genotype B hydrodynamic injection mouse models (11). A recent study has shown that a cell-penetrating antibody targeting HBx significantly suppressed hepatitis B virus in mouse models or cells mimicking chronic HBV infection (12).
HBc is a 183 or 185 amino acid protein (the length depends on the strain) that facilitates formation of the nucleocapsid (NC), which is capable of incorporating pre-genomic RNA (pgRNA) and reverse transcriptase. It facilitates conversion of single-stranded (SS) DNA to relaxed circular (RC) DNA and generates the mature NC, which can subsequently be enveloped by envelope proteins and released extracellularly as an infectious virion (13-15). Both HBx and HBc are crucial for HBV replication. Targeting the cccDNA-binding viral HBc and HBx proteins could limit viral replication. Thus the aim of the present study was to design global consensus sequences of the HBx and HBc proteins and to analyze variable regions and highly conserved motifs that could offer potent target sites for the development of peptide vaccines, site-specific RNA interference and anti-HBV inhibitors. Furthermore, we investigated B or T cell epitopes in the HBx and HBc proteins using in-silico approaches, which might be capable of generating neutralizing antibodies against all A-J genotypes of HBV.

Materials and Methods
Retrieval of HBx and HBc sequences and development of global consensus sequences.
A total of 237 HBx and 207 HBc sequences belonging to all 10 major genotypes of HBV were randomly retrieved from the National Center for Biotechnology Information database (https://www.ncbi.nlm.nih.gov/nuccore/). These sequences were reported from all over the world, including USA, Canada, Brazil, Venezuela, Argentina, France, Germany, Netherlands, Mexico, Belgium, China, South Korea, Japan, Indonesia, Thailand, Vietnam, Turkey, Pakistan, India, South Africa, Ghana, Côte d'Ivoire, Nicaragua, Mauritius and Martinique. The HBx and HBc amino acid sequences were fed into CLC Main Workbench v. 8 (Qiagen GmbH, Hilden, Germany). Using the multiple sequence analysis feature of the software, we constructed a consensus sequence for each genotype (A-J). The consensus sequences of all ten genotypes thus constructed were subsequently aligned to obtain global consensus sequences of HBx and HBc.
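The two-level procedure described above (a consensus per genotype, then a global consensus across genotype consensuses) can be illustrated with a minimal Python sketch. This is not the CLC Main Workbench algorithm, whose internals are not described here; it assumes a simple per-column majority rule, marks columns that fall below a chosen agreement threshold with "X", and uses hypothetical function names and toy five-residue sequences rather than real HBV data.

```python
from collections import Counter


def consensus(aligned, threshold=0.5, ambiguous="X"):
    """Per-column majority consensus of equal-length aligned sequences.

    A column whose most common residue falls below `threshold`
    (a fraction between 0 and 1) is reported as `ambiguous`.
    The majority rule is an assumption made for illustration.
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    out = []
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        out.append(residue if count / len(column) >= threshold else ambiguous)
    return "".join(out)


def global_consensus(by_genotype, threshold=1.0):
    """Build one consensus per genotype, then a consensus of those."""
    per_genotype = {g: consensus(seqs) for g, seqs in by_genotype.items()}
    return per_genotype, consensus(list(per_genotype.values()), threshold)


# Toy, hypothetical alignments (not real HBx or HBc sequences).
toy = {
    "A": ["MAARL", "MAARL", "MAARV"],
    "B": ["MAARL", "MAARL"],
    "C": ["MATRL", "MATRL", "MAARL"],
}
per_genotype, overall = global_consensus(toy)
print(per_genotype)  # {'A': 'MAARL', 'B': 'MAARL', 'C': 'MATRL'}
print(overall)       # MAXRL
```

With a threshold of 1.0 at the global step, any position that differs between genotype consensuses is reported as "X", which mirrors how variable positions are labeled in the global consensus sequences.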

Peptide designing and phylogenetic analysis
The global consensus sequences were analyzed to find highly conserved regions and variable residues in different domains and motifs of the HBx and HBc proteins. Short stretches of amino acids from the highly conserved regions of the HBx and HBc proteins were selected from the global consensus sequence alignments; these peptides could be better targets for testing potential peptide-based vaccines and useful for designing inhibitory compounds. To draw phylogenetic trees, all 237 HBx and 207 HBc sequences were aligned in CLC Main Workbench v. 8 (Qiagen GmbH, Hilden, Germany) and subjected to the unweighted pair group method with arithmetic mean (UPGMA) with a bootstrap value of 100.
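To make the clustering step concrete, the following is a minimal Python sketch of UPGMA on aligned sequences. It is an illustration under stated assumptions, not the CLC Main Workbench implementation: the distance measure (p-distance, the fraction of differing positions) is assumed because the distance model is not stated here, bootstrapping is omitted, branch lengths are not computed, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(names, seqs):
    """Return the UPGMA tree topology as nested tuples of leaf names."""
    # Each cluster id maps to (subtree, number of leaves in the subtree).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    dist = {
        frozenset((i, j)): p_distance(seqs[i], seqs[j])
        for i, j in combinations(range(len(names)), 2)
    }
    next_id = len(names)
    while len(clusters) > 1:
        # Pick the closest pair; ties are broken by cluster ids.
        pair = min(dist, key=lambda p: (dist[p], sorted(p)))
        i, j = sorted(pair)
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # Distance from the merged cluster to every other cluster is the
        # size-weighted mean of the two old distances (the "arithmetic
        # mean" over all leaf pairs that gives UPGMA its name).
        for k in clusters:
            d_ik = dist.pop(frozenset((i, k)))
            d_jk = dist.pop(frozenset((j, k)))
            dist[frozenset((next_id, k))] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        del dist[pair]
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Toy, hypothetical aligned sequences (not real HBV data).
tree = upgma(["A1", "A2", "D1"], ["MAARL", "MAARV", "MTTKV"])
print(tree)  # ('D1', ('A1', 'A2'))
```

In this toy case the two similar sequences A1 and A2 are joined first and D1 attaches last, the same kind of grouping by relatedness that the phylogenetic trees display at full scale.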

B- and T-lymphocyte epitope prediction.
The locations of B and T cell epitopes were mapped in the global consensus sequences of HBx and HBc.

Results
Global consensus sequence of HBx and HBc proteins and analysis of highly conserved regions.
HBV covalently closed circular DNA (cccDNA) associates with cellular and viral DNA-binding proteins, including HBc (via direct binding) and HBx (through indirect binding), to form the viral episome that serves as the template for viral transcription and protein expression in HBV-infected hepatocyte nuclei (8, 9). HBx promotes the transcriptionally active state of the viral episome and enhances HBV replication several-fold, while HBc plays crucial roles in the structural organization of the episome by altering nucleosome numbers, leading to higher episome copy numbers. Targeting the cccDNA-binding viral proteins (HBc and HBx) could be an important antiviral approach to limit HBV replication.
We hypothesized that conserved residues of HBx and HBc might be important in developing novel anti-HBV agents and in designing peptide-based vaccines and site-specific inhibitors. Therefore, to analyze highly conserved domains of HBx and HBc, we constructed consensus sequences of all ten major genotypes of HBV reported globally, and aligned the genotypic consensus sequences to develop global consensus sequences of HBx and HBc, respectively. Figure 1 shows the alignment of consensus sequences of HBx from all ten HBV genotypes (A-J); the global consensus sequence is shown at the base. Different motifs and domains of HBx were analyzed for amino acid conservancy and variability (Fig. 1A). The graphs represent the percentage conservancy of HBx amino acids. Conserved residues are labeled by their symbols, whereas variable amino acids are denoted by the symbol "X". Stretches of highly conserved amino acids could serve as peptide vaccines (16,17). Therefore, small peptide fragments (Table 1) were deduced from highly conserved regions of the HBx global consensus sequence (Fig. 1A and B). A phylogenetic tree of 237 HBx sequences belonging to genotypes A-J, reported from different countries across the world, was constructed (Fig. 2). The sequences from various genotypes clustered together on the basis of evolutionary relatedness.
Similarly, we aligned the genotype consensus sequences of HBc (Fig. 3A) and analyzed various domains and motifs for conservancy or variability of amino acids. The graphs depict the percentage conservancy of HBc amino acids. Highly conserved residues are shown by their symbols, while variable regions are labeled by "X".
Short peptides were selected (Table 2) from highly conserved regions of the HBc global consensus sequence (Fig. 3A and B), which might offer potent target sites for the development of peptide vaccines or the design of site-specific anti-HBV inhibitors. A phylogenetic tree of 207 HBc sequences from 10 genotypes, from all over the world, was constructed (Fig. 4), which indicates the evolutionary relatedness between the sequences.
The global consensus sequences of the HBx and HBc proteins were analyzed in IEDB to predict B-cell epitopes that might produce neutralizing antibodies against all HBV genotypes. Each HBx-related B-cell epitope was given a distinctive name, from X-B1 to X-B7 (Table 3), while HBc-related B-cell epitopes were denoted C-B1 to C-B7 (Table 4). The predicted epitopes were subjected to epitope conservation analysis via the IEDB conservation analysis resource. The location, length and percentage conservancy of each epitope of HBx and HBc are presented in Tables 3 and 4, respectively. Among the HBx-related B-cell epitopes, X-B2 (DVLCLRP) and X-B4 (AGPCALR) were considered to be conserved in all ten genotypes (Table 3). Among the HBc-related B-cell epitopes, C-B6 (IRQLLWFHISCLTF) and C-B7 (VLEYLVSFGV) were considered highly conserved in all HBV genotypes (Table 4). We used the miPepBase facility to analyze whether these peptides trigger autoimmunity, and none of the peptides triggered an autoimmune response. The aforementioned epitopes, predicted from global consensus sequences, might be capable of generating strong neutralizing antibodies against all (A-J) genotypes of HBV.
T-lymphocyte MHC-I and MHC-II specific epitopes of HBx and HBc proteins and conservation analysis.
The locations of possible T-cell MHC-I epitopes were identified in the global consensus sequences of HBx and HBc through ProPred-I, for 47 MHC-I alleles. In total, 323 different HBx-related MHC-I epitopes were predicted, which were subsequently subjected to conservation analysis via IEDB epitope conservation analysis. We selected epitopes with 70-100% conservancy (Table 5). Among these epitopes, X-M2, X-M5, X-M8, X-M11, X-M12, X-M20, X-M22, X-M25, X-M27 and X-M32 were conserved in the HBV HBx consensus sequence and among all ten genotypes of HBV. Similarly, 182 different HBc-related MHC-I epitopes were predicted and subjected to epitope conservation analysis. Epitopes with 70-100% conservancy were selected and are presented in Table 6.
Table 5 (excerpt): X-M32*, AGPCALRFT, length 9, conservancy 100%.
To identify the possible locations of T-cell MHC-II epitopes in the global consensus sequences of HBx and HBc, we used the ProPred in-silico analysis facility, for 51 HLA-DR alleles. In total, 204 HBx-related MHC-II epitopes were predicted, which were subjected to IEDB epitope conservancy analysis. We selected epitopes having 70-100% conservancy, as shown in Table 7. The X-T4, X-T6 and X-T8 epitopes were conserved in the HBx consensus sequence and in all HBV genotypes. Similarly, 203 different HBc-related MHC-II epitopes were predicted and analyzed for epitope conservancy as described above. Epitopes with 70-100% conservancy are presented in Table 8. Among these, the C-T1-3, C-T9, C-T10, C-T12, C-T14 and C-T16-18 epitopes were identified to be 100% conserved in the HBc consensus sequence and among all (A-J) genotypes of HBV. Using miPepBase, we found that none of the peptides triggers an autoimmune response.
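The conservancy filter applied above can be sketched in a few lines of Python. This is a simplified stand-in for the IEDB conservancy analysis, not its implementation: it assumes conservancy means the percentage of sequences that contain the epitope as an exact substring, and the function names, toy sequences and toy peptides are all hypothetical.

```python
def conservancy(epitope, sequences):
    """Percentage of sequences containing the epitope as an exact match."""
    hits = sum(epitope in seq for seq in sequences)
    return 100.0 * hits / len(sequences)


def select_conserved(epitopes, sequences, minimum=70.0):
    """Keep epitopes whose conservancy is at least `minimum` percent."""
    scored = {e: conservancy(e, sequences) for e in epitopes}
    return {e: pct for e, pct in scored.items() if pct >= minimum}


# Toy, hypothetical sequences and peptides (not real HBV data).
toy_sequences = ["MAARLCCQLD", "MAARLCCQLD", "MAARVCCQLD", "MAARLCSQLD"]
toy_epitopes = ["MAAR", "ARLC", "RVCC"]
print(select_conserved(toy_epitopes, toy_sequences))
# {'MAAR': 100.0, 'ARLC': 75.0}
```

Here the 70% cut-off matches the 70-100% selection window used for the tables; exact substring matching corresponds to requiring full sequence identity, which is only one of the settings a conservancy tool may offer.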

Discussion
The HBx protein contains regulatory (or negative regulatory) and transactivation (or coactivation) domains (Fig. 1B). The regulatory domain (1 to 50 aa) is dispensable for HBx activity and represses HBx transactivation activity (18). Consensus sequence analysis of the regulatory domain shows that the region 1 M to 20 P is highly conserved, while the region 31 S to 40 P is variable, except for the 32 G and 35 G residues, which are highly conserved among all HBV genotypes.
The transactivation domain (51 to 154 aa) is essential for the augmentation effects on HBV transcription and replication (19). The region 52 H to 65 S is critical for the augmentation effect on HBV replication (19). Consensus analysis shows that this HLSLRGLPVCAFSS motif is highly conserved among all HBV genotypes. Deletion of 141 L to 154 A (the last 14 amino acids) of the transactivation domain did not affect the transactivation property (20). It has been demonstrated that the 132 F to 140 K and 137 C residues are crucial for transactivation by HBx (20). The consensus sequence analysis shows that this FVLGGCRHK motif and the 137 C residue are completely conserved among all HBV genotypes. It could be inferred that designing anti-HBV siRNA or inhibitors against this region might target HBx from all HBV genotypes. However, a natural mutant of HBx (HBxDelta127) without the FVLGGCRHK motif and 137 C has been reported from chronic hepatitis B, liver cirrhosis and HCC patients, and can induce growth and proliferation of hepatoma cells (21,22). Owing to the presence of a BH3-like motif ( 110 A to 135 G) in the C-terminal region, the HBx protein directly associates with anti-apoptotic Bcl-2 family proteins, induces elevated cytosolic calcium levels and promotes viral DNA replication (23-25). Consensus sequence analysis showed that the 111 Y, 113 K to 115 C, 117 F, 120 W to 122 E, and 132 F to 135 G residues were completely conserved in the BH3-like motif of HBx in all HBV genotypes. Among 13 XaaP motifs in HBx protein, 10 D 11 P, 19 R 20 P, 28 R 29 P, 45 ... (Fig. 3B).
The N-terminal domain (NTD) is critical and sufficient for capsid assembly (27,28). Consensus sequence analysis of the NTD shows that the region 1 M to 11 A is completely conserved in all HBV genotypes, except genotype G, which contains an additional 12 amino acid insertion, RTTLPYGLFGLD. This insertion has pleiotropic effects on core protein expression, HBV replication, and virion secretion (29).
Insertion of the RTTLPYGLFGLD motif enhanced core protein levels independently of viral genotype and augmented replication in genotype G, while impairing replication in genotypes A and D (Gutelius et al., 2011). The region 13 V to 39 R (or 25 V to 51 R of genotype G) is highly conserved among all HBV genotypes. The NTD carries a protease-like sequence, 30 L to 35 S (or 42 L to 47 S of genotype G), which resembles retroviral proteases (30). Consensus sequence analysis shows that this LLDTAS motif is highly conserved among all HBV genotypes. The region between amino acids 74 and 101 is considered hypervariable, which might lead to the development of liver injury (31,32). Consensus sequence analysis also shows that the 74 X and 179 X residues are the most variable among all HBV genotypes.
The linker domain STLPETTVV can interfere with the NTD, with pgRNA packaging in a sequence-independent manner, with viral DNA synthesis in a sequence-independent manner (during the first step of reverse transcription, which initiates single-stranded DNA) and in a sequence-dependent manner (during the second step of reverse transcription, that is, extensive plus-strand DNA synthesis to generate relaxed circular DNA), and with virion secretion in a sequence-dependent manner (26). The presence of only five amino acids, ETTVV, in the linker region was sufficient for single-stranded DNA synthesis (26). The consensus sequence analysis indicates that the linker STLPETTVV region is completely conserved among all HBV genotypes, except genotype E, which contains STLPENTVV. The four cysteine residues at positions 48, 61, 107 and 183 are not essential for core particle formation; however, these residues can further stabilize HBV core particles or HBc dimers (33). The consensus sequence analysis shows that all of these cysteine residues are 100% conserved among all HBV genotypes. The 132 Y, 127 R, 129 P, and 139 I residues are critical for HBc dimer formation (34). Consensus sequence analysis indicates that these residues are completely conserved among all HBV genotypes. The amino acid regions between 98 R to 115 V and 117 E to 145 E are 100% conserved among all HBV genotypes, indicating the potential importance of these regions for the HBV life cycle and/or viral pathogenesis.
The C-terminal domain (CTD) contains highly basic residues (arginine rich, protamine-like) that resemble histone tails and are critical for non-specific nucleic acid binding (35,36). The CTD is dispensable for capsid assembly and plays an important functional role in pgRNA packaging and reverse transcription (37-39). CTD phosphorylation is important for specific viral RNA packaging (40-42).
Recent findings have implicated active roles of HBc and HBx in the epigenetic regulation of viral-host interplay (9,48). The multiplicity of HBx and HBc functions, and their capacity to influence the cccDNA minichromosome for enhanced viral replication, make these proteins excellent targets for antiviral therapeutics (9,48). The HBx and HBc consensus sequences were used to predict highly conserved B and T cell binding epitopes. Several highly or semi-conserved B cell binding epitopes were predicted; we selected epitopes with 70-100% conservation among all ten HBV genotypes. Similarly, several MHC-I or MHC-II related epitopes exhibited maximum allele-binding affinity, indicating possible T-cell related epitopes. Among the HBx-related B-cell binding epitopes, owing to their complete (100%) conservancy among all HBV genotypes, the X-B2 and X-B4 epitopes might be considered better targets for B-cell based vaccine development. Similarly, among the HBc-related B-cell binding epitopes, owing to their high (80%) conservancy among all HBV genotypes, the C-B6 and C-B7 epitopes might be better targets for B-cell based vaccine development. On the other hand, among the HBx-related MHC-I specific epitopes, X-M2, X-M5, X-M8, X-M11, X-M12, X-M20, X-M22, X-M25, X-M27 and X-M32, and among the MHC-II specific epitopes, X-T4, X-T6 and X-T8, could be adopted for a synthetic vaccine against multiple genotypes of HBV.
Similarly, among the HBc-related MHC-I specific epitopes, C-M1, C-M2, C-M4, C-M6-11, C-M13, C-M19, C-M24-26, C-M30, C-M34-36, C-M40 and C-M43-45, and among the MHC-II specific epitopes, C-T1-3, C-T9, C-T10, C-T12, C-T14 and C-T16-18, could be ideal epitopes with high conservancy across all HBV genotypes. The use of conserved epitopes predicted against NS3-4A from global consensus sequences could provide broader protection against multiple isotypes of hepatitis C virus (49). Our study suggests conserved epitopes from the HBx and HBc global consensus sequences that may serve as potential targets for the development of effective vaccine candidates; the conserved residues could also be used to design novel site-specific anti-HBV agents that can target all major genotypes of HBV. Although the present study indicates B-cell or T-cell related antigens on the basis of in-silico analysis, the antigenic potential of the aforementioned peptides should be further characterized in animal models of HBV infection.

Conclusion
HBx and HBc bind to HBV cccDNA, which is the main reason for intrahepatic viral persistence. HBx ...

The study has been approved by the ethical review board of Islamabad Diagnostic Center, Pakistan.
Humans/animals were not directly involved in the study.

Consent to publication:
All authors approved the submission of the manuscript for publication.

Figure 1
Sequence alignment of HBV genotype-specific consensus sequences of the HBx protein; the global consensus sequence is shown.

Figure 2
Phylogenetic tree of 237 HBV HBx sequences from all ten genotypes reported across the world.

Figure 3
Sequence analysis of genotypes A-J of the HBc protein; the global consensus sequence is shown.

Figure 4
Phylogenetic tree of 207 HBc sequences from 10 genotypes of HBV reported globally.