Retrieving the complete genome and the coding sequence from an Indian isolate
The amino acid sequence encoded by the complete genome of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) isolate from India (MT050493.1), comprising 9950 amino acids (AA), was retrieved from NCBI. All coding regions of the genome were retrieved from the features section of the record, which contains the coding sequences of the orf1ab polyprotein, surface glycoprotein, orf3a protein, envelope protein, membrane glycoprotein, orf6 protein, orf7a protein, orf8 protein, nucleocapsid phosphoprotein and orf10 protein.
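The retrieval step can be sketched programmatically. The example below is a minimal illustration, not the procedure used in this study (the sequences may equally have been downloaded through the NCBI web interface). It assumes the NCBI E-utilities `efetch` endpoint with `rettype=fasta_cds_aa`, which returns the translated coding sequences of a nucleotide record as FASTA. The helper names `build_efetch_url` and `parse_fasta` are hypothetical.

```python
from urllib.parse import urlencode

# NCBI E-utilities efetch endpoint (assumed access route, see lead-in).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession: str) -> str:
    """Build a URL requesting the translated CDS features of a nucleotide record."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta_cds_aa",  # protein translations of the CDS features
        "retmode": "text",
    }
    return f"{EFETCH}?{urlencode(params)}"


def parse_fasta(text: str) -> dict[str, str]:
    """Map each FASTA header (without the leading '>') to its full sequence."""
    records: dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


# Usage (requires network access, so it is left commented out):
# from urllib.request import urlopen
# with urlopen(build_efetch_url("MT050493.1")) as response:
#     proteins = parse_fasta(response.read().decode())
```

Each key of the returned dictionary is one coding-sequence header, so the ten proteins listed above would appear as separate entries.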
Prediction of cytotoxic T cell epitopes for the Indian population
NetCTLpan version 1.1 was used to predict the CTL epitopes across the proteins encoded by the SARS-CoV-2 Indian isolate. NetCTLpan is a neural network-based method that predicts TAP transport efficiency and C-terminal cleavage in addition to human leukocyte antigen (HLA) binding. Considering the variation in HLA supertypes across populations, we predicted epitopes only for the HLA supertypes that account for the majority of the HLA distribution in the Indian population. A study on the evolution of HLA-A and HLA-B polymorphisms reveals that A3, B7 and B44 are the major HLA supertypes present in the Indian population.
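To make the search space concrete, the sketch below enumerates the candidate peptides that a CTL epitope predictor scores for each protein and pairs them with the three supertypes named above. It does not reproduce NetCTLpan's scoring. The peptide length of 9 is an assumption (the text does not state the length used), and the function name `candidate_peptides` is hypothetical.

```python
# Supertypes taken from the text; everything else here is illustrative.
SUPERTYPES = ("A3", "B7", "B44")


def candidate_peptides(protein: str, length: int = 9) -> list[tuple[int, str]]:
    """Return (1-based start position, peptide) for every window of `length` residues."""
    protein = protein.strip().upper()
    return [
        (start + 1, protein[start:start + length])
        for start in range(len(protein) - length + 1)
    ]


def prediction_jobs(protein: str, length: int = 9) -> list[tuple[str, int, str]]:
    """Pair every candidate peptide with every supertype of interest."""
    return [
        (supertype, position, peptide)
        for supertype in SUPERTYPES
        for position, peptide in candidate_peptides(protein, length)
    ]
```

A protein of N residues yields N - 8 overlapping 9-mers, so restricting the prediction to three supertypes keeps the number of peptide-supertype pairs at three per candidate peptide.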
Prediction of epitope immunogenicity
Although the binding affinity of a peptide for HLA helps in predicting epitopes, immunogenicity also plays an important role in the immune response. All predicted epitopes were submitted to the Immune Epitope Database (IEDB) immunogenicity tool [5,6] to obtain an immunogenicity score. The IEDB immunogenicity tool relies on physicochemical properties, such as side-chain composition and amino acid position, to predict the immunogenicity of a peptide sequence.
Identification of unique epitopes
Because the human body mounts an immune response only against foreign antigens, only epitopes that are foreign to the human body should be considered as potential vaccine candidates. To identify the candidates that are foreign to the human body, all epitopes with a positive immunogenicity score were submitted to the Multiple Peptide Match tool and searched against the human reference proteome.
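The filtering logic described above can be sketched as follows. This is a simplified stand-in for the Multiple Peptide Match tool, assuming that a peptide counts as non-foreign when it occurs as an exact substring of any human protein. The function name `foreign_epitopes` and the toy sequences in the usage example are hypothetical and are not data from this study.

```python
def foreign_epitopes(
    scores: dict[str, float],
    human_proteome: dict[str, str],
) -> list[str]:
    """Keep peptides with a positive immunogenicity score and no exact human match.

    `scores` maps peptide -> immunogenicity score.
    `human_proteome` maps protein identifier -> amino acid sequence.
    """
    positive = [peptide for peptide, score in scores.items() if score > 0]
    return [
        peptide
        for peptide in positive
        if not any(peptide in sequence for sequence in human_proteome.values())
    ]


# Toy example with made-up peptides and sequences (illustration only).
toy_scores = {"AAAKLMNPQ": 0.21, "CDEFGHIKL": 0.05, "WWWYYYFFF": -0.30}
toy_proteome = {"toy_human_1": "MMCDEFGHIKLTT", "toy_human_2": "GGGGSSSS"}
print(foreign_epitopes(toy_scores, toy_proteome))
```

In the toy example only the first peptide would be retained: the second occurs in a human sequence and the third has a negative score.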
To further assess the candidacy of the foreign epitopes as vaccine components, the three foreign epitopes with the highest immunogenicity scores were subjected to molecular docking to confirm their interaction with the specified HLA at the peptide-binding groove. Docking of each peptide epitope with the HLA structure was performed using the HPEPDOCK server. Binding-site residues were not specified, in order to test whether the epitopes would bind at the peptide-binding groove without guidance. The interaction diagrams were generated using LigPlot+.
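The selection of docking candidates by immunogenicity score can be expressed in a few lines. This is a minimal sketch under the assumption that the scores of the foreign epitopes are held in a dictionary. The function name `top_epitopes` is hypothetical, and the docking itself (HPEPDOCK) and the interaction diagrams (LigPlot+) are not reproduced here.

```python
def top_epitopes(scores: dict[str, float], n: int = 3) -> list[str]:
    """Return the `n` peptides with the highest immunogenicity scores, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

Ties are resolved by the insertion order of the dictionary, because Python's sort is stable.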