To obtain an overview of genome variation in cancer, 16 whole-exome samples of various cancer types were downloaded (Table 1). The cancer group comprised head and neck squamous cell carcinoma, hepatocellular carcinoma, acute myeloid leukemia, lymphoma, NK-cell malignancy, pleural lung cancer and gallbladder adenocarcinoma. Among them, head and neck squamous cell carcinoma, as a more stable cancer type, and blood cancer, as the most unstable cancer type, were also chosen to validate our analysis. Cancer samples were compared with non-cancerous samples; notably, the non-cancerous samples included normal samples as well as normal tissue adjacent to tumors. Our analysis showed that the number of variants in a cancer is related to its type. Compared with normal samples, in which variant counts were more concentrated and predictable, cancers showed a high degree of divergence, with the highest numbers in blood-related cancers and the lowest in head and neck squamous cell carcinoma (Fig. 1A). We next analyzed the chromosomes of each sample, comparing cancer with non-cancerous samples chromosome by chromosome. Although there was no significant difference for the Y chromosome, the X chromosome showed more instability in control samples (Fig. 1B). The same pattern held for the other chromosomes: on average, chromosomes 1 to 22 were, without exception, more unstable in normal samples, and the difference was significant (P value < 0.0001) (Fig. 1C). To examine the stability of all chromosomes within each sample, we drew a violin plot for each sample based on the stability of its chromosomes (Fig. 1D). Once again, the stability of the normal group was significantly lower than that of the cancer group (P value < 0.05). Among the cancer samples, the two validation groups, head and neck squamous cell carcinoma and blood-related cancers, were the most stable and the most unstable, respectively.
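As an illustration of the per-chromosome comparison described above, the following minimal sketch counts variant records per chromosome. It assumes the variant calls are available as VCF-formatted text, which this section does not state; the function name `count_variants_per_chromosome` and the toy records are hypothetical and are not data from this study.

```python
from collections import Counter


def count_variants_per_chromosome(vcf_lines):
    """Count variant records per chromosome in VCF-formatted text lines."""
    counts = Counter()
    for line in vcf_lines:
        line = line.strip()
        # Skip blank lines and VCF meta/header lines, which start with '#'.
        if not line or line.startswith("#"):
            continue
        # CHROM is the first tab-separated column of a VCF record.
        chrom = line.split("\t", 1)[0]
        # Treat 'chr1' and '1' as the same chromosome label.
        if chrom.lower().startswith("chr"):
            chrom = chrom[3:]
        counts[chrom] += 1
    return counts


# Hypothetical toy records for illustration only.
toy_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\tPASS\t.",
    "chr1\t250\t.\tC\tT\t.\tPASS\t.",
    "chrX\t900\t.\tG\tGA\t.\tPASS\t.",
]
per_chromosome = count_variants_per_chromosome(toy_vcf)
print(dict(per_chromosome))
```

Per-sample counts obtained this way could then be grouped into cancer and non-cancerous sets for the kind of comparison shown in Fig. 1B to 1D.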
To understand the mechanism underlying this variation, we analyzed the types, functions and other features of the variants. In both cancerous and non-cancerous samples, most variants were SNPs. Interestingly, MNPs occurred more often in cancer samples, whereas SNPs were slightly more frequent in normal samples. The other variant types, including insertions, deletions and mixed changes, were significantly less frequent than these two types, although mixed variants occurred more often in cancer samples (Fig. 2A). Our most interesting findings concern the position of variants within genes. More than 60 percent of the variants occurred in exons and introns in both groups; however, intronic variants were significantly more frequent in the cancer group than in the normal group (P value = 0.0082). Conversely, exonic variants were less frequent in the cancer group than in the normal group (P value = 0.0033) (Fig. 2B). The other regions accounted for a smaller share of the variants, except for intragenic segments, in which cancer cells showed more instability. The impact of each variant was classified into four groups: modifier, moderate, high and low. Each group is described in Supplementary Table 1. Modifier changes were more frequent in the cancer group (P value = 0.0033) (Fig. 2C). Interestingly, high-impact changes were fewer in the cancer group (P value = 0.0086). The percentage of missense variants was slightly lower in the cancer group than in the control group (P value = 0.0088). In contrast, variants resulting in silencing of the protein were slightly more frequent in the cancer group (P value = 0.0411) (Fig. 2D).
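The variant types compared in Fig. 2A can be illustrated with a small sketch that labels a variant from its reference and alternate alleles. The rules below follow a common length-based convention and are an assumption, since this section does not name the annotation tool or its exact definitions; `classify_variant`, `type_percentages` and the toy allele pairs are hypothetical.

```python
from collections import Counter


def classify_variant(ref, alt):
    """Label a variant as SNP, MNP, INS, DEL or MIXED from its alleles.

    Assumed convention: equal-length alleles are substitutions, a pure
    extension of REF is an insertion, a pure truncation of REF is a
    deletion, and everything else is a mixed change.
    """
    if len(ref) == len(alt):
        return "SNP" if len(ref) == 1 else "MNP"
    if len(ref) < len(alt) and alt.startswith(ref):
        return "INS"
    if len(ref) > len(alt) and ref.startswith(alt):
        return "DEL"
    return "MIXED"


def type_percentages(allele_pairs):
    """Return the percentage of each variant type in a list of (ref, alt)."""
    counts = Counter(classify_variant(ref, alt) for ref, alt in allele_pairs)
    total = sum(counts.values())
    return {kind: 100.0 * n / total for kind, n in counts.items()}


# Hypothetical allele pairs for illustration only.
toy_pairs = [("A", "G"), ("C", "T"), ("AT", "GC"), ("A", "ATT"), ("ATT", "A")]
print(type_percentages(toy_pairs))
```

Computing these percentages separately for the cancer and non-cancerous groups would give the per-type proportions that are then compared between groups.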
To obtain deeper insights, we sorted genes by their variation index, selected those with high-impact variants, and determined their functional pathways. The genes were then categorized and classified. More than 100 genes with high-impact changes belonged to metabolic pathways. Further pathways, such as signaling, proteoglycans and viral carcinogenesis, were also significant (Fig. 3A). Moreover, we compiled a list of the genes whose presence was dominant across all cancerous samples and built a network of them (Table 2 and Fig. 3B).
Table 2
Genes appearing most often in the cancer group with high-impact effects
CHI3L1 | WARS | PTPRB | ALDH4A1 | STAG2 | TLR8 | CHI3L2 |
FOLH1 | LDHA | FASN | CYP11B2 | XYLB | TP53 | ADAM17 |
NMRK1 | NOTUM | IDUA | PCSK9 | TP73 | MEN1 | PTK2 |
CES1 | ADSL | BLVRB | EGFR | PPIA | LGALS8 | MAPKAPK2 |
SYK | UBC | NEU2 | PNP | IDE | DPP4 | CBS |
HDAC8 | GAPDH | FDFT1 | EPHX2 | RAB8A | OAT | FURIN |
CSF1R | PKM | NAGA | PDE4B | CHIA | PPP5C | KDR |
HRAS | PTGR1 | CSNK1D | GRHPR | MAOB | TYMP | PFKP |
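One way a gene list such as Table 2 could be derived is sketched below, under the assumption that the selection criterion is "a high-impact variant in every cancer sample"; the source wording leaves the exact criterion open. The function name `genes_shared_by_all` and the sample-to-gene assignments are hypothetical, and only the gene symbols are taken from Table 2.

```python
def genes_shared_by_all(sample_to_genes):
    """Return the genes that appear in every sample's gene collection."""
    gene_sets = [set(genes) for genes in sample_to_genes.values()]
    if not gene_sets:
        return set()
    # Intersect all per-sample sets to keep genes common to every sample.
    return set.intersection(*gene_sets)


# Hypothetical assignments for illustration only; not results from this study.
toy_samples = {
    "sample_1": ["TP53", "LDHA", "EGFR"],
    "sample_2": ["TP53", "LDHA", "MEN1"],
    "sample_3": ["LDHA", "TP53", "FASN"],
}
print(sorted(genes_shared_by_all(toy_samples)))
```

A looser criterion, such as presence in most rather than all samples, would replace the intersection with a per-gene count and a threshold.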
Among the candidates, TP53 was the key gene in cancer. Two significant pathways were also linked to TP53. The first was the metabolic pathway, including FASN and LDHA, which could regulate energy metabolism. The other involved MEN1, a tumor suppressor (Fig. 3B and C).
In the next step, we asked whether the samples could be classified based on the variation in their chromosomes. An overview of each sample based on its chromosomes also seemed crucial. To this end, a heatmap was generated. Interestingly, the heatmap placed most of the samples in their respective groups. In addition, most samples within the normal group and within the cancerous group shared a similar signature (Fig. 4A). Furthermore, we performed a ROC curve analysis to test whether this method could predict cancerous cells efficiently. The area under the ROC curve was 0.6667 with a P value of 0.1742, which is not statistically significant at the 0.05 level (Fig. 4B).
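For readers unfamiliar with the area under the ROC curve, the sketch below computes it as the fraction of (cancer, normal) sample pairs in which the cancer sample receives the higher score, counting ties as one half. This is a generic formulation and not necessarily the procedure or software used for Fig. 4B; `roc_auc` and the toy scores are hypothetical.

```python
def roc_auc(pos_scores, neg_scores):
    """Area under the ROC curve from scores of positive and negative samples.

    Equals the probability that a randomly chosen positive sample scores
    higher than a randomly chosen negative one (ties count as 0.5).
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores for illustration only; not data from this study.
toy_cancer = [0.9, 0.4]
toy_normal = [0.5, 0.1]
print(roc_auc(toy_cancer, toy_normal))
```

An area of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, which gives a scale for reading the reported value.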