Epidemiological analysis of Lung cancer in Erbil province of Iraqi Kurdistan: Incidence, Survival, Relative Risk Ratio, and Treatment Regimes in males and females

This study compares the survival function, relative risk, incidence rates and treatment regimes between genders. A total of 590 cases of lung cancer admitted to Nanakali hospital, Erbil province of Iraqi Kurdistan, were collected over a 5-year period from 1 January 2013 to 31 December 2017. Follow-up of the cases continued until 1 April 2018 to complete the record. Chi-square tests, correlation, relative risks and basic exploratory data analysis were carried out. Simple linear regression was carried out for the number of lung cancer cases among males and females. A multivariate Cox regression model was used to determine the prognostic factors for lung cancer patients. Pearson's correlation coefficient (r) for total cases of lung cancer (males and females) was 0.875 ± 0.033 (P = 0.044), with an R-square (precision) of 0.766. The prediction regression equation for females is Female Lung Cancer cases (F) = 11.79 + 0.6714 × Age group, so any age group can be selected to predict the expected incidence. The prediction equation for males is Male Lung Cancer cases = -2.857 + 0.7690 × Age group; the regression coefficient is +0.769 per 10 years of age and it is highly significant (P=0001). The multivariable Cox regression model indicates that gender had no influence on survival outcome (HR ~= 0.81, 95% CI: 0.56 to 1.16, p = 0.247). However, undergoing surgery and receiving immunotherapy are statistically significant prognostic factors for lung cancer patients. The model indicated that the risk of mortality increases by 92% if lung cancer patients do not undergo surgery (HR ~= 1.92, 95% CI: 0.31 to 0.97, p = 0.039). Furthermore, the risk of mortality is reduced by 44% among patients who received immunotherapy. The study concludes that female patients survive longer than males and that median survival is greater in female than in male lung cancer patients. Surgery and immunotherapy are statistically significant prognostic factors for lung cancer patients.


Introduction
Cancer is a wide term that encompasses more than 270 distinct kinds of cancer illness. Several stages of cancer have been discovered, suggesting that many gene mutations are involved in cancer etiology. These gene alterations cause the aberrant cell growth. The increase in cell proliferation is aided by genetic abnormalities induced by inherited or hereditary factors. Additional data have been acquired with the help of technical advancements in bioinformatics and molecular methods, which may be helpful for early diagnosis and appropriate therapy [Fisher et al. 2016, Aizawa et al. 2016, Poon et al. 2014]. The effects of medicines on cancer patients may be predicted and even managed in certain cases. Molecular genetic research has discovered cancer pathways in recent years. The findings of this research contributed to a better understanding of the role of genetic abnormalities in the development of cancer. The goal of this research was to look at the molecular features of cancer [Antwi et al. 2015, Shtivelman et al. 1985, Cigudosa et al. 1999].
Cancer is a broad term for a variety of illnesses defined by the uncontrolled division of aberrant cells with the potential to infiltrate and destroy normal body tissue. Cancer has a high proclivity for spreading throughout the body. Changes (mutations) in the DNA of cells are the cause of cancer. Each gene in a cell includes a collection of instructions that tell the cell what tasks to perform as well as how to grow and divide [Wood et al. 2001, Alvarez-Buylla et al. 2008, Goelz et al. 1985]. Errors in the instructions may cause a cell to cease functioning normally and even lead it to become malignant. The vast majority of malignancies strike individuals with no recognized risk factors. Age, lifestyle, family history, health problems, and the environment are all known to raise cancer risk [Fraga et al. 2005, King et al. 1985].
Cancer is a leading cause of illness and death globally, affecting both men and women of all ages.
According to the World Health Organization (WHO), cancer caused an estimated 10 million deaths worldwide in 2020, accounting for about one out of every six deaths (excluding deaths caused by conflicts). It was also estimated that about 70 percent of cancers occur in low- and middle-income countries. Tobacco is the most significant risk factor for cancer. Lung cancer is the most frequent cancer in males (16.7 percent), while breast cancer is the most common cancer in women (34.2 per 100,000) [Heinrich et al. 2002, Thomas et al. 2007, Roninson et al. 2002].
The lungs are two sponge-like organs in the human body. The right lung is divided into three lobes and the left lung into two. Because the heart occupies more space on the left side of the body, the left lung is smaller. When you breathe in, air enters via your mouth or nose and travels down the trachea (windpipe) to your lungs. The trachea divides into tubes called bronchi, which enter the lungs and split into smaller bronchi. These split into bronchioles, which are smaller branches.
Alveoli are small air sacs that sit at the end of the bronchioles, [Muller et al. 2014, Esteller et al. 2007, Doi et al. 2009].
When you inhale, the alveoli pass oxygen into your blood, and when you exhale they eliminate carbon dioxide (CO2).
The primary tasks of your lungs are to take in oxygen and expel carbon dioxide. Lung malignancies begin in the cells that line the bronchi and other regions of the lung, such as the bronchioles and alveoli. The pleura is a thin lining layer that surrounds the lungs. It protects the lungs and allows them to slide back and forth against the chest wall as they expand and contract during respiration [Espada et al. 2007, Dalgliesh et al. 2010, Noonan et al. 2009, Mulero-Navarro et al. 2008].
The diaphragm separates the lungs from the abdomen; it moves up and down as you breathe and pushes air into and out of the lungs (Figure 1).
The cells in our chest, as in other areas of our bodies, normally undergo a growth and death cycle that keeps the number of cells under control. Any kind of cancer arises when a series of particular changes, known as mutations, occurs in a previously healthy cell. Uncontrolled cell division may result in an excessive number of cells when a collection of mutations alters genes in ways that disrupt the normal growth and death cycles of cells. As with a car whose gas pedal is stuck or whose brakes have failed, the cells continue to divide with nothing to stop them. A tumour, neoplasm, or lesion is a mass formed by mutated and abnormally growing cells. On a chest X-ray or CT scan, this tumour may be identified as a nodule that is either malignant (cancerous) or benign (non-cancerous) [Sporn et al. 2009, Thun et al. 2008, Proctor et al. 2012], as seen in Figure 2.
The tumour is deemed malignant when the tumour cells are able to infiltrate normal tissues. Lung cancer is defined as a tumour in which the malignant cells originated in the lungs. Metastasis is the spread of cancer from one area of the body to another, and metastases are the tumours produced by the cancer cells that have spread. Lung cancer metastases may spread to the lymph nodes surrounding the lungs and, through the circulation, to other organs such as the bones, adrenal glands, and the brain. Cancer may also begin in other areas of the body and then move to the lungs [Osmani et al. 2018, Horn et al. 2008, Saunders et al. 1997, Miller et al. 1969]. This is referred to as metastasis of the initial malignancy, not lung cancer. Lung cancer is defined as cancer that begins in the lungs (Figure 3).
Besides the wide distribution of cancer incidence all over the world, the Kurdistan Region of Iraq has been exposed to several carcinogenic hazards. Although a few reports on the increased risk of cancer in different Iraqi cities have appeared in recent years, these reports did not cover the Kurdistan Region, even though the cancer incidence rate and the possible risks of cancer in this region are increasing considerably. Among the cities of the Kurdistan Region (KR), Erbil province accounts for most of the KR population and for cancer incidence of diverse types. In this study, we focused on the epidemiological analysis of lung cancer, one of the most common types of cancer in Erbil province according to the reports of hospitals and the Erbil cancer society, and investigated its incidence, survival, relative risk ratio, and treatment regimes in males and females across a wide spectrum of people with lung cancer. The results reveal some further useful indicators and statistical diagnostic tools in the fight against cancer. This is a retrospective study that analysed cancer registry data in Erbil from 2013 to 2017.

Data collection
The data used in the manuscript, as indicated in the supporting information, were taken from the Department of Lung Cancer, Nanakali hospital, Erbil, Iraq (https://www.facebook.com/NANAKALI-HOSPITAL-1443163382588149/). These data include 590 cases of lung cancer admitted to Nanakali hospital, collected over a 5-year period from 1 January 2013 to 31 December 2017.

Method
A total of 590 cases of lung cancer admitted to Nanakali hospital, mostly from Erbil province, Kurdistan, were collected over a 5-year period from 1 January 2013 to 31 December 2017. The record also contains a few lung cancer cases from Baghdad, Diyala and Anbar living in Erbil. Follow-up of the cases continued until 1 April 2018 to complete the record. 462 cases died (D) and 128 cases remained alive (A) up to the end of the follow-up date, bringing the death rate to 85% of all lung cancer cases in this study. Ethical approval to use the record, without the names of patients, was granted by the hospital.
Age at diagnosis of positive lung cancer is recorded in years, and survival period is calculated in months.
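As an illustration of how a survival period in months and a dead/alive outcome could be derived from dates, the following is a minimal sketch. The function name, the month length convention and the example dates are assumptions made for illustration; only the follow-up end date of 1 April 2018 comes from the text.

```python
from datetime import date

# End of follow-up as stated in the Method section.
FOLLOW_UP_END = date(2018, 4, 1)
# Assumption: one month is taken as the average Gregorian month length.
DAYS_PER_MONTH = 365.25 / 12


def survival_record(diagnosed, died=None):
    """Return (survival_months, event) for one hypothetical patient.

    event is 1 if death was observed and 0 if the patient was still
    alive at the end of follow-up (a censored observation).
    """
    end = died if died is not None else FOLLOW_UP_END
    if end < diagnosed:
        raise ValueError("end date precedes diagnosis date")
    months = (end - diagnosed).days / DAYS_PER_MONTH
    return round(months, 1), (1 if died is not None else 0)


# Hypothetical dates, not taken from the hospital record.
print(survival_record(date(2015, 3, 10), died=date(2016, 1, 10)))
print(survival_record(date(2017, 4, 1)))
```

Patients without a recorded death are treated as censored at the end of follow-up, which is the form of input a survival model such as Cox regression expects.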
The analysed record also includes treatment regime in each case, in males and females, and the overall outcome, dead or alive, is also recorded for both sexes.
Treatment regimes followed the main treatment modalities used worldwide for cancer: Surgery (S), Chemotherapy (C), Radiotherapy (R), Immunotherapy (I), Hormone therapy (H), and combinations of two or more of these (Scheme 1).
All statistical analyses were carried out using STATA software version 12 and included: descriptive statistics of all parameters; the age group distribution of the Erbil population (2017) in 10-year groups (10, 20, 30, 40, 50, 60, 70, 80+), with percentages for males and females (Table 1); the age distribution of lung cancer patients (Table 2); the Anderson-Darling test of normality for each age group (every ten years); and percentiles and the probability distribution of the survival data, with the Anderson-Darling test of normality.
Measures of association between the occurrence of lung cancer cases and the age groups of patients, for males, females and the total, were calculated and presented in terms of Pearson's correlation coefficient (r), regression analysis and regression coefficients (b). The prediction equations for the occurrence of lung cancer at any age were also calculated. This will help health planners in predicting lung cancer frequency at any age.
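The correlation coefficient (r) and the regression coefficient (b) described above can be sketched in a few lines. This is an illustrative calculation using ordinary least squares, not the STATA procedure used in the study, and the case counts below are hypothetical rather than the values in Table 2.

```python
from math import sqrt


def pearson_and_regression(x, y):
    """Return (r, intercept, slope) for paired observations x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    r = sxy / sqrt(sxx * syy)          # Pearson's correlation coefficient
    slope = sxy / sxx                  # regression coefficient b
    intercept = mean_y - slope * mean_x
    return r, intercept, slope


# Hypothetical counts per age group, NOT the study's Table 2.
age_groups = [10, 20, 30, 40, 50, 60, 70, 80]
cases = [2, 5, 9, 14, 22, 30, 41, 25]
r, a, b = pearson_and_regression(age_groups, cases)
print(f"r = {r:.3f}; cases = {a:.2f} + {b:.4f} * age_group")
```

The square of r gives the R-square value, the share of variation in case counts that the fitted line accounts for.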
The Relative Risk Ratio (RRR) was also calculated, for females relative to males, at different age groups, as the probability of occurrence of lung cancer in females divided by the probability of occurrence in males.
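A minimal sketch of this ratio follows, assuming that the probability of occurrence in each sex is estimated as the number of cases divided by the population of that sex in the same age group. The function name and the counts are hypothetical.

```python
def relative_risk_ratio(female_cases, female_pop, male_cases, male_pop):
    """Return P(lung cancer | female) / P(lung cancer | male) for one age group."""
    if female_pop <= 0 or male_pop <= 0:
        raise ValueError("population sizes must be positive")
    p_female = female_cases / female_pop
    p_male = male_cases / male_pop
    if p_male == 0:
        raise ValueError("ratio is undefined when there are no male cases")
    return p_female / p_male


# Hypothetical counts for a single age group, not taken from the study.
rrr = relative_risk_ratio(female_cases=20, female_pop=1000,
                          male_cases=10, male_pop=1000)
print(f"RRR (females/males) = {rrr:.2f}")
```

A value above 1 indicates a higher risk in females for that age group, and a value below 1 a higher risk in males.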
The Chi-square test of significance was performed to compare the relationship between study variables in males and females. The Chi-square test compares Observed (O) with Expected (E) frequencies, and a significance level of 5% was used.
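The comparison of observed and expected frequencies can be illustrated as follows. This sketch computes the Pearson Chi-square statistic for a contingency table, with expected counts derived from the row and column totals; the table values are hypothetical and the function is not the STATA routine used in the study.

```python
def chi_square_statistic(observed):
    """Return (chi2, degrees_of_freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    if any(t == 0 for t in row_totals + col_totals):
        raise ValueError("every row and column needs at least one observation")
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / total  # expected frequency E
            chi2 += (o - e) ** 2 / e
    dof = (len(observed) - 1) * (len(col_totals) - 1)
    return chi2, dof


# Hypothetical 2 x 2 table (rows: sex, columns: outcome), not study data.
table = [[10, 20],
         [20, 10]]
chi2, dof = chi_square_statistic(table)
# 3.841 is the 5% critical value of the Chi-square distribution with 1 degree of freedom.
print(f"chi2 = {chi2:.3f}, dof = {dof}, significant at 5%: {chi2 > 3.841}")
```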

Age groups distribution of Erbil population in 2017
The percentage distribution of the Erbil population of 2,113,391 in 2017, by 10-year age group and gender, was calculated and is shown in Table 1. The figures in this table are used for the calculation of measures of association (correlations and regressions). About 25% of both the female and the male population falls within the age group up to 20 years, and less than 7% is over 50 years. This relation partly explains why the relatively higher numbers of lung cancer cases occurred in the over-60 age groups, except in the over-80 age group, where the number of recorded cases fell to only 90 in total. This might be a reporting error, particularly on the female side, where the over-80 female lung cancer cases dropped to only 32 compared to 66 cases in age group 70. The correlation coefficients in both males and females are positive and confirm that cancer incidence grows with age and covers all ages from birth to death.

Regression coefficient of number of lung cancer cases (Y) on age groups (X)
To quantify the relation between lung cancer cases and age groups (Table 2), a regression analysis was conducted and the regression coefficients (b) of Y (number of cases) on X (age groups) were calculated for males and females separately and for the total cases. The prediction equation was also calculated, which enables planners to estimate the expected incidence of lung cancer at any age. This, of course, may be added to the jigsaw of the diagnostic process.

Regression of female lung cancer cases on age
The regression analysis of female cases on age is shown below and in Figure 5. The prediction regression equation is Female Lung Cancer cases (F) = 11.79 + 0.6714 × Age group. This means any age group can be selected to predict the expected incidence. The female regression coefficient of the number of lung cancer cases on age in years equals +0.6714 cases per 10 years, which is marginally significant at P = 0.085. The reason is as already mentioned when calculating the correlation coefficient for females. This means that lung cancer in females rises with age (Table 3).

Regression of male lung cancer cases on age
The regression coefficient and the prediction equation for male lung cancer cases on age were calculated and are shown in the ANOVA table (Table 4) and Figure 6. The prediction equation is Male Lung Cancer cases = -2.857 + 0.7690 × Age group. The regression coefficient is +0.769 per 10 years of age and it is highly significant (P=0001). Given the association between the occurrence of lung cancer and age, we can predict that the level of incidence will increase by 1.44 every ten years. Figure 7 shows the linear regression line of total lung cancer cases on age groups and the prediction equation.
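The two prediction equations reported above can be evaluated directly. The sketch below assumes that the age group enters the equations as its decade label (10, 20, ..., 80), which is an interpretation of the text rather than something it states explicitly; the function names are illustrative.

```python
def predicted_female_cases(age_group):
    """Female prediction equation reported in the text: F = 11.79 + 0.6714 * age group."""
    return 11.79 + 0.6714 * age_group


def predicted_male_cases(age_group):
    """Male prediction equation reported in the text: M = -2.857 + 0.7690 * age group."""
    return -2.857 + 0.7690 * age_group


# Assumption: age groups are coded by their decade label, 10 to 80.
for age_group in range(10, 90, 10):
    f = predicted_female_cases(age_group)
    m = predicted_male_cases(age_group)
    print(f"age group {age_group}: female {f:.1f}, male {m:.1f}, total {f + m:.1f}")
```

Because both equations are linear, the predicted total rises by the sum of the two slopes (0.6714 + 0.7690, about 1.44) for each one-unit increase in the age-group value.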

Exploratory analysis
Basic exploratory analysis of the data was performed to evaluate the covariate distribution in uncensored and censored patients and in each ethnic group. Of the censored patients, 42.4% were male and 57.6% were female. Table 5 indicates that more female than male lung cancer patients died. Figure 8 shows that female patients survive longer than male patients. However, the confidence intervals show that the uncertainty is greater in the survival curve for female patients. It also indicates that median survival is greater in female than in male lung cancer patients.
The objective of this analysis was to determine which potential covariates affect survival probability. We used multivariate Cox regression to estimate the effect of each covariate on the risk of mortality. Table 6 shows the result of the multivariable Cox regression model, which indicates that gender had no influence on survival outcome (HR ~= 0.81, 95% CI: 0.56 to 1.16, p = 0.247). However, undergoing surgery and receiving immunotherapy are statistically significant prognostic factors for lung cancer patients. The model indicated that the risk of mortality increases by 92% if lung cancer patients do not undergo surgery (HR ~= 1.92, 95% CI: 0.31 to 0.97, p = 0.039). Furthermore, the risk of mortality is reduced by 44% among patients who received immunotherapy.
The proportional hazards (PH) assumption was checked using both a log-log plot and a plot of the log hazard ratio over time, to investigate whether the hazard ratio is approximately constant over time or time-dependent.
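The percentage changes quoted for the Cox model follow from the hazard ratio as (HR - 1) × 100, where the hazard ratio is the exponential of the fitted coefficient. The sketch below illustrates this conversion; the function names are illustrative and the code does not reproduce the STATA output.

```python
from math import exp, log


def hazard_ratio_from_coefficient(beta):
    """A Cox regression coefficient beta corresponds to a hazard ratio exp(beta)."""
    return exp(beta)


def percent_change_in_hazard(hazard_ratio):
    """Positive values are an increase in the hazard, negative values a reduction."""
    if hazard_ratio <= 0:
        raise ValueError("a hazard ratio must be positive")
    return (hazard_ratio - 1.0) * 100.0


# Hazard ratios quoted in the text.
print(f"HR 1.92 -> {percent_change_in_hazard(1.92):+.0f}% change in hazard")
print(f"HR 0.81 -> {percent_change_in_hazard(0.81):+.0f}% change in hazard")
# A coefficient of log(0.56) gives HR 0.56, i.e. a 44% reduction.
print(f"HR {hazard_ratio_from_coefficient(log(0.56)):.2f} -> "
      f"{percent_change_in_hazard(0.56):+.0f}% change in hazard")
```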
The log-log plot for surgery, on the left-hand side of Figure 9, shows that the hazards are approximately proportional, since the difference between the two lines is approximately constant. The plot of the log hazard ratio against time shows approximately straight lines, illustrating that the hazard ratio remains constant over time.
For immunotherapy, the hazards are not proportional, since the difference between the two lines is not constant. The plot of the log hazard ratio is also not constant over time. Thus, the proportional hazards assumption for immunotherapy is not met.
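To make the log-log check concrete, the sketch below computes a Kaplan-Meier survival curve for one group and transforms it to log(-log S(t)) against log t. Under proportional hazards, the transformed curves of two groups are roughly parallel. This is an illustrative implementation with hypothetical data, not the STATA routine used for Figure 9.

```python
from math import log


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    events[i] is 1 for an observed death and 0 for a censored observation.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every observation that shares this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


def log_minus_log(curve):
    """Return [(log t, log(-log S(t)))], skipping points where it is undefined."""
    return [(log(t), log(-log(s))) for t, s in curve if t > 0 and 0.0 < s < 1.0]


# Hypothetical survival times in months and event indicators for one group.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)
print(log_minus_log(curve))
```

Plotting the output of `log_minus_log` for each treatment group on the same axes gives the kind of log-log plot described above; a roughly constant vertical gap between the curves supports the assumption.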
The outcome of the most common treatment regimes: Surgery, Chemotherapy and Radiotherapy alone and in combination
Data are sorted for repetitions of the three main treatment regimes alone and in combination and put in

Conclusions
This study found that the epidemiological data do not differ from the similar data on lung cancer reviewed. This is expected in light of the similarity in environmental influences, socio-economic conditions and genetic background of the populations studied. The treatment of choice was chemotherapy, which alone and in combination with other regimes accounted for 90.16%, followed by surgery, alone and in combination, with 73.70%; the least used was radiotherapy, alone and in combination, with 34.90%. Further, female patients survive longer than males, and median survival is greater in female than in male lung cancer patients.
Besides the above results, we showed that surgery and immunotherapy are statistically significant prognostic factors for lung cancer patients. The Relative Risk Ratio (RRR) revealed that, across gender and age groups, males are at higher risk of lung cancer than females up to the age of 40; the trend then changes direction, with females showing a higher risk than males from the age of 40, a risk that doubles in the 50-60 age group. This might be attributed to early smoking in males and an age effect in females. Lung cancer incidence grows with age and covers all ages and both sexes from birth to death. This study covered a large spectrum of 590 cases of lung cancer admitted to Nanakali hospital, Erbil province of Iraqi Kurdistan. All of our investigations and assessments were carried out to the satisfaction of the patients concerned, in accordance with the ethical rules and under the supervision of Nanakali hospital.
-Animal ethics statement: Not applicable, because no animals were investigated in our study.
-Data availability statement: The data used in the manuscript, as indicated in the supporting information, were taken from the Department of Lung Cancer, Nanakali hospital, Erbil, Iraq (https://www.facebook.com/NANAKALI-HOSPITAL-1443163382588149/). These data include 590 cases of lung cancer admitted to Nanakali hospital, collected over a 5-year period from 1 January 2013 to 31 December 2017.
29. Roninson IB. Oncogenic functions of tumour suppressor p21 Waf1/Cip1/Sdi1: association with cell senescence and tumour-promoting activities of stromal fibroblasts. Cancer Letters.
Figure 1. Schematic comparison of healthy lung and lung cancer.