Development of a New Screening Method “Allele Matching Cut Off Score (AMCOS)” for Faster Kinship Analysis in Cases of Mass Disasters: A Proof of Concept Study

Kinship analysis in forensics is based on the calculation of the respective kinship indices. That calculation is possible only when the subject under identification has been associated with a particular population whose allele frequency data are available for the set of markers used in forensic practice. In mass disasters, where a huge number of individuals are to be identified, gathering the population frequency data and calculating the kinship indices can be an intricate process requiring a lot of time and resources. The present study is based on allele matching score values, which do not require allele frequency data to establish kinship. The method is based on the allele sharing approach, which simply refers to the number of shared alleles (1 or 2) between two individuals; these are known as identical by state (IBS) alleles and might have been inherited from a recent common ancestor, in which case the alleles are identical by descent (IBD). In mass disasters this method can be used to narrow down the investigation by screening for potentially related individuals, whose relationship can then be confirmed with other tests if required. The method has been tested for various statistical parameters and has shown promising values, which suggests its potential use in forensic practice. It has been tested on siblings and grandparent-grandchildren (GP-GC) pairs using both autosomal and X-STR markers, because reference samples from parents are not always available. The present study also compares the results shown by autosomal and X-STR markers in sibling and GP-GC identification, thereby suggesting the better set of markers for each. In brother-brother (B-B) analysis by autosomal STRs, the sensitivity, specificity and accuracy were 95%, 100% and 97.5% respectively, while by X-STRs the same set of B-B cases showed a sensitivity, specificity and accuracy of 100%. Similarly, sister-sister (S-S) analysis showed a sensitivity, specificity and accuracy of 100% with X-STRs, and of 95%, 100% and 97.5% respectively when autosomal STRs were used. The X-STRs showed better values of the statistical parameters in GP-GC identification cases, with a sensitivity, specificity and accuracy of 100% in paternal GP-GC cases and of 84.6%, 92.31% and 88.46% respectively in maternal GP-GC cases.
Informed consent was taken from all subjects and/or their legal guardians. All methods were performed in accordance with the relevant guidelines and regulations. A total of 170 pairs, 50 B-S (25 test and 25 control), 40 B-B (20 test and 20 control), 40 S-S (20 test and 20 control) and 40 GP-GC (20 test and 20 control), were studied.


Introduction
Sibship analysis plays a vital role in individual identification in civil and criminal law cases and in searching for a missing person when the parents are absent or dead [1]. In situations where parentage (family trio) analysis is not feasible, DNA comparison with an alleged sibling may serve the purpose of identification. Since there are no obligatory alleles between siblings that can help exclude a case with absolute certainty, sibship analyses are more complicated [2]. It is not possible to exclude sibship with confidence using genetic markers if only siblings are available for the study [3].
According to Mendel's laws of inheritance, full siblings acquire their alleles from the same two parents. The probabilities that a pair of full siblings will share 0, 1, or 2 alleles identical by descent at a locus are ¼, ½, and ¼, respectively [4]. Studies have been conducted from time to time to develop and assess the validity of sibling comparison tests.
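The ¼, ½, ¼ probabilities can be checked by enumerating every way two parents can transmit one allele copy each to two children. The following is a minimal sketch of that enumeration; the labels `F1`, `F2`, `M1`, `M2` and the function name are illustrative assumptions, not part of the study.

```python
from fractions import Fraction
from itertools import product

# Label the four parental allele copies so that identity by descent (IBD)
# is tracked independently of the actual repeat number of each allele.
FATHER = ("F1", "F2")
MOTHER = ("M1", "M2")


def sibling_ibd_distribution():
    """Enumerate all equally likely transmissions to two full siblings."""
    counts = {0: 0, 1: 0, 2: 0}
    # Each child receives one paternal and one maternal copy, giving
    # 2 * 2 options per child and 16 equally likely outcomes for the pair.
    for pat1, mat1, pat2, mat2 in product(FATHER, MOTHER, FATHER, MOTHER):
        shared = int(pat1 == pat2) + int(mat1 == mat2)
        counts[shared] += 1
    total = sum(counts.values())
    return {k: Fraction(v, total) for k, v in counts.items()}


ibd = sibling_ibd_distribution()
for n_shared, prob in ibd.items():
    print(f"P({n_shared} alleles IBD) = {prob}")
```

The sketch treats a single autosomal locus with no mutation; it counts alleles shared by descent, which is not the same as the number of alleles that merely look identical (IBS).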
STR multiplex markers are the predominantly used technology for human identification [5]. Multiplex STR assays have been evaluated for their use in pairwise kinship analysis [6,2]. A study has been conducted on 9, 12, and 15 STR markers to develop a method for sibling identification [7].
The study of kinship requires the analysis of identical by descent (IBD) alleles [8], and research on genetic relatedness has always been linked to the root concept of IBD. Previous studies on sibship analysis have also been based on the idea of IBD [9][10][11][12]. The inference of genetic relatedness between individuals (in pedigrees and/or in large population-based studies) by combining the IBD method with identical by state (IBS) information has also been reported by Stevens (2011) [13]. The term IBS describes two identical alleles at a locus between two individuals who do not share a recent common ancestry. The IBS method was proposed by Chakraborty and Jin (1983) for the inference of a pairwise relationship [14]. IBD (identity by descent), on the other hand, describes two identical alleles that share a common ancestry. Two individuals who share 1 or 2 alleles IBS at a given locus may have inherited the alleles from a recent common ancestor, in which case the alleles are IBD [13]. Yuan (2017) reported the application of autosomal STR loci with the IBS method, and also studied a discriminant function algorithm for its utility in sibling identification. That study concluded that STRs with higher power of discrimination (PD) values should be selected when additional autosomal markers are required for full sibling identification, and that discriminant analysis with IBS is highly useful for the full sibling test [4].
The inference of a biological relationship from pairwise genetic data is based on the population frequencies of the observed alleles shared by the pair of individuals and on probability equations for genotype combinations [3,15]. The likelihood ratio (LR) is calculated from the frequency data and expresses the ratio of the probability of the observed genotypes if the two individuals are relatives to the probability if they are non-relatives.
However, in cases where the population frequencies of alleles are unknown, or the ethnic origin of foreign individuals is unclear, the LR-based method fails to infer the relationship between two individuals [15].
In this study we have used the allele sharing approach, which simply refers to the number of shared alleles (1 or 2) between two individuals; these are known as IBS alleles and might have been inherited from a recent common ancestor, in which case the alleles are IBD [13,16]. On this basis we developed a new method, the "Allele matching cut off score (AMCOS)". The utility of this method was checked in sibling and grandparent-grandchildren (GP-GC) identification. We applied AMCOS to sibling and grandparent-grandchildren data obtained from the most frequently used autosomal STR markers and the relatively newer X-STR markers. X-STR markers were chosen for their increasing popularity and their promising performance in kinship testing [17,18]. The use of X-STRs has also been suggested for certain pedigrees that are reported to be indistinguishable by autosomal STR analysis [8]. Also, the options for chromosome X marker typing with short amplicons, and the ease of analysis compared with mtDNA, which is an intricate process [19,20], can be considered when degraded samples from mass disasters are to be analyzed [21]. Autosomal STR analysis was chosen because the use of unlinked autosomal markers has been standard practice in forensic laboratories worldwide for the last two decades [22,23]. The present study, based on AMCOS values, gives a cut-off score based on IBS allele matches (which could also be IBD) between siblings and between grandparents and grandchildren. The cut-off score can be used to shortlist the individuals to be matched for sibship and GP-GC identification in cases of mass disaster or natural calamity. The AMCOS-based method can thus help the analyst save time and resources in Disaster Victim Identification (DVI); the shortlisted matches may further be confirmed by other analyses if required.
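The allele sharing count described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the OAM score is taken to be the number of loci at which exactly one allele is shared, the TAM score the number of loci at which two alleles are shared, and a pair is flagged when its score is at or above the cut-off. The genotypes and the helper names are hypothetical.

```python
from collections import Counter


def shared_alleles(genotype_a, genotype_b):
    """Number of alleles (0, 1 or 2) shared identical by state at one locus."""
    overlap = Counter(genotype_a) & Counter(genotype_b)  # multiset intersection
    return sum(overlap.values())


def allele_matching_scores(profile_a, profile_b):
    """Return (OAM, TAM): loci sharing exactly one and exactly two alleles."""
    oam = tam = 0
    for locus in profile_a.keys() & profile_b.keys():
        n = shared_alleles(profile_a[locus], profile_b[locus])
        if n == 1:
            oam += 1
        elif n == 2:
            tam += 1
    return oam, tam


def passes_screen(score, cutoff):
    """Assumed decision rule: keep the pair if the score reaches the cut-off."""
    return score >= cutoff


# Hypothetical genotypes at four autosomal STR loci (illustration only).
person_a = {"D3S1358": (15, 16), "vWA": (17, 18), "FGA": (21, 24), "TH01": (6, 9.3)}
person_b = {"D3S1358": (15, 16), "vWA": (17, 19), "FGA": (22, 23), "TH01": (9.3, 9.3)}

oam, tam = allele_matching_scores(person_a, person_b)
print("OAM:", oam, "TAM:", tam, "shortlisted at TAM cut-off 3:", passes_screen(tam, 3))
```

Because the count uses only the two observed genotypes, no allele frequency table is needed, which is the property the method relies on for screening.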

Results
Brother-Sister (B-S) kinship analysis by autosomal STRs.
To examine kinship between brother and sister using autosomal STR markers, sensitivity and 1-specificity values at different allele matching scores were calculated. An independent t-test showed a statistically significant difference between the B-S (related) and non-B-S (unrelated) groups based on the OAM and TAM scores. The average TAM score was 3.92 ± 1.754 (SD) for the B-S group and 0.80 ± 0.764 (SD) for the non-B-S group, whereas the average OAM score was 8.64 ± 1.846 for the B-S group and 7.48 ± 1.531 for the non-B-S group (Fig. 1). Sensitivity and 1-specificity values at different allele matching scores, calculated from the ROC curve, are shown in Table 1. The smallest OAM cutoff value is the minimum observed test value minus 1, and the largest cutoff value is the maximum observed test value plus 1. All the other cutoff values are the averages of two consecutive ordered observed test values (Table 1). The sensitivity and specificity of the test with an AMCOS of 9 (OAM score) were found to be 52% and 72%, respectively. The positive and negative predictive values were found to be 65% and 60%, respectively, and the overall accuracy of the test was found to be 62% (Table 2). The sensitivity and specificity of the test with an AMCOS of 3 (TAM score) were found to be 92% and 100%, respectively. The positive and negative predictive values were found to be 100% and 93%, respectively, and the overall accuracy of the test was found to be 96% (Table 4).
With X-STR markers, the sensitivity and specificity of the test with an AMCOS of 6, when applied to the B-S sibling group, were found to be 80% and 76%, respectively. The positive and negative predictive values were found to be 77% and 79%, respectively, and the overall accuracy of the test was found to be 78% (Table 6).

Brother-Brother (B-B) kinship analysis by autosomal STRs:
[...] for the non-B-B group, the average TAM is 0.60 ± 0.681 (SD). Only the two allele matching (TAM) score showed a significant difference between the two (related and unrelated) groups (Fig. 3). So only the TAM score was evaluated for its efficiency as a biomarker for brother-brother kinship analysis by autosomal STR markers. Sensitivity and 1-specificity values at different allele matching scores were calculated from the ROC curve (Table 7). The smallest TAM cutoff value is the minimum observed test value minus 1, and the largest cutoff value is the maximum observed test value plus 1. All the other cutoff values are the averages of two consecutive ordered observed test values. The sensitivity and specificity of the test with an AMCOS of 3 were found to be 95% and 100%, respectively. The positive and negative predictive values were found to be 100% and 95%, respectively, and the overall accuracy of the test was found to be 97.5% (Table 8).

Brother-Brother (B-B) kinship analysis by X-STR analysis:
An independent t-test showed a statistically significant difference between the B-B and non-B-B groups based on the OAM score obtained with X-STRs. The average OAM score was 9.1 ± 2.075 (SD) for the B-B group and 1.85 ± 0.040 (SD) for the non-B-B group. The ROC curve used to assess the accuracy of the test in discriminating true sibling (brother-brother) cases from false cases is shown in Fig. 4. Sensitivity and 1-specificity values at different allele matching scores, calculated from the ROC curve (coordinates of the curve), are shown in Table 9. The sensitivity and specificity of the test with an AMCOS of 5 were both found to be 100%. The positive and negative predictive values were also found to be 100%, as was the overall accuracy of the test (Table 10).
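The sensitivity, specificity, predictive values and accuracy quoted above all derive from a 2×2 table of correctly and incorrectly classified pairs. The sketch below shows that arithmetic. The counts are assumptions chosen so that, with 25 related and 25 unrelated B-S pairs, they reproduce the percentages reported for the autosomal cut-offs of 3 and 9; they are not taken from the study's tables.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Screening-test metrics from a 2x2 table of pair counts.

    tp / fn: related pairs at or above / below the cut-off
    tn / fp: unrelated pairs below / at or above the cut-off
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Assumed counts (25 related + 25 unrelated B-S pairs), for illustration.
bs_cutoff_3 = diagnostic_metrics(tp=23, fn=2, tn=25, fp=0)
bs_cutoff_9 = diagnostic_metrics(tp=13, fn=12, tn=18, fp=7)

for label, metrics in (("AMCOS 3", bs_cutoff_3), ("AMCOS 9", bs_cutoff_9)):
    print(label, {name: round(100 * value, 1) for name, value in metrics.items()})
```

Laying the counts out this way makes the trade-off explicit: a cut-off that misses few related pairs while flagging few unrelated ones keeps both predictive values high.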
Sister-Sister (S-S) kinship analysis by autosomal STRs:
An independent t-test showed a statistically significant difference between the S-S (related) and non-S-S (unrelated) groups based on the TAM score. The average TAM score was 5.45 ± 1.63 (SD) for the S-S group and 0.95 ± 0.326 (SD) for the non-S-S group. Only the two allele matching (TAM) score showed a significant difference between the two (related and unrelated) groups. So only the TAM score was evaluated for its efficiency as a biomarker for sister-sister kinship analysis by autosomal STR markers (Fig. 5). Sensitivity and 1-specificity values at different allele matching scores, calculated from the ROC curve, are shown in Table 11. The smallest cutoff value is the minimum observed test value minus 1, and the largest cutoff value is the maximum observed test value plus 1. All the other cutoff values are the averages of two consecutive ordered observed test values. The sensitivity and specificity of the test with an AMCOS of 3 were found to be 95% and 100%, respectively. The positive and negative predictive values were found to be 100% and 95%, respectively, and the overall accuracy of the test was found to be 97.5% (Table 12).

Sister-Sister (S-S) kinship analysis by X-STR analysis:
An independent t-test showed a statistically significant difference between the S-S (related) and non-S-S (unrelated) groups based on the OAM score. The average OAM score was 11.85 ± 0.366 (SD) for the S-S group and 5.90 ± 2.049 (SD) for the non-S-S group. The ROC curve and AUC used to assess the accuracy of the test in discriminating the true cases (S-S cases) from the false cases (non-S-S cases) are shown in Fig. 6. Sensitivity and 1-specificity values at different allele matching scores, calculated from the ROC curve, are shown in Table 13. The sensitivity and specificity of the test with an AMCOS of 11 were both found to be 100%. The positive and negative predictive values were also found to be 100%, as was the overall accuracy of the test (Table 14).

Grandparents-Grandchildren (GP-GC) by autosomal STR analysis:
An independent t-test showed a statistically significant difference (p < 0.05) between the GP-GC (related) and non-GP-GC (unrelated) groups based on the OAM score. The average OAM score was 11.54 ± 2.64 (SD) for the GP-GC group and 8.27 ± 2.146 (SD) for the non-GP-GC group. The ROC curve and AUC used to assess the accuracy of the test in discriminating the true cases (grandparentage cases) from the false cases (non-grandparentage cases) are shown in Fig. 7. Sensitivity and 1-specificity values at different allele matching scores, calculated from the ROC curve, are shown in Table 15. The sensitivity and specificity of the test with an AMCOS of 10 were found to be 85% and 65%, respectively. The positive and negative predictive values were found to be 74% and 85%, respectively, and the overall accuracy of the test was found to be approximately 79% (Table 16).
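The rule used for the candidate cutoffs (minimum observed value minus 1, maximum plus 1, and the midpoints of consecutive ordered observed values) can be written out directly. The sketch below is a hedged illustration: the scores are hypothetical, the function name is an assumption, and a pair is counted as positive when its score is greater than or equal to the cutoff, which is the usual convention when larger scores indicate relatedness but is not stated explicitly in the text.

```python
def roc_coordinates(related_scores, unrelated_scores):
    """Return (cutoff, sensitivity, 1 - specificity) for each candidate cutoff."""
    values = sorted(set(related_scores) | set(unrelated_scores))
    cutoffs = [values[0] - 1]                                   # below every score
    cutoffs += [(a + b) / 2 for a, b in zip(values, values[1:])]  # midpoints
    cutoffs.append(values[-1] + 1)                              # above every score
    rows = []
    for cutoff in cutoffs:
        sensitivity = sum(s >= cutoff for s in related_scores) / len(related_scores)
        false_positive_rate = sum(s >= cutoff for s in unrelated_scores) / len(unrelated_scores)
        rows.append((cutoff, sensitivity, false_positive_rate))
    return rows


# Hypothetical two-allele-matching scores, for illustration only.
related = [2, 3, 4, 4, 5, 6]
unrelated = [0, 0, 1, 1, 2, 1]

rows = roc_coordinates(related, unrelated)
for cutoff, sensitivity, fpr in rows:
    print(f"cutoff {cutoff:>4}: sensitivity {sensitivity:.2f}, 1-specificity {fpr:.2f}")
```

The first and last rows always sit at the corners of the ROC plot (everything flagged, nothing flagged); the working cut-off is chosen from the rows in between.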

Grandparents -Grandchildren (GP-GC) by X-STR analysis:
Grandparent-grandchild relationships exist on both the paternal and the maternal side. Therefore, in this study the kinship analysis of GP-GC pairs was carried out separately for the paternal side and the maternal side, as follows.

An independent t-test showed a statistically significant difference between the paternal GP-GC (related) and non-GP-GC (unrelated) groups based on the OAM score. The average OAM score was 12 ± 0.00 (SD) for the GP-GC group and 6.57 ± 0.976 (SD) for the non-GP-GC group. The ROC curve and AUC used to assess the accuracy of the test in discriminating the true cases (grandparentage cases) from the false cases (non-grandparentage cases) are shown in Fig. 8. Sensitivity and 1-specificity values at different allele matching scores, calculated from the ROC curve (coordinates of the curve), are shown in Table 17. The sensitivity and specificity of the test with an AMCOS of 10 were both found to be 100%. The positive and negative predictive values were found to be 100%, and the overall accuracy of the test was also found to be 100% (Table 18).
An independent t-test showed a statistically significant difference between the maternal GP-GC (related) and non-GP-GC (unrelated) groups based on the OAM score. The average OAM score was 8.85 ± 2.794 (SD) for the GP-GC group and 3.15 ± 1.625 (SD) for the non-GP-GC group. The ROC curve and AUC used to assess the accuracy of the test in discriminating the true cases (grandparentage cases) from the false cases (non-grandparentage cases) are shown in Fig. 9. Sensitivity and 1-specificity values at different allele matching scores, calculated from the ROC curve (coordinates of the curve), are shown in Table 19. The sensitivity and specificity of the test with an AMCOS of 6 were found to be 85% and 92%, respectively. The positive and negative predictive values were found to be 92% and 86%, respectively, and the overall accuracy of the test was found to be 88% (Table 20). Overall, the above-mentioned results are summarized in Table 21 and Table 22.
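The group comparisons above rest on independent t-tests of mean allele matching scores. As a rough check of how such a statistic follows from the reported summaries, the sketch below computes Welch's t from means, standard deviations and group sizes. It assumes 25 pairs per B-S group and that the reported ± values are sample standard deviations; which t-test variant the study used is not stated, so Welch's form is an assumption (with equal group sizes the t statistic equals the pooled-variance one).

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1 mean
    v2 = sd2 ** 2 / n2  # squared standard error of group 2 mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Reported autosomal TAM summaries for B-S vs non-B-S; n = 25 is assumed.
t_stat, dof = welch_t(3.92, 1.754, 25, 0.80, 0.764, 25)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A large t statistic of this kind only says that the group means differ; how well a single cut-off separates individual pairs is what the ROC analysis and the resulting sensitivity and specificity describe.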

Discussion
In mass disasters, the dead bodies or mortal remains have to be identified and handed over to the grieving families for the last rites and for civil matters such as insurance, property and job claims. The number of sample pairs to be matched in such cases is enormous and the time frame is short. Such a situation calls for a screening method that reduces the number of sample pairs (the deceased and the alleged kin) to be analyzed for relatedness, so that the shortlisted pairs can then be confirmed for kinship. To avoid the wastage of time and resources, the present study sets a standard "allele match cut off score (AMCOS)" for both marker sets (autosomal and X-STR) in various kinship analyses (Table 21), for the purpose of screening pairs out of hundreds or thousands of individuals and dead bodies or remains to be tested for kinship in mass disasters. The method is based solely on allele matches at different loci and does not require any allele frequency data. To make the evidence more comprehensible in a court of law, forensic reports for Human Identification (HID) are presented in the form of likelihood ratios (LRs), and calculating LRs requires allele frequencies for the population to which the person under identification belongs. In India, a diverse and developing country, and in many other developing countries where resources are scarce, such population data are rarely available. Also, in cases of intra- and inter-population migration, it becomes difficult to obtain a population-specific database. The AMCOS method has been devised for use in such conditions.
The study uses two sets of markers, autosomal STRs and X-STRs, for the same set of kinship analyses (B-S, B-B, S-S, and GP-GC). In B-S analysis by autosomal STRs, a TAM score of 3 and an OAM score of 9 were the significant cut-offs; the TAM score of 3 was found to be 92% sensitive, with a specificity of 100% and an accuracy of 96%. On the other hand, an OAM score of 6 was found to have a sensitivity, specificity and accuracy of 80%, 76% and 78% respectively when B-S analysis was performed with X-STRs. In B-B analysis by autosomal STRs, the sensitivity, specificity and accuracy were 95%, 100% and 97.5% respectively, while by X-STRs the same set of B-B cases showed a sensitivity, specificity and accuracy of 100%. Similarly, S-S analysis showed a sensitivity, specificity and accuracy of 100% with X-STRs, and of 95%, 100% and 97.5% respectively when autosomal STRs were used. The X-STRs showed better values of the statistical parameters in GP-GC identification cases, with a sensitivity, specificity and accuracy of 100% in paternal GP-GC cases and of 84.6%, 92.31% and 88.46% respectively in maternal GP-GC cases.
The outcome of the study shows that X-STRs seemingly perform better in terms of statistical parameters such as sensitivity and specificity in B-B, S-S, and GP-GC identification cases, while the negative predictive value (NPV) and positive predictive value (PPV) remained the same with both autosomal STRs and X-STRs. Autosomal STR analysis, by contrast, showed better values of all the statistical parameters in B-S identification cases. We tried the AMCOS method for GP-GC screening by autosomal STR analysis, a relationship that is otherwise reported to be indistinguishable by unlinked autosomal markers with LR-based methods. The AMCOS method gave fairly good values of the statistical parameters in GP-GC identification cases, although the technique needs to be validated in a larger sample of GP-GC pairs. The present study is a proof of concept and needs to be confirmed in a larger sample of siblings and GP-GC pairs.
To the best of the authors' knowledge, an AMCOS-based method has not previously been used to establish kinship. The present study shows the successful application of the AMCOS method to identify sibling and GP-GC relationships. The results support the potential use of this technique in forensic settings to identify siblings and GP-GC pairs. In addition, the present study has compared the results of X-STR and autosomal STR analysis in the same samples with respect to statistical and forensic parameters, and has suggested the better set of markers for each of the kinship analyses in question.

Material and methodology
The study was commenced after taking ethical clearance from the internal ethical committee of Post Graduate [...] A total of 170 pairs, 50 B-S (25 test and 25 control), 40 B-B (20 test and 20 control), 40 S-S (20 test and 20 control) and 40 GP-GC (20 test and 20 control), were studied. The kinship was confirmed verbally by parents, siblings, and grandparents. Also, to ensure sibship and grandparentage, we followed the certainty threshold for likelihood ratios and selected the pairs with kinship indices between 100 and 1000 (or >1000) [24]. Since all the studied pairs hailed from the Punjab region of India, the population allele frequencies were calculated for the same population using the GenAlEx 6.5 software [25], and distant kinship (sibship and GP-GC) indices were calculated using the X-STR data and the FamLinkX software [26]. X-STR data were used to confirm kinship because unlinked autosomal STR markers are not efficient enough to distinguish pedigrees such as GP-GC [8,27].
Although samples were collected from known families, the parentage of the mother-father-child trio was confirmed for both siblings by paternity analysis. In the case of siblings and grandchildren, the majority of the volunteers were adults, and in the case of children, written informed consent was obtained from the parents or grandparents.

Figure 5
Independent t-test showing a statistically significant difference between the S-S (related) and non-S-S (unrelated) groups based on the TAM score. The average TAM score is 5.45 ± 1.63 (SD) for the S-S group and 0.95 ± 0.326 (SD) for the non-S-S group. Only the two allele matching (TAM) score showed a significant difference between the two (related and unrelated) groups, so only the TAM score was evaluated for its efficiency as a biomarker for sister-sister kinship analysis by autosomal STR markers.

Figure 6
Independent t-test showing a statistically significant difference between the S-S (related) and non-S-S (unrelated) groups based on the OAM score. The average OAM score is 11.85 ± 0.366 (SD) for the S-S group and 5.90 ± 2.049 (SD) for the non-S-S group. ROC curve and AUC used to assess the accuracy of the test in discriminating the true cases (S-S cases) from the false cases (non-S-S cases).

Figure 7
Independent t-test showing a statistically significant difference (p < 0.05) between the GP-GC (related) and non-GP-GC (unrelated) groups based on the OAM score. The average OAM score is 11.54 ± 2.64 (SD) for the GP-GC group and 8.27 ± 2.146 (SD) for the non-GP-GC group. ROC curve and AUC used to assess the accuracy of the test in discriminating the true cases (grandparentage cases) from the false cases (non-grandparentage cases).

Figure 8
Independent t-test showing a statistically significant difference between the paternal GP-GC (related) and non-GP-GC (unrelated) groups based on the OAM score. The average OAM score is 12 ± 0.00 (SD) for the GP-GC group and 6.57 ± 0.976 (SD) for the non-GP-GC group. ROC curve and AUC used to assess the accuracy of the test in discriminating the true cases (grandparentage cases) from the false cases (non-grandparentage cases).

Figure 9
Independent t-test showing a statistically significant difference between the maternal GP-GC (related) and non-GP-GC (unrelated) groups based on the OAM score. The average OAM score is 8.85 ± 2.794 (SD) for the GP-GC group and 3.15 ± 1.625 (SD) for the non-GP-GC group.

Supplementary Files
supplementrymaterialTableS1S8edited.docx