Comparison of the international terminologies SNOMED CT, LOINC and ICD-11: Applicability in a guideline-conform pathology vocabulary for urothelial carcinoma in Germany

Objective With almost 30,000 new cases per year, urothelial carcinomas account for a significant proportion of cancer cases in Germany. Respective guidelines serve to help pathologists evaluate tumor material according to international classification standards, but to ensure interoperability, further regulations are required. Therefore, the study presented in this work aimed at improving the informational situation by evaluating the applicability of the international terminologies in the scope of urothelial carcinoma in Germany. on a collection of terms recommended for pathology was Nomenclature

on the respective medical field to ensure reliable mappings.

Background
Cancers of the ureter and the urinary bladder are among the most common cancer-related diseases [1] and are strongly associated with smoking or occupational exposure to certain agents [2].
In Germany, they account for nearly 30,000 new cases per year. Accordingly, specific guidelines provide guidance on how to describe a specific tumor, its location, etc. unambiguously [3]. However, with regard to the progressing digitalization and international data exchange, further requirements, especially a jointly used vocabulary, need to be established to promote and ensure interoperability between the different systems used by the various stakeholders. This would not only reduce the risk of misunderstandings due to transmission errors (such as those caused by illegible handwriting or low-quality facsimiles), but also allow for automation of conversion steps (e.g. when transferring data from the lab into the hospital information system and/or into the electronic medical record), saving time and effort.
The work presented in this study focused on examining the applicability of selected international terminologies as a possible means to convert the S1-guideline "Urothelkarzinom" (urothelial carcinoma) into an interoperable electronic form [4]. In detail, the following terminologies were evaluated with regard to the mappability of the required terms as well as their respective equivalence:
· SNOMED CT (Systematized Nomenclature of Medicine - Clinical Terms)
· LOINC (Logical Observation Identifiers Names and Codes)
· ICD (International Classification of Diseases)
These terminologies were chosen since they are the ones most commonly used on an international level, although, as depicted in table 1, they differ in their intended utilization. For example, the ICD classification originally derived from the need to summarize mortality-related diseases for statistical purposes, while the focus of LOINC is on the exchange of clinical observations. While ICD, as of June 2019, is still officially used in the form of the 10th revision (ICD-10), the updated and fundamentally improved 11th revision (ICD-11) will replace it within the next years and was accordingly used in this study.

Methods
Analysis of the S1-guideline "Urothelkarzinom"
In this context, terms were considered relevant when required for the grading determined by the WHO [5], for the TNM classification, to specify the location of the tumor [6], to give information on previous therapies, etc.

Mapping
Prior to mapping, all terms were translated into English, and an online search was performed for each one to confirm that the translation was valid, i.e. commonly used in clinical publications or reports. Where an expression was unique to the German language, the most appropriate translation was used to the best of the authors' knowledge and belief.
Mapping to SNOMED CT, ICD-11 and LOINC was then performed online using the browsers provided by SNOMED International, the World Health Organization and the Regenstrief Institute, respectively [7][8][9].
Initial mappings were performed by three of the authors, each with a different background regarding terminologies/standards and expertise in pathology/medicine. Author A had knowledge of all three terminologies but no significant expertise in pathology, author B is an expert in pathology with basic knowledge of terminologies, and author C is an expert in terminologies with a laboratory-medical background. These mappings were used to assess the accessibility of each terminology, i.e. whether the authors identified identical or differing codes (the respective assessment considered only terms for which at least one of the authors proposed a code).
Lastly, the authors agreed on the final mapping, which was then used for further analysis.

Equivalence evaluation and inter-rater reliability
In brief, each mapping was assigned a number between 0 and 4 according to its equivalence, considering the determinants specified in the standard ISO/TS 21564 provided by the International Organization for Standardization. The respective classification was as follows:
· 0 : Exact semantic matching (code equals term)
· 1 : Complete overlap of the semantic domain (code covers term, but also more)
· 2 : Incomplete overlap of the semantic domain (code partially covers term)
· 3 : Rather a comparison than overlap of the semantic domain (code represents similar domain or term)
· 4 : No overlap of the semantic domain (no appropriate code found)
An example of an exact match (ISO 0) would be the term "Klinische T-Kategorie", translated as "clinical t-category", for which both SNOMED CT and LOINC provided equivalent codes ("399504009 | cT category (observable entity) |" and "21905-5 Primary tumor.clinical [Class] Cancer", respectively).
For ISO 1, an example would be "Vorangegangene endoluminale Chemotherapie" (previous endoluminal chemotherapy). Here, LOINC provided the code "81167-9 Cancer treatment -preoperative |", which has a broader scope than the original term. Furthermore, for "Urothelkarzinom" (urothelial carcinoma), the only applicable code found in LOINC was "66125-6 Urinary bladder Pathology biopsy report". Since this only covers a part of the original term, it was considered as ISO 2.
Finally, for "Andere Angaben zum Tumortyp" (Other Information on tumor type) a single, partially suitable code was found in LOINC: "52535-2 Other useful information". As this only represents a comparable concept, it accordingly was classified as ISO 3.
Consequently, the lower the average equivalence figure, the better the general usefulness in appropriate clinical environments.
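The average equivalence figure described above can be illustrated with a minimal Python sketch. It assumes that the average is taken over mapped terms only (ISO 0-3), as stated for the reported averages; the function name and the sample ratings are hypothetical and are not data from the study.

```python
from statistics import mean

# Hypothetical ratings for one terminology, one entry per guideline term
# (0 = exact match ... 3 = merely comparable, 4 = no appropriate code found).
# These values are illustrative only and are NOT taken from the study.
ratings = [0, 0, 1, 0, 2, 4, 0, 3, 4, 1]

def average_equivalence(ratings):
    """Mean ISO rating over mapped terms only (ratings 0-3)."""
    mapped = [r for r in ratings if r < 4]  # a rating of 4 means "unmapped"
    if not mapped:
        raise ValueError("no mapped terms to average")
    return mean(mapped)

avg = average_equivalence(ratings)
print(f"average equivalence over mapped terms: {avg:.4f}")
```

Excluding unmapped terms keeps the figure a measure of how well the codes that were found fit their terms; how many terms were found at all is captured separately by the coverage.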
To ensure the validity of the ISO rating, three of the authors performed equivalence evaluations independently, and Fleiss' Kappa, as a measure of inter-rater agreement, was calculated for each terminology (note that only appropriate codes, i.e. ISO 0-3, were considered). Afterwards, where the evaluations varied between the raters, the definitive ISO classification of a term was decided by vote.
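For readers unfamiliar with Fleiss' Kappa, the following Python sketch shows the standard textbook computation for a fixed number of raters per term. It is an illustration under stated assumptions, not the authors' actual script: the function name, the category set (ISO 0-3) and the sample ratings are hypothetical.

```python
from collections import Counter

def fleiss_kappa(ratings_per_term, categories=(0, 1, 2, 3)):
    """Fleiss' kappa for a list of per-term rating tuples (one rating per rater)."""
    n_terms = len(ratings_per_term)
    n_raters = len(ratings_per_term[0])
    counts = [Counter(r) for r in ratings_per_term]

    # Observed agreement: proportion of agreeing rater pairs, averaged over terms.
    p_i = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_i) / n_terms

    # Expected agreement by chance, from the overall category proportions.
    p_j = [sum(c[cat] for c in counts) / (n_terms * n_raters) for cat in categories]
    p_e = sum(p * p for p in p_j)

    if p_e == 1.0:
        return float("nan")  # undefined when only one category was ever used
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical ratings of five terms by three raters (NOT study data).
sample = [(0, 0, 0), (0, 0, 1), (1, 1, 1), (2, 2, 2), (0, 1, 1)]
kappa = fleiss_kappa(sample)
print(f"Fleiss' kappa: {kappa:.4f}")
```

A value of 1 indicates perfect agreement between the raters, while values around 0 indicate agreement no better than chance.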

Mapping
Evaluation of the S1-guideline "Urothelkarzinom" revealed 72 terms that were used for concepts, with an additional 96 terms for values. This total of 168 terms then was assessed for mappability to appropriate codes in SNOMED CT, LOINC and ICD-11, respectively.
The initial, independent mappings revealed that results varied with the respective author performing the mapping. As depicted in table 2 (part A), author A (knowledge of standards) generally mapped significantly fewer terms than author B (expert medical background) or author C (expert knowledge of standards plus laboratory-medical background). A possible explanation is that, when in doubt, the authors preferred to omit potentially unfitting mappings, which clearly underlines the need for appropriate background knowledge. These results also reflect that, especially with regard to concepts, SNOMED CT and LOINC offer more potentially applicable codes but are consequently more complex, as demonstrated by the respective lower uniformity. In contrast, all three authors identified the same codes for the same concepts when using ICD-11, albeit significantly fewer than with the other two terminologies. Prior to further assessment, terms that were not mapped identically were discussed and agreed on, resulting in a final mapping. As also depicted in table 2 (part B), the final mappability varied for each terminology, ranging from 13.89% to 95.83% regarding terms for concepts and from 19.79% to 97.92% regarding terms for values.

The data also clearly demonstrate that SNOMED CT had the highest compatibility in any aspect (with a coverage of 97.02% for the total terms), followed by ICD-11 (with 41.67% coverage), while LOINC had the lowest (with 39.29% coverage).
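The coverage percentages follow directly from the number of mapped terms out of the 168 terms in total. The short Python check below uses the mapped-term counts stated elsewhere in the text (163 for SNOMED CT, 70 for ICD-11 and 66 for LOINC); the function and variable names are hypothetical.

```python
TOTAL_TERMS = 168  # 72 terms for concepts + 96 terms for values

def coverage_percent(mapped, total=TOTAL_TERMS):
    """Share of guideline terms with at least one code, in percent."""
    return round(100 * mapped / total, 2)

# Mapped-term counts as reported in the text for the final mapping.
mapped_terms = {"SNOMED CT": 163, "ICD-11": 70, "LOINC": 66}

coverage = {name: coverage_percent(n) for name, n in mapped_terms.items()}
for name, pct in coverage.items():
    print(f"{name}: {pct:.2f}%")
```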

Equivalence evaluation and inter-rater reliability
While for most terms a corresponding code could be found in at least one of the three terminologies assessed, it had to be evaluated whether these codes corresponded to the original meaning. Therefore, an equivalence assessment was performed in accordance with ISO/TS 21564. As depicted in  Regarding average equivalence and Fleiss' Kappa, as means to measure the overall applicability of a terminology and its reliability, respectively, significant differences were observed. Here, SNOMED CT and LOINC both demonstrated a better average equivalence regarding the mapped terms than ICD-11 (0.3374 and 0.4848 vs. 1.1143), while, conversely, the overall reliability (Fleiss' Kappa) was higher for the latter. Note that the high reliability for ICD-11 is directly related to its lower equivalence, as "obviously" non-equivalent mappings (ISO 1-3) are more likely to be evaluated identically as such by all raters.

Coverage
The results from the mapping already indicate that none of the terminologies alone is sufficient to cover all terms needed. Therefore, coverage by different combinations was assessed. This analysis also includes the different degrees of equivalence, although it has to be noted that only a rather strict equivalence (i.e. ISO 0-1) is feasible and legitimate in clinical use.

Results are given as absolute numbers as well as percentage for each combination and for different ranges of equivalence level.
As displayed in table 4, since SNOMED CT already covers most of the terms mappable by LOINC and ICD-11, even the combination of all three terminologies only results in a minor improvement with 137 (81.55%) terms at high equivalence. (In brief, LOINC can provide a further 10 terms and ICD-11 only 1 term.) It is also notable that combining LOINC and ICD-11 provides mappings for only 65 (38.69%) of the required ISO-0 terms, which is significantly less than SNOMED CT alone (127 terms or 75.60%; see table 2B). When lower equivalence is accepted (i.e. ISO 0-3), the contribution of LOINC and ICD-11 is even smaller, with only LOINC adding a single term to the 163 terms covered by SNOMED CT.
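The combined coverage described above amounts to a set union of the terms each terminology can map at a given equivalence threshold. The following Python sketch illustrates this with hypothetical toy data; the term identifiers, ratings and function names are assumptions made for illustration and do not reproduce table 4.

```python
from itertools import combinations

# Hypothetical toy data (NOT study data): term id -> ISO rating of the best
# code found in each terminology; a missing term means "no code found".
ratings = {
    "SNOMED CT": {"t1": 0, "t2": 0, "t3": 1, "t4": 2},
    "LOINC": {"t1": 0, "t5": 0, "t4": 1},
    "ICD-11": {"t2": 1, "t6": 3},
}
ALL_TERMS = {"t1", "t2", "t3", "t4", "t5", "t6", "t7"}

def covered(terminologies, max_iso):
    """Terms mapped by at least one terminology with a rating <= max_iso."""
    terms = set()
    for name in terminologies:
        terms |= {t for t, iso in ratings[name].items() if iso <= max_iso}
    return terms

def coverage_table(max_iso):
    """Absolute and percentage coverage for every combination of terminologies."""
    table = {}
    for size in range(1, len(ratings) + 1):
        for combo in combinations(ratings, size):
            n = len(covered(combo, max_iso))
            table[combo] = (n, round(100 * n / len(ALL_TERMS), 2))
    return table

for max_iso in (0, 1, 3):
    print(f"ISO 0-{max_iso}:")
    for combo, (n, pct) in coverage_table(max_iso).items():
        print(f"  {' + '.join(combo)}: {n} ({pct}%)")
```

Because the union counts each term only once, a terminology contributes to a combination only through terms that the others do not already cover at the chosen threshold.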

Discussion
As indicated by the results presented in this study, as of now, none of the terminologies assessed is sufficient on its own to cover all terms needed in the S1-guideline "Urothelkarzinom". Even when low-equivalence expressions were accepted (ISO 0-3), codes could be mapped for only 164 (97.62%) of the 168 required terms when combining SNOMED CT, LOINC and ICD-11. However, for unambiguous clinical use, restriction to at least ISO 0-1 (as required in ISO/TS 21564) or even solely ISO 0 would be mandatory, resulting in a coverage of only 157 (93.45%) or 138 (82.14%) suitable terms, respectively.
Among the terminologies applied in this study, SNOMED CT is clearly the most promising one, as it alone already covers 127 terms (75.60%) at ISO-0 equivalence. (Interestingly, this is nearly identical to the coverage previously reported for its application in histopathological findings [10].) In comparison, the combination of LOINC and ICD-11 without SNOMED CT resulted in only 118 (70.24%) terms even for low equivalence, and 65 (38.69%) for absolute equivalence.
The main reason for the higher coverage and equivalence offered by SNOMED CT is likely its flexibility. By applying post-coordination, codes can often be precisely defined to exactly match a specific term. This, however, requires detailed knowledge of the hierarchy and the attributes allowed (especially the extensions used in post-coordination or semantic tags) [11,12], as well as rigorous quality control, as the results from the initial mapping underline. Especially when coding terms for more complicated areas, such as the pathology vocabulary used in this study, it is mandatory that the responsible coder has sufficient knowledge not only of the terminologies and their requirements, but also of the medical terms and their specific, detailed meanings and context. For example, when coding a tumor relapse, the concept "Recurrent tumor" benefits from respective extensions to be unambiguously defined as a stand-alone entry, referring to a specific tumor in the patient's history. Without post-coordination, the general context is required in order to interpret the concept correctly. This might be negligible for complete electronic reports (since all data are represented in the corresponding context), but is crucial for electronic evaluation or search in clinical studies (e.g. when searching for data on relapse of a specific tumor type).
While these results also strongly emphasize the current problems associated with the progressing digitization and the resulting need for interoperability, it has to be mentioned that SNOMED CT and LOINC are under constant development [13]. In addition, since 2012, there have been ongoing efforts to harmonize these two terminologies.
Thus, coverage and subsequently applicability might improve in the future but, of course, the responsible organizations are also required to adapt or develop national guidelines in accordance with international standards whenever possible (e.g. by requesting codes for national extensions).
In addition, it should be noted that the focus of this study was solely on urothelial carcinomas. Therefore, it does not necessarily reflect applicability to reports of other cancers, although similar results can be expected. As an aside, the results also confirm that mappability does not necessarily correlate with semantic equivalence. While in this setting SNOMED CT had both the highest mappability and the best equivalence, there was a notable difference between LOINC and ICD-11. While both provided comparable numbers of applicable codes (66 and 70, respectively), for the codes found LOINC had a significantly better average equivalence (0.4848) than ICD-11 (1.1143), which was also comparable to that of SNOMED CT (0.3374).

Conclusions
In brief, the results of this study can be summarized as follows:
· Combining SNOMED CT, LOINC and ICD-11 is a feasible approach to compile a guideline-conform pathology vocabulary for urothelial carcinomas based on international standards in Germany.
· While SNOMED CT alone already provides most of the terms needed, supplementation with codes from LOINC/ICD-11 is still required.
· Solely combining LOINC and ICD-11 covers less than 50% of the terms needed and as such is not regarded as feasible.
· Basic knowledge of the standards alone is not sufficient for complex medical domains. The personnel responsible for mapping also need experience in post-coordination as well as knowledge of the specific medical field in order to choose the correct and most equivalent codes.

Declarations
Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Availability of data and material
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Competing interests
The authors declare that they have no competing interests.

Funding
Not applicable.

Authors' contributions
This paper was jointly conceived and written by the authors, with FVW and TFR having contributed equally.