Timing and co-occurrence of symptoms prior to a diagnosis of light chain (AL) amyloidosis

It is well established that patients with light chain (AL) amyloidosis have multi-organ involvement and are often diagnosed after a lag period of increasing symptoms. We leverage electronic health record (EHR) data from the TriNetX research network to describe the incidence, timing, and co-occurrence of precursor conditions of interest in a cohort of AL amyloidosis patients identified between October 2015 and December 2020. Nineteen precursor diagnoses of interest representing features of AL amyloidosis were identified using ICD codes up to 36 months prior to AL amyloidosis diagnosis. Among 1,401 patients with at least 36 months of EHR data prior to AL amyloidosis diagnosis, 46% were female, 16% were non-Hispanic Black, and 6% were Hispanic. The median age was 71 (range, 21–91) years. The median number of precursor diagnoses was 5, with dyspnea and fatigue being the most prevalent. The median time from the first occurrence of a precursor diagnosis to AL amyloidosis diagnosis ranged from 3.2 to 21.4 months across precursors. Analyses of pairwise co-occurrence of specific diagnoses indicated a high association (Cole’s coefficient > 0.6) among the examined precursor diagnoses. These findings provide novel information about the timing and co-occurrence of key precursor conditions and could be used to develop algorithms for early identification of AL amyloidosis.


Background
Light chain (AL) amyloidosis is a rare plasma cell disorder characterized by extracellular tissue deposition of misfolded and aggregated amyloid fibrils derived from clonal immunoglobulin free light chains.(1) The majority of patients with AL amyloidosis present with multisystemic involvement, with the heart, kidneys, gastrointestinal tract, nervous system, and musculoskeletal system often affected.(2) The disease often entails distinct and disparate symptoms across different organ systems over time, leading to healthcare visits with multiple specialists.(3) Existing literature suggests that more than one-third of patients report symptoms for a year or longer and approximately half saw four or more different physicians before their AL amyloidosis diagnosis was formally established.(3) Literature and clinical experience suggest a substantial delay in diagnosis from the initial onset of symptoms.(3)(4)(5)(6)(7) This delay leads to many patients being diagnosed with advanced organ involvement, which is often associated with poor prognosis.(1) Reducing the time from the onset of symptoms of precursor conditions to the diagnosis of AL amyloidosis is a critical unmet need. Taken together, the low incidence of AL amyloidosis, the non-specificity of its presenting symptoms, and the resulting reliance on numerous different healthcare providers to address them make diagnosis a complex task. For these reasons, disease awareness and a high index of suspicion by the treating physician are key elements in making a diagnosis of AL amyloidosis. In this paper, we contribute toward that goal by examining the incidence, timing, and co-occurrence of specific conditions likely to be symptoms related to AL amyloidosis (we refer to them as precursor diagnoses).
Specifically, we leverage a large and diverse electronic health record (EHR) dataset to describe the timing of nineteen specific clinical precursor diagnoses and their co-occurrence within the three years prior to the patients' diagnosis of AL amyloidosis.

Data source
Data for this observational retrospective cohort study of patients diagnosed with AL amyloidosis were drawn from TriNetX. TriNetX is a health research network providing access to high-quality, de-identified patient-level EHR data from more than 60 U.S. healthcare organizations. These data include diagnoses, visits, prescriptions, procedures ordered, vital signs, and laboratory values, and are refreshed on a regular basis.

Variable definitions
The precursor conditions of interest, along with their ICD codes, are listed in Table 1. The precursor conditions were categorized by organ system as clonal, cardiac, renal, gastrointestinal, multisystemic, and neurologic, with a residual category of miscellaneous. Time from precursor condition to AL amyloidosis diagnosis was calculated as the interval from the date of the earliest medical encounter with a code for the specific precursor diagnosis to the date of AL amyloidosis diagnosis.

Statistical analysis
The date of AL amyloidosis diagnosis was defined as time 0, and the EHR period preceding time 0 was summarized at 6, 12, 24, and 36 months prior to time 0. By cohort definition, all patients had a minimum of 36 months of EHR data prior to time 0.
The first period, from 36 to 24 months prior to time 0, was treated as the prevalence period and used to determine the baseline prevalence of the precursor diagnoses within the cohort, on the assumption that diagnoses recorded this early may not be due to AL amyloidosis. Starting 24 months prior to time 0, the new appearance of a precursor diagnosis was considered incident. For each precursor diagnosis, the incidence was calculated as the proportion of patients who had a new occurrence of the specific precursor diagnosis code in the interval from 24 months before the AL amyloidosis diagnosis code to its first occurrence. The median time between the first instance of each precursor diagnosis and time 0 was estimated via kernel density estimation, using a scaled probit transformation to account for the boundary restrictions at -36 months and time 0.
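The two calculations described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the use of the whole cohort as the incidence denominator, the Gaussian kernel, and Silverman's rule-of-thumb bandwidth are all assumptions, since the text does not specify them. Onset times are expressed in months relative to time 0 (negative values).

```python
from statistics import NormalDist, stdev

STD_NORMAL = NormalDist()  # standard normal: cdf() and inv_cdf() (the probit)


def incidence(first_onset_months, n_patients):
    """Share of the cohort whose first code appears in the 24 months before time 0.

    first_onset_months maps a patient id to the month (relative to diagnosis)
    of that patient's earliest code. Patients first coded before -24 months are
    treated as prevalent and are not counted as incident (assumption).
    """
    new_cases = sum(1 for t in first_onset_months.values() if -24.0 <= t < 0.0)
    return new_cases / n_patients


def kde_median(times, lower=-36.0, upper=0.0, eps=1e-6):
    """Median onset time from a Gaussian KDE fitted on the probit scale.

    Requires at least two onset times.
    """
    span = upper - lower
    # Rescale each time to (0, 1), clamp away from the boundaries, apply probit.
    z = [STD_NORMAL.inv_cdf(min(max((t - lower) / span, eps), 1 - eps)) for t in times]
    # Silverman's rule-of-thumb bandwidth (assumption); fall back to 1.0 if all
    # transformed values are identical.
    h = 1.06 * stdev(z) * len(z) ** (-0.2)
    if h == 0:
        h = 1.0

    def cdf(x):
        # CDF of the kernel density estimate on the probit scale.
        return sum(STD_NORMAL.cdf((x - zi) / h) for zi in z) / len(z)

    # Bisection for the point where the estimated CDF crosses 0.5.
    lo, hi = min(z) - 10 * h, max(z) + 10 * h
    for _ in range(200):
        mid = (lo + hi) / 2
        if cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    # Quantiles survive a monotone transformation, so map the median back.
    return lower + span * STD_NORMAL.cdf((lo + hi) / 2)
```

Because the density is estimated on the unbounded probit scale and then mapped back, the estimate places no mass outside the interval from -36 months to time 0, which is the purpose of the transformation.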
The pairwise co-occurrence between precursor diagnoses was calculated using Cole's coefficient (CC).(8) This coefficient measures the degree to which the observed proportion of joint occurrences exceeds or falls short of the proportion of joint occurrences expected by chance alone.(9) It equals 0 when there is no association between the events, reaches +1 when one event is a subset of the other, and reaches -1 when the events never co-occur. Given the exploratory nature of our analysis, coefficient values greater than 0.6 were considered strong. Fisher's exact test was used to assess the statistical significance of the co-occurrence.
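As an illustration of the co-occurrence measure, the sketch below computes Cole's coefficient and a two-sided Fisher's exact test from a 2x2 table of patient counts. It assumes the piecewise form of the coefficient commonly attributed to Cole (1949), written so that it is symmetric in the two diagnoses; the text does not state which variant was used, and the function names are hypothetical. Here a is the number of patients with both diagnoses, b and c the numbers with only the first or only the second, and d the number with neither.

```python
from math import comb


def coles_coefficient(a, b, c, d):
    """Cole's coefficient of association for a 2x2 co-occurrence table."""
    diff = a * d - b * c
    if diff == 0:
        return 0.0  # joint occurrence matches what chance alone predicts
    if diff > 0:
        # Positive association: scale by the largest excess the margins allow.
        denom = min((a + b) * (b + d), (a + c) * (c + d))
    elif a <= d:
        # Negative association, joint count could drop to zero.
        denom = (a + b) * (a + c)
    else:
        # Negative association, "neither" count could drop to zero.
        denom = (b + d) * (c + d)
    return diff / denom


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value via the hypergeometric distribution."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(k):
        # Probability of k joint occurrences with all margins held fixed.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    p_obs = prob(a)
    k_min, k_max = max(0, row1 + col1 - n), min(row1, col1)
    total = 0.0
    for k in range(k_min, k_max + 1):
        p = prob(k)
        # Sum every table at least as unlikely as the observed one.
        if p <= p_obs * (1 + 1e-9):
            total += p
    return total
```

For example, `coles_coefficient(10, 0, 5, 20)` returns 1.0 because every patient with the first diagnosis also has the second, while `coles_coefficient(0, 5, 5, 10)` returns -1.0 because the two diagnoses never appear together.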

Results
As illustrated in Fig. 1, the incidence of precursor conditions within 2 years of AL amyloidosis diagnosis ranged from 0.14% (macroglossia) to 34% (dyspnea). The estimated probability densities of each precursor diagnosis onset by time before AL amyloidosis diagnosis are provided in the supplemental figures, which show the density over the entire 24 months prior (Suppl Fig. 1) and over the 6 months preceding AL amyloidosis diagnosis, when the appearance of new precursor diagnoses is greatest (Suppl Fig. 2).

Discussion
The diagnosis of AL amyloidosis is often delayed despite patients reporting multiple symptoms and seeing different specialists for care over several months to years. This context underscores the critical unmet need for reducing the time from the initial onset of symptoms to the diagnosis of the disease. To address this need, we leveraged a large EHR dataset to investigate the timing and co-occurrence of specific precursor diagnoses occurring before the diagnosis of AL amyloidosis. We were particularly interested in when and how precursor diagnoses such as dyspnea, fatigue, edema, pain, and proteinuria, among others, were recorded as diagnostic codes in the medical history relative to the AL amyloidosis diagnosis. These precursor diagnoses were derived from symptoms and signs endorsed by many AL amyloidosis patients.(4,10) We wanted to understand whether these are catalogued as diagnoses by ICD codes in the medical history and, if so, how early before the diagnosis of AL amyloidosis. By identifying the proportion of patients with these precursor diagnoses and examining their timing and co-occurrence, our study sheds light on the diagnostic process in this rare multisystemic condition using EHR data.
Our prior work suggests that AL amyloidosis patients have a high prevalence of precursor diagnoses.(6) In the current analyses, we studied the pattern and timing of these diagnoses prior to the diagnosis of AL amyloidosis using the same data source. Our results are concordant with findings by others.(7)
When assessing co-occurrence of precursor diagnoses, the strongest correlation was often seen between precursor diagnoses within the same organ system or category, e.g., cardiomyopathy and heart failure, or nephrotic syndrome and renal disease. This was concordant with expected AL amyloidosis pathology in that, with organ involvement, the disease would be expected to cause multiple symptoms and signs related to that organ system. Other strongly correlated pairs belonged to different organ systems or categories but made intuitive sense as a downstream effect of one of the precursors, e.g., autonomic neuropathy and syncope, or macroglossia and dysphagia.
The majority of AL amyloidosis patients have involvement of more than one organ; thus, correlations between organ systems were of greater interest in our analysis. Here we saw numerous strong correlations across organ system categories: autonomic neuropathy and dyspnea, neuropathy and purpura, fatigue and purpura, and fatigue and autonomic neuropathy. The organ system pairs that showed the greatest correlation were cardiac and gastrointestinal, cardiac and multisystemic, gastrointestinal and other, neurologic and multisystemic, and neurologic and other.
It is crucial to consider the limitations of our study, which are inherent to the use of EHR data in research, when interpreting the results. For example, misdiagnoses, coding errors, and variation in which symptoms are recorded can lead to incorrect associations or missed precursor diagnoses. Different healthcare organizations within the TriNetX network may vary in their EHR systems and diagnostic coding practices, limiting the generalizability of the findings. The early symptoms of AL amyloidosis are nonspecific and can mimic other, more common conditions. Our study assumes that precursor diagnoses represent early symptoms of AL amyloidosis, but we have not adjusted for comorbidities that may drive the onset of many of the precursor diagnoses. For example, diabetes may lead to neuropathy, proteinuria, and cardiomyopathy. Selecting a cohort with at least 3 years of prior medical history allows us to assess baseline prevalence in the first year of EHR history and incidence in the subsequent two years preceding the AL amyloidosis diagnosis. Lastly, we lack detailed clinical context, making it challenging to assess the severity and clinical significance of precursor diagnoses. Nevertheless, our approach is relevant because it reflects symptoms that are clinically recognized by healthcare providers. A major strength of our analysis is the racial diversity of our dataset, which is often lacking in published clinical research on AL amyloidosis from the U.S.(11)
In conclusion, leveraging real-time EHR data enabled us to identify a large and diverse cohort of AL amyloidosis patients in which to examine diagnostic patterns and demonstrate the potential for earlier diagnosis of this complex disease. Our findings lay the foundation for developing clinical algorithms using ICD codes aimed at earlier recognition of AL amyloidosis.

Declarations
Author Contributions: authors contributed toward study design, analysis, review of findings, and manuscript writing.
Figures

Co-occurrence of precursor diagnoses
The extent of co-occurrence of diagnoses was quantified using Cole's coefficient. It equals 0 when there is no association between the events, reaches +1 when one event is a subset of the other, and reaches -1 if the events never co-occur. Fisher's exact test was used to assess the statistical significance of the co-occurrence.
Cohort identification: As a first step, we selected a cohort of individuals with AL amyloidosis by identifying patients who had at least one inpatient visit or at least two outpatient visits with an associated AL amyloidosis International Classification of Diseases (ICD) diagnosis code (ICD-9: 277.30, 277.39; ICD-10: E85.81, E85.4, E85.89, E85.9) during the study period from 10/01/2015 to 12/31/2020. The date of the earliest of these occurrences was used as the date of the patient's formal AL amyloidosis diagnosis. To keep the cohort specific to AL amyloidosis among patients who might have been captured by the unspecified amyloidosis ICD-10 code E85.9, we further required patients to have received chemotherapy or autologous BMT within -90 to +365 days of the AL amyloidosis diagnosis. We restricted the sample to individuals with information on healthcare utilization in TriNetX dating back at least 3 years before their AL amyloidosis diagnosis date. Our choice of look-back period was consistent with prior literature indicating that nearly 90% of AL amyloidosis patients report initial symptoms within 3 years, and 80% within 2 years, of their AL amyloidosis diagnosis.(3)
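The selection rules above can be expressed as a short Python sketch. This is an illustrative reading of the criteria, not the study's actual query: the `Encounter` record, the function names, and the input shapes are hypothetical, the treatment requirement is assumed to apply to every patient, and "3 years" is approximated as 1,095 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Diagnosis codes and study period as listed in the text.
AL_CODES = {"277.30", "277.39", "E85.81", "E85.4", "E85.89", "E85.9"}
STUDY_START, STUDY_END = date(2015, 10, 1), date(2020, 12, 31)


@dataclass
class Encounter:  # hypothetical record layout
    day: date
    setting: str       # "inpatient" or "outpatient"
    codes: frozenset   # ICD codes recorded at the encounter


def index_date(encounters):
    """Earliest AL-coded encounter in the study period, or None if the
    visit-count rule (>= 1 inpatient or >= 2 outpatient) is not met."""
    hits = [e for e in encounters
            if STUDY_START <= e.day <= STUDY_END and e.codes & AL_CODES]
    inpatient = sum(e.setting == "inpatient" for e in hits)
    outpatient = sum(e.setting == "outpatient" for e in hits)
    if inpatient >= 1 or outpatient >= 2:
        return min(e.day for e in hits)
    return None


def in_cohort(encounters, treatment_days, first_record_day):
    """Apply the visit, treatment-window, and look-back criteria."""
    idx = index_date(encounters)
    if idx is None:
        return False
    # Chemotherapy or autologous BMT within -90 to +365 days of the index date.
    treated = any(timedelta(days=-90) <= t - idx <= timedelta(days=365)
                  for t in treatment_days)
    # At least 3 years (approximated as 1,095 days) of EHR history before index.
    enough_history = first_record_day <= idx - timedelta(days=3 * 365)
    return treated and enough_history
```

Here `treatment_days` is assumed to hold the dates of chemotherapy or autologous BMT, and `first_record_day` the date of the patient's earliest recorded healthcare utilization.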

Figure 1 Development

Table 1. ICD codes for the precursor diagnoses of interest

Table 4. Cole's coefficient values for co-occurrence between precursor diagnoses

As reported in patient surveys,(3) our analysis confirms that several symptoms of the disease are highly prevalent and are diagnosed and recorded in the EHR well before the diagnosis of AL amyloidosis is made, with median times of 3.2 to 21.4 months before AL amyloidosis diagnosis. This supports efforts to develop predictive algorithms for early diagnosis. The most common precursor diagnoses include dyspnea, fatigue, heart failure, edema, altered bowels, neuropathy, and chronic kidney disease. Fatigue is the most common symptom of the disease, reported by 80% of AL amyloidosis patients.(4) Our data show that fatigue is also the most frequently catalogued precursor diagnosis as an ICD code, seen in 45% of patients at a median of 15.6 months before the diagnosis of AL amyloidosis. Other common AL amyloidosis symptoms, including dyspnea and edema, are also commonly recorded as ICD codes by healthcare providers one year or longer before the diagnosis of AL amyloidosis. This concordance with known symptoms of the disease documents the feasibility of using EHR diagnosis codes to create algorithms that could shorten the time from symptom onset to AL amyloidosis diagnosis.