Study population criteria: Population-based risk assessment: GMA (Adjusted Morbidity Groups)
The CMC population was identified from the entire resident population of Catalonia under the age of 15 in 2016 (1,189,325, 52% boys). A morbidity, complexity, and risk score tool, Adjusted Morbidity Groups classification (Catalan acronym GMA)17, was used to identify the CMC population.
The GMA assigns each patient a score according to their comorbidity and complexity. The higher the GMA score, the greater the individual’s medical complexity. This scoring is used to stratify the population for the purposes of health planning17,18. It is more accurate and yields less variability than other health risk tools, such as Clinical Risk Group (CRG)18. Information on comorbidity and complexity used to construct the GMA was gathered from the Catalan Health Surveillance System (CCHS) database for the current and previous years.
A clinical complexity stratification has been established according to GMA percentiles (50% very low risk, 75% low risk, 85% moderate risk, 90% high risk, 99% very high risk, 99.5% extreme risk).
Those in the top 0.5% based on GMA score (5,950 children) were selected as the CMC population because: 1) this is the level of highest complexity proposed by the GMA; 2) previous studies in Catalonia found that 0.3% of the population were CMC13; and 3) this figure is concordant with the prevalence of CMC in other population studies19. The remaining 99.5% of children, the non-CMC population, were used as a comparison group.
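The selection rule above can be illustrated with a minimal sketch. It assumes each child has a single numeric GMA score; the function name, the identifiers, and the toy scores are hypothetical, and ties at the cutoff are not handled.

```python
def select_top_fraction(scores, fraction=0.005):
    """Return (threshold, selected_ids) for the highest-scoring `fraction`.

    `scores` maps a hypothetical person id to a GMA-like score.
    Ties at the cutoff are not treated specially in this sketch.
    """
    n_select = max(1, round(len(scores) * fraction))
    # Rank from highest to lowest score and keep the first n_select.
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    selected = ranked[:n_select]
    threshold = selected[-1][1]  # lowest score that is still selected
    return threshold, {pid for pid, _ in selected}


# Toy data: 1,000 hypothetical children with scores 1..1000.
toy_scores = {f"child_{i}": float(i) for i in range(1, 1001)}
threshold, cmc_ids = select_top_fraction(toy_scores)
non_cmc_ids = set(toy_scores) - cmc_ids  # comparison group
```

With 1,000 toy records, the top 0.5% corresponds to five children; everyone else forms the comparison group.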
Data
We used two main sources of data:
The central registry of insured persons (Catalan acronym RCA) was used to obtain the reference population (as of January 1, 2016), their income level, employment status, and Social Security benefits.
The CCHS database was used for clinical data. It includes detailed information on sociodemographic characteristics and medical diagnoses at an individual level for all contacts with primary care, emergency, mental health, and long-term care services, and for pharmacy prescriptions. It covers the total population of Catalonia, since all citizens are granted universal health coverage.
Variables
Outcome variable:
The principal outcome variable was the set of clusters obtained by grouping patients with similar patterns of comorbidity.
Medical diagnoses were obtained through the CCHS, coded using the Agency for Healthcare Research and Quality’s Clinical Classifications Software (CCS)20, and were used to determine the comorbidities registered from 2014 to 2016 for each child in the CMC population. Only the first diagnosis of each type for each individual was included.
Exposure variable and covariates:
SEP was created on the basis of the employment status, individual income, and receipt of welfare assistance of one of the child’s parents or guardians, taken from the RCA database. SEP was grouped into three categories: Low (no member of the household employed or in receipt of welfare support from the government, and an income <€18,000/year, considered poverty21); Middle (employed, with an income <€18,000); and High (employed, with an income >€18,000).
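As an illustration only, the three categories can be written as a small decision rule. The sketch below rests on one reading of the definition: Low means an income below the cutoff together with either no employment or receipt of welfare support. The function name and argument names are hypothetical, and combinations the text does not describe (for example an income of exactly €18,000) are returned as unclassified.

```python
INCOME_CUTOFF_EUR = 18_000  # annual income threshold stated in the text


def classify_sep(employed, on_welfare, income_eur):
    """Assign Low / Middle / High SEP under one reading of the stated rules.

    Returns None for combinations the definition does not cover.
    """
    if income_eur < INCOME_CUTOFF_EUR:
        # Assumption: "not employed OR receiving welfare" marks the Low group.
        if (not employed) or on_welfare:
            return "Low"
        return "Middle"
    if income_eur > INCOME_CUTOFF_EUR and employed:
        return "High"
    return None  # not described in the text


examples = [
    classify_sep(employed=False, on_welfare=False, income_eur=9_000),
    classify_sep(employed=True, on_welfare=False, income_eur=12_000),
    classify_sep(employed=True, on_welfare=False, income_eur=30_000),
]
```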
Age was categorized based on clinical criteria for children’s growth (0–1, 2–4, 5–11, and 12–14 years). Sex was a stratification variable.
Statistical analysis
A descriptive analysis was carried out of both the CMC and non-CMC populations.
Afterwards, to determine patterns of pathology in the CMC population, the diagnoses recorded for each individual, coded through CCS, were used. Homogeneous groups of patients with similar diseases were identified in the CMC population via K-means cluster analysis using the Jaccard similarity index22, as in similar studies23.
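For two patients with diagnosis sets A and B, the Jaccard similarity index is the number of shared diagnoses divided by the number of distinct diagnoses across both, |A ∩ B| / |A ∪ B|. The sketch below shows only this similarity measure, not the clustering step; the diagnosis labels are hypothetical stand-ins for CCS categories.

```python
def jaccard_similarity(a, b):
    """Jaccard index between two collections of diagnosis codes."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


# Hypothetical diagnosis sets for two children.
child_a = {"asthma", "epilepsy", "cerebral palsy"}
child_b = {"epilepsy", "cerebral palsy", "scoliosis"}

similarity = jaccard_similarity(child_a, child_b)  # 2 shared / 4 distinct = 0.5
```

Because the index ignores diagnoses that neither child has, it suits sparse binary data in which most of the possible codes are absent for any one patient.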
Due to the large number of CCS in each individual, only the most disabling ones were selected based on the following criteria23: a prevalence >0.5%, being five times more prevalent in the CMC than in the non-CMC population, and a median GMA >5 for the whole population. After this, a joint correspondence analysis was performed before the K-means procedure. CCS with correlations smaller than the median in each dimension were rejected or aggregated with similar ones24; some were recovered because of their clinical or socioeconomic relevance. In total, 93 of 279 CCS were selected (see Supplementary Table 1, Additional File 1).
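The three screening criteria can be sketched as a simple filter. This is an illustration under assumptions: prevalence is taken to mean prevalence within the CMC population, "five times more prevalent" is read as at least five times, and the median GMA is taken among all people with that CCS. The codes and figures below are invented for the example, and the later correspondence analysis step is not shown.

```python
def passes_screen(prev_cmc, prev_non_cmc, median_gma,
                  min_prev=0.005, min_ratio=5, min_median_gma=5):
    """Apply the three screening criteria to one CCS category."""
    return (
        prev_cmc > min_prev                       # prevalence > 0.5%
        and prev_cmc >= min_ratio * prev_non_cmc  # >= 5x the non-CMC prevalence
        and median_gma > min_median_gma           # median GMA > 5
    )


# Hypothetical summary statistics per CCS code:
# (prevalence in CMC, prevalence in non-CMC, median GMA)
ccs_stats = {
    "CCS_A": (0.040, 0.002, 12.0),   # meets all three criteria
    "CCS_B": (0.003, 0.0001, 15.0),  # too rare in the CMC population
    "CCS_C": (0.200, 0.150, 9.0),    # common in both populations
    "CCS_D": (0.050, 0.001, 3.0),    # median GMA too low
}

selected = [code for code, stats in ccs_stats.items() if passes_screen(*stats)]
```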
The final number of clusters was determined by the Calinski-Harabasz criterion25, which takes into account minimum dispersion criteria to evaluate the optimal number of clusters. Each individual belonged to a single cluster. Finally, the clusters identified were interpreted and named under the supervision of a paediatrician. Logistic regression models, adjusted for age, were fitted for all children in Catalonia to test for an association between SEP and being allocated or not to each of the CMC clusters obtained, by sex. Data analysis was conducted using STATA V.14/SE 201426.
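In its standard form, the Calinski-Harabasz index for a partition of n observations into k clusters is the ratio of between-cluster to within-cluster dispersion, each divided by its degrees of freedom: [B/(k − 1)] / [W/(n − k)]. Higher values indicate more compact, better separated clusters. The sketch below implements that textbook definition with squared Euclidean distances on toy two-dimensional points; it is not the Stata procedure used in the analysis, and the data are invented.

```python
def calinski_harabasz(points, labels):
    """Calinski-Harabasz index using squared Euclidean distances."""
    n, dims = len(points), len(points[0])
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("index is defined only for 2 <= k < n")

    def centroid(pts):
        return [sum(p[d] for p in pts) / len(pts) for d in range(dims)]

    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    overall = centroid(points)
    between = within = 0.0
    for pts in clusters.values():
        c = centroid(pts)
        between += len(pts) * sq_dist(c, overall)  # between-cluster dispersion
        within += sum(sq_dist(p, c) for p in pts)  # within-cluster dispersion
    if within == 0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


# Toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = calinski_harabasz(points, [0, 0, 0, 1, 1, 1])  # matches the true groups
poor = calinski_harabasz(points, [0, 1, 0, 1, 0, 1])  # mixes the two groups
```

When comparing candidate solutions, the number of clusters with the highest index value is the one retained; here the labelling that matches the two groups scores far higher than the mixed one.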