Mapping priority neighborhoods: A novel approach to cluster identification in HIV/AIDS population


Background: Urban disadvantaged neighborhoods have higher HIV risk behavior and higher levels of AIDS-related mortality. Studies demonstrate that community-level interventions focusing on risk groups have higher success rates than individual patient-based management in the context of HIV/AIDS. We tested a novel approach to identify population groups in need of greater public health efforts to achieve the UNAIDS 90-90-90 targets.
Methods: We extracted retrospective data on 2141 HIV/AIDS patients, recruited from 1997 to 2017 in the regional hospitals in French Guiana. Self-organizing maps (SOMs) were constructed and clusters were identified based on demographic and socioeconomic variables such as age, sex, CD4 counts at nadir, type of neighborhood, unemployment rate, and the presence of illnesses such as histoplasmosis and hepatitis B in the sample population.
Results: Neighborhood unemployment rates were found to have a large impact on the distribution of HIV/AIDS. In addition, the risk of disseminated histoplasmosis, the most common AIDS-defining illness in French Guiana, was not associated with any particular neighborhood, suggesting that urban socioeconomic features are not the primary drivers of exposure risk.
Conclusion: Socioeconomically disadvantaged neighborhoods remain hotspots for HIV/AIDS. We conclude that SOMs are an effective tool for identifying risk clusters, which may guide public health efforts to optimize HIV prevention and testing in French Guiana and other developing countries.


Background
The residential environment, particularly in disadvantaged neighborhoods, has been shown to affect individual behavior and health through direct or indirect socioeconomic processes (1). Neighborhoods have emerged as a determinant of public health, providing both the objective physical infrastructure and the social characteristics that influence individual and community health (2). Exploring neighborhood effects is thus important because it informs community-based interventions to promote population health and health equity in regions with racial/ethnic minorities.
Characteristics of disadvantaged neighborhoods often correlate with individual HIV risk behavior (3). High school dropout rates, low employment rates, substance and alcohol abuse, and increased street violence usually correlate with greater HIV risk (4). Furthermore, neighborhoods in which such features cluster exhibited higher HIV mortality and delayed anti-retroviral therapy (ART) initiation (3).
Histoplasmosis is one of the most frequent, and often overlooked, endemic disseminated mycoses in HIV patients in the Americas (5,6). The disease is caused by the inhalation of the microconidia and mycelial forms of a fungus, Histoplasma capsulatum var. capsulatum (HC), found in guano-enriched soils. In French Guiana (FG), histoplasmosis is the most common AIDS-defining event and the leading cause of AIDS-related deaths (7), with 75% of HIV-infected persons being foreign citizens. The fight against HIV has struggled to reach the undiagnosed reservoir, with 30% of patients in Cayenne and 50% in Saint Laurent du Maroni, the two major cities of FG, diagnosed with advanced HIV disease, and with certain population groups particularly at risk of late testing. Although individual epidemiologic risk factors are important, spatialized community approaches to HIV programs seem to have an operational advantage. We thus tested the hypotheses that the distribution of HIV/AIDS and that of AIDS-related histoplasmosis varied between neighborhoods using self-organizing maps, a two-dimensional data visualization tool that is trained using an unsupervised process.

Study Settings
FG is a French overseas territory located in the intertropical zone of South America. In January 2015, France rolled out "le Quartier Prioritaire de la Politique de la Ville (QPPV)", or priority district of the city's policy, to identify and improve the socially disadvantaged areas in France. Neighborhoods were identified by the poverty rate, defined by INSEE as the proportion of the population living below 60% of the median metropolitan living standard. In FG, the QPPV identified 32 priority neighborhoods in the urban, suburban, and peri-urban regions. In our study, we refer to the QPPV neighborhoods as urban disadvantaged neighborhoods.
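The poverty-rate criterion described above can be sketched as a short computation. This is only an illustration of the stated definition (share of residents below 60% of a reference median); the function name `poverty_rate`, its parameters, and all living-standard values are hypothetical and are not taken from INSEE or from the study data.

```python
from statistics import median


def poverty_rate(neighborhood_standards, reference_standards, threshold=0.6):
    """Share of residents whose living standard falls below
    `threshold` times the median of the reference population."""
    if not neighborhood_standards or not reference_standards:
        raise ValueError("both populations must be non-empty")
    poverty_line = threshold * median(reference_standards)
    below = sum(1 for value in neighborhood_standards if value < poverty_line)
    return below / len(neighborhood_standards)


# Made-up monthly living standards, for illustration only.
reference = [900, 1200, 1500, 1800, 2000, 2400, 3000]  # median = 1800
neighborhood = [600, 800, 1000, 1100, 1700]

# Poverty line = 0.6 * 1800 = 1080, so three of five residents fall below it.
rate = poverty_rate(neighborhood, reference)
```

A neighborhood would then be ranked or flagged according to this rate; the cut-off actually applied to designate a QPPV is not stated here and is therefore left out of the sketch.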
In FG, all patients diagnosed with HIV receive free ART regardless of their socioeconomic status or country of origin. HIV care is accessible and is comparable to mainland France.

Patient Data
HIV-positive patients in follow-up at the hospitals of major cities in FG between April 1995 and May 2017 were enrolled in the French Hospital Database for HIV (FHDH). Western Blot was used to confirm the HIV status of all patients enrolled in the study.

Self-organizing maps (SOMs)
We used self-organizing maps to identify the clusters that would benefit most from public health measures. SOMs are a type of artificial neural network that consists of an array of units, called nodes, arranged in fixed positions on a grid. The key feature of the SOM is that the topology of the original input data is conserved, i.e. similar inputs are grouped together on the map. We trained the SOM model of 2120 nodes and variables (n = 9) with 2000 iterations at a learning rate between 0.05 and 0.01 to reach a minimum plateau. The nodes were modeled on a 20 × 20 grid with a hexagonal topology.
Each node represented an HIV patient and his/her neighborhood with its position fixed in the grid. The variables included age, sex, CD4 counts at nadir, neighborhood type, CDC classification of HIV, histoplasmosis/hepatitis positivity, unemployment rate and neighborhood security. All SOM statistics were performed using the 'kohonen' package in R version 3.6.1.
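To make the training procedure concrete, the following is a minimal, self-contained sketch of an online SOM on a hexagonal grid. It is not the authors' code and does not reproduce the 'kohonen' package: the function names, the Gaussian neighborhood function, the linear decay of learning rate and radius, and the toy two-group data are all illustrative assumptions.

```python
import math
import random


def hex_grid(rows, cols):
    """Fixed node positions on a hexagonal lattice (odd rows shifted by 0.5)."""
    return [(c + 0.5 * (r % 2), r * math.sqrt(3) / 2)
            for r in range(rows) for c in range(cols)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(codebook, sample):
    """Index of the node whose weight vector is closest to the sample."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], sample))


def train_som(data, rows, cols, epochs, alpha=(0.05, 0.01), seed=0):
    rng = random.Random(seed)
    coords = hex_grid(rows, cols)
    # Assumption: initialise each node with a randomly chosen observation.
    codebook = [list(rng.choice(data)) for _ in coords]
    radius_start = max(rows, cols) / 2
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for sample in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = step / total_steps
            # Assumption: learning rate and radius both decay linearly.
            lr = alpha[0] + (alpha[1] - alpha[0]) * frac
            radius = max(radius_start * (1 - frac), 0.5)
            bmu = best_matching_unit(codebook, sample)
            for i, pos in enumerate(coords):
                # Gaussian neighborhood: nodes near the BMU on the grid move more.
                h = math.exp(-sq_dist(pos, coords[bmu]) / (2 * radius ** 2))
                codebook[i] = [w + lr * h * (x - w)
                               for w, x in zip(codebook[i], sample)]
            step += 1
    return coords, codebook


# Toy data: two well-separated groups in two dimensions (made up).
rng = random.Random(1)
blob_a = [(rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)) for _ in range(20)]
blob_b = [(10 + rng.uniform(-0.5, 0.5), 10 + rng.uniform(-0.5, 0.5)) for _ in range(20)]
data = blob_a + blob_b

coords, codebook = train_som(data, rows=4, cols=4, epochs=20)
bmu_a = best_matching_unit(codebook, (0.0, 0.0))
bmu_b = best_matching_unit(codebook, (10.0, 10.0))
```

In this toy run, observations from the two groups are assigned to different nodes, which is the topology-preserving behavior the text describes. In practice the input variables would typically be standardized before training so that no single variable dominates the distance.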

Ethics
All patients enrolled in the FHDH gave written informed consent to the use of data for research. The data are anonymized and encrypted before transfer to the Ministry of Health and the Institut National de la Santé et de la Recherche Médicale (INSERM), which centralize data from the Regional Coordination for the fight against HIV (COREVIH) across France. This cohort has been ethically approved by the Commission Nationale Informatique et Libertés (CNIL) since 1992 and has been used for numerous publications.

Sociodemographic characteristics
We recruited a sample population of 2141 patients diagnosed with HIV between 1997 and 2017 in FG.
The sample consisted of 1101 (51.4%) male and 1040 (48.6%) female patients with an average age of

SOM model and clusters
The heatmaps illustrate the distribution of seven variables in the sample population (Fig. 1A-G). We used hierarchical agglomerative clustering (HAC) to determine the optimal number of clusters in the sample population (Fig. 2). The demographic and socio-economic clusters in the HIV-related histoplasmosis population in FG are visualized in Fig. 1H. We observed that the largest cluster (Fig. 1H; blue; n = 1365) included people living with HIV/AIDS (PLHA) with a low employment rate living in urban disadvantaged neighborhoods and rural areas. The cluster also included regions where crime and insecurity are higher than in richer neighborhoods. The significantly smaller second cluster (Fig. 1H; green) was represented by cases from favorable suburbs with a higher employment rate and a relatively higher proportion of cases belonging to CDC class A. The histoplasmosis (Fig. 1H; purple; n = 113) and hepatitis (red; n = 141) cases were grouped in separate clusters. The hepatitis and histoplasmosis clusters had a significantly higher proportion of males (P < 0.0001), higher mortality (P = 0.009 and P = 0.002, respectively), and lower CD4 counts at nadir (P = 0.008 and P < 0.0001, respectively). The descriptive statistics are detailed in Supplementary Table 1. The smallest cluster (Fig. 1H; orange; n = 109) comprised cases with higher CD4 counts at nadir.
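The clustering step can be illustrated with a naive agglomerative procedure applied to SOM weight vectors. This is a sketch under stated assumptions, not the authors' analysis: the linkage criterion is not reported in the text, so complete linkage is assumed here, and the function name and the small codebook are made up for illustration.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerative_clusters(vectors, n_clusters):
    """Merge the two closest clusters repeatedly until `n_clusters` remain.
    Assumption: complete linkage (distance between the farthest members)."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        return max(euclidean(vectors[i], vectors[j]) for i in c1 for j in c2)

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        a, b = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    # Convert the list of clusters into one label per input vector.
    labels = [0] * len(vectors)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical node weight vectors forming three visibly separate groups.
codebook = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0)]
labels = agglomerative_clusters(codebook, n_clusters=3)
```

Each SOM node then inherits the label of its weight vector, which is what a segmented map such as Fig. 1H displays. Choosing the number of clusters from the dendrogram (Fig. 2) corresponds to choosing where to stop merging.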

Discussion
Here, for the first time, we mapped the neighborhood effect on HIV/AIDS using self-organizing maps and showed that the spatial boundaries of HIV/AIDS distribution follow those of neighborhoods. Our second hypothesis, that HIV-related histoplasmosis risk differed by neighborhood, was rejected, as the risk of developing histoplasmosis in HIV/AIDS was similar in both the socioeconomically privileged and underprivileged neighborhoods.
We found unemployment to have a higher structural impact than the other tested variables on the identified clusters. Studies demonstrate that unemployed HIV patients had a higher risk of HIV mortality and disease progression than those with stable employment in the HAART era (8). This is consistent with our results, which showed that HIV-infected persons in economically poorer neighborhoods had higher odds of progressing to AIDS. Unemployed individuals have less access to, and/or underutilize, preventive healthcare services compared with their employed counterparts (9), which potentially delays HIV diagnosis and treatment. Furthermore, economic insecurity amongst youth, particularly women, has been shown to increase the HIV burden in developing countries including FG (10,11).
Our results showed that neighborhood insecurity formed a sizable portion of the largest cluster. Research has shown that neighborhood crime is associated with HIV risk-taking behaviors such as unprotected sex and multiple sexual partnerships (11,12). Furthermore, a recent study observed that HIV-related mortality exceeded 8% among released prison inmates in FG (13).
UNAIDS aims for 90% of HIV-infected persons to be aware of their diagnosis, 90% of those diagnosed to be on antiretroviral treatment, and 90% of those treated to be virologically suppressed by 2020. However, the first 90% seems hard to reach in most countries, notably in FG, where the proportion of patients diagnosed each year with advanced HIV disease remains stable despite efforts to scale up and diversify HIV testing. The present approach may bring strategic and operational insights to improve the capacity to reach the hidden reservoir of undiagnosed infections (14).
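Because each 90-90-90 target is conditional on the previous step, meeting all three corresponds to 0.9 × 0.9 × 0.9 = 72.9% of all HIV-infected persons being virologically suppressed. The sketch below spells out this arithmetic; the function name `care_cascade` and the counts are hypothetical and do not describe FG.

```python
def care_cascade(infected, diagnosed, on_art, suppressed):
    """Conditional proportions at each step of the care cascade,
    plus the overall share of infected persons who are suppressed."""
    if not (infected >= diagnosed >= on_art > 0 and 0 <= suppressed <= on_art):
        raise ValueError("counts must be nested and positive")
    return {
        "diagnosed_of_infected": diagnosed / infected,
        "treated_of_diagnosed": on_art / diagnosed,
        "suppressed_of_treated": suppressed / on_art,
        "suppressed_of_infected": suppressed / infected,
    }


# Hypothetical population that exactly meets the three targets.
cascade = care_cascade(infected=1000, diagnosed=900, on_art=810, suppressed=729)
```

A shortfall at the first step, as described for FG, lowers the overall suppressed share regardless of how well the later steps perform.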
We observed that the risk of histoplasmosis in PLHA was similar in urban neighborhoods and rural regions. Thus, it seemed that histoplasmosis in FG was ubiquitous, its incidence influenced by the environmental distribution and endemicity of spores rather than by socioeconomic factors or microenvironmental nuances. In our study, histoplasmosis, the most frequent AIDS-defining infection in FG, was a significant cluster in PLHA (15). Our results on the histoplasmosis cluster are consistent with previous studies from FG (7), where the risk factors include male sex and a CD4 count at nadir of < 50 cells/mm³. We also observed higher mortality in the histoplasmosis and hepatitis clusters, which demonstrates the significance of cluster identification for community-level management of HIV/AIDS. We did not analyze the environmental factors influencing the distribution of histoplasmosis, which leaves scope for future studies to analyze the impact of soil acidity and rainfall on disease incidence. In addition, the lack of further socioeconomic variables, such as the built environment, is another limitation of our study.

Conclusions
We conclude that SOMs applied to HIV/AIDS cases are an effective tool to identify and prioritize clusters in a large dataset. Whilst unemployment and neighborhood insecurity have a significant impact on HIV risk and socioeconomically disadvantaged neighborhoods remain hotspots for HIV/AIDS, histoplasmosis was found to depend on the environmental distribution of the pathogen's spores. Public health efforts should prioritize disadvantaged neighborhoods for HIV/STI awareness, and for the screening and management of hepatitis and histoplasmosis.

Abbreviations
AIDS: Acquired Immunodeficiency Syndrome
ART: Anti-Retroviral Therapy
CDC: Centers for Disease Control and Prevention
FG: French Guiana

Declarations
Ethics approval and consent to participate
All patients enrolled in the FHDH gave written informed consent to the use of data for research. The data are anonymized and encrypted before transfer to the Ministry of Health and the Institut National de la Santé et de la Recherche Médicale (INSERM), which centralize data from the Regional Coordination for the fight against HIV (COREVIH) across France. This cohort was ethically approved by the Commission Nationale Informatique et Libertés (CNIL), the regulatory authority, in 1992 and has been used for numerous publications.

Figure 1
Self-organizing map (SOM) model of 2120 nodes on a 20 × 20 grid. The color spectrum on the y-axis represents the scale of the map. A) demonstrates the structure of the HIV/AIDS population, where patients living in urban disadvantaged neighborhoods are illustrated in red, those in rural areas in the yellow-green spectrum, and those in favorable urban neighborhoods in blue; B-E) illustrate the spectrum of severity of HIV/AIDS; F & G) illustrate the neighborhood unemployment rate and insecurity; H) represents the hierarchical agglomerative clustering and segmentation on the SOM model.

Figure 2
Cluster dendrogram to determine the optimal number of clusters in the SOM model. The colors correspond to the colors of the SOM model.