Tuberculosis forecasting and temporal trends by gender and age in a high endemic city of Northeast Brazil: Are we achieving the disease elimination stage?

Objective: to describe the temporal trend of tuberculosis cases according to gender and age group and to make forecasts in an endemic municipality of northeast Brazil. Method: This was a time series study carried out in a municipality in the northeast of Brazil. The population was composed of tuberculosis cases among residents of the municipality reported between 2002 and 2018. An exploratory analysis of the smoothed monthly tuberculosis detection rates, according to gender and age group, was performed. Subsequently, the progression of the trend and the forecasts of the disease were also characterized according to these aspects. For the trend forecasts, the seasonal autoregressive integrated moving average (Seasonal ARIMA) model was used, with the usual Box-Jenkins method applied to choose the most appropriate models. Results: A total of 1,620 cases of tuberculosis were reported, with an incidence of 49.7 cases per 100,000 inhabitants in men and 34.0 per 100,000 in women. The incidence showed a decreasing trend for both genders, and the trend was similar across age groups. Conclusion: The time series analysis shows a decreasing trend between 2002 and 2018; however, a significant fall in the disease before 2022 is unlikely.


Introduction
Tuberculosis has plagued humanity for approximately 8,000 years and is seen as a public health problem. Due to its relationship with poverty and unequal income distribution, aggravated by the current context of the COVID-19 pandemic around the world, the risk of becoming ill from tuberculosis can increase in populations in situations of social vulnerability [1]. Brazil is the main propagator of the disease in the Americas, contributing substantially to the number of new cases, recurrences and/or reinfections. Among the 77,000 new cases of the disease diagnosed in the country, approximately 4,500 resulted in death due to tuberculosis [2].
The probability that an individual will be infected and develop the disease depends on several factors, among them the social determinants of health and the social inequalities that affect Latin America and, therefore, the country [3,4]. Brazil ranks 20th in terms of disease burden and 19th in terms of tuberculosis/HIV co-infection. This scenario encouraged the development of the National Plan for the End of Tuberculosis as a Public Health Problem, which sets the goal of fewer than 10 cases per 100,000 inhabitants in Brazil by the year 2035. The country is among those with the greatest social inequality, which makes the elimination of tuberculosis a challenge [5,6].
There is also evidence that the country was progressively improving towards the goals of eliminating the disease, in accordance with the End TB strategy. However, there is a lack of knowledge about how this happened in sub-national units, particularly in municipalities that are highly endemic for tuberculosis.
There have been few investigations of the trends and temporal progression of tuberculosis that take into account the gender and age of the individuals affected by the disease. It has been hypothesized that the incidence of tuberculosis is higher in men than in women [7]. Considering the limited knowledge in the literature, the study aimed to describe the temporal trend of tuberculosis cases according to gender and age group and to make forecasts considering the current context in an endemic municipality in northeast Brazil [7].

Study type and scenario
This was an ecological time series study [8] conducted in the municipality of Imperatriz, located 626 km from São Luís, the capital of Maranhão state. Imperatriz is the second largest city in the state and the 23rd largest city in northeastern Brazil [9].

Study population and information sources
The study population consisted of tuberculosis cases in the municipality of Imperatriz-MA reported in the Notifiable Disease Information System (Sistema de Informação de Agravos de Notificação - SINAN) from 2002 to 2018. It should be highlighted that SINAN is a Brazilian information system responsible for registering and processing information about notifiable diseases across the country [10].

Study variables
The variables selected include the date of notification of the cases, age in years and gender (male or female). The data were collected at the health surveillance service of the Regional Management Unit of the city, government of the state of Maranhão.

Data analysis
Monthly time series of tuberculosis cases were initially constructed, considering the period from January 2002 to December 2018.
In the time series construction process, the calendar adjustment technique was applied, taking into account the number of days in each month in the subsequent calculations, aiming to improve the representation of the series over the study period. Afterwards, the general detection rate was calculated, stratified by gender (male and female) and age group (<15 years, 15 to 59 years, and ≥60 years). For the calculation of the general detection rate, the total population of the municipality was considered, and for the stratified rates, the population of men and women in the respective age ranges was considered, using a multiplication factor of 100,000 inhabitants. The resident population used as the denominator was that of 2010, the year of the most recent census (count) of the Brazilian population. The tuberculosis detection rates were smoothed using the moving average technique, considering the arithmetic mean of three months (previous, current and subsequent).
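The rate construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R code: the calendar adjustment is assumed here to rescale each monthly count to a month of average length (365.25/12 days), which is one common convention that the text does not specify, and the monthly counts and the population of 250,000 are hypothetical values.

```python
import calendar

# Assumption: counts are rescaled to a month of average length.
AVG_DAYS_PER_MONTH = 365.25 / 12


def calendar_adjust(count, year, month):
    """Rescale a monthly case count to a month of average length."""
    days_in_month = calendar.monthrange(year, month)[1]
    return count * AVG_DAYS_PER_MONTH / days_in_month


def rate_per_100k(cases, population):
    """Detection rate per 100,000 inhabitants."""
    return cases / population * 100_000


def centered_ma3(values):
    """Mean of previous, current and subsequent value; end points stay None."""
    smoothed = [None] * len(values)
    for i in range(1, len(values) - 1):
        smoothed[i] = (values[i - 1] + values[i] + values[i + 1]) / 3
    return smoothed


# Hypothetical (year, month, cases) records and a hypothetical census population.
monthly_cases = [(2002, 1, 10), (2002, 2, 8), (2002, 3, 12), (2002, 4, 9)]
population = 250_000

rates = [
    rate_per_100k(calendar_adjust(cases, year, month), population)
    for year, month, cases in monthly_cases
]
smoothed = centered_ma3(rates)
```

Because the three-month mean needs a neighbor on each side, the first and last months of a series have no smoothed value in this sketch.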
An exploratory analysis of the monthly tuberculosis detection rates (smoothed and corrected through calendar adjustment) was carried out according to gender and age group. Subsequently, the Seasonal Trend Decomposition using Loess (STL) method was applied to separate the time series into its components [11,12]. Accordingly, at each point in time $t$ the time series $X_t$ is considered to be the sum of three components: seasonality ($S_t$), trend ($T_t$) and noise ($Z_t$), that is, $X_t = S_t + T_t + Z_t$. After applying the STL method, the trends in the general detection rate, stratified by gender and age group, were extracted.
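The additive structure (series = seasonality + trend + noise) can be illustrated with a classical moving-average decomposition. This is deliberately not STL: STL estimates the components with loess smoothers, whereas the simplified stand-in below uses a centered 2x12 moving average for the trend and monthly means for the seasonal component. The function name and the synthetic series are hypothetical.

```python
def classical_decompose(x, period=12):
    """Additive decomposition: x[t] = trend[t] + seasonal[t] + remainder[t].

    Simplified stand-in for STL (moving averages instead of loess).
    Requires an even period and at least two full periods of data.
    """
    n = len(x)
    if period % 2 != 0 or n < 2 * period:
        raise ValueError("need an even period and at least two full periods")
    half = period // 2

    # Trend: centered 2 x period moving average (half weight on both ends).
    trend = [None] * n
    for t in range(half, n - half):
        window = x[t - half : t + half + 1]
        total = sum(window) - 0.5 * (window[0] + window[-1])
        trend[t] = total / period

    # Seasonal: mean detrended value per position in the cycle, centered at 0.
    sums = [0.0] * period
    counts = [0] * period
    for t in range(n):
        if trend[t] is not None:
            sums[t % period] += x[t] - trend[t]
            counts[t % period] += 1
    means = [s / c for s, c in zip(sums, counts)]
    centre = sum(means) / period
    seasonal = [means[t % period] - centre for t in range(n)]

    # Remainder: what is left after removing trend and seasonality.
    remainder = [
        None if trend[t] is None else x[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return trend, seasonal, remainder


# Synthetic series: linear trend plus a zero-sum seasonal pattern, no noise.
pattern = [1, -1] * 6
series = [0.1 * t + pattern[t % 12] for t in range(36)]
trend, seasonal, remainder = classical_decompose(series)
```

On this noise-free synthetic series the remainder is numerically zero, which makes it easy to see that the three components add back up to the original values.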
For modeling the monthly detection rates by gender, as well as forecasting the respective trends, seasonal autoregressive integrated moving average (Seasonal ARIMA) models were used, with the Box-Jenkins method applied to choose the appropriate models based on the data structure [13]. The Seasonal ARIMA model, ARIMA(p, d, q)(P, D, Q)_S, allows the variability of processes related to time, linearity, stationarity (d = D = 0) or non-stationarity (otherwise) to be described and, in its standard form, is written as follows:

$$\phi_p(B)\,\Phi_P(B^S)\,(1 - B)^d\,(1 - B^S)^D\,X_t^{(\lambda)} = \theta_q(B)\,\Theta_Q(B^S)\,Z_t$$

where $B$ is the backshift operator ($B X_t = X_{t-1}$); $\phi_p(B)$ and $\theta_q(B)$ are, respectively, the autoregressive and moving average polynomials of the non-seasonal part; and $\Phi_P(B^S)$ and $\Theta_Q(B^S)$ are, respectively, the autoregressive and moving average polynomials of the seasonal part of period $S$. $X_t^{(\lambda)}$ is the transformation applied to stabilize the variance, if necessary (usually called the Box-Cox transformation), while $Z_t$ represents the white noise process (an uncorrelated process with zero mean and constant variance).
The letters p and q represent, respectively, the numbers of autoregressive and moving average parameters within a seasonal period of length S, and the letters P and Q are the equivalent numbers of these parameters between seasonal periods. The letters d and D represent, respectively, the orders of simple differencing and seasonal differencing necessary to transform a non-stationary series into a stationary one [14].
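The role of the differencing orders d and D can be seen in a small sketch. The helper below is hypothetical and only illustrates the operation itself: one seasonal difference removes a repeating seasonal pattern, and one simple difference removes what remains of a linear trend. The series uses a period of 4 to keep the example short.

```python
def difference(x, lag=1):
    """Difference a series: y[t] = x[t] - x[t - lag].

    lag=1 gives a simple difference (order d); lag=S gives a seasonal
    difference (order D) for a seasonal period of length S.
    """
    return [x[t] - x[t - lag] for t in range(lag, len(x))]


# Hypothetical series: linear trend (2 per step) plus a period-4 pattern.
seasonal_pattern = [5, 0, -5, 0]
series = [2 * t + seasonal_pattern[t % 4] for t in range(12)]

seasonally_differenced = difference(series, lag=4)     # constant: trend only
fully_differenced = difference(seasonally_differenced)  # all zeros
```

Each difference shortens the series by its lag, which is why heavily differenced models lose observations at the start of the series.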
For the validation of the model, specifically in the analysis of residuals, tests were applied for the absence of autocorrelation (Portmanteau tests: Ljung-Box and Box-Pierce), randomness (Rank and Turning Point tests), normality (Kolmogorov-Smirnov test) and zero mean (t-test). Whenever more than one model was appropriate, the best model was chosen considering the principle of parsimony and the lowest Akaike information criterion (AIC) and Bayesian information criterion (BIC) values.
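Two of the quantities named above have simple closed forms, sketched here with the standard library only. The Ljung-Box statistic is Q = n(n+2) Σ r_k²/(n−k) over lags k = 1..h, and the information criteria are AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L. The function names, the residual values and the log-likelihood are hypothetical; turning Q into a p-value requires a chi-square distribution, which is left out of this sketch.

```python
import math


def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denominator = sum((v - mean) ** 2 for v in x)
    numerator = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return numerator / denominator


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag (no p-value computed)."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k)
        for k in range(1, max_lag + 1)
    )


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: penalizes parameters more as n grows."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical residuals with strong negative lag-1 autocorrelation.
q_stat = ljung_box_q([1, -1, 1, -1, 1, -1, 1, -1], max_lag=1)

# Hypothetical fit: log-likelihood -100 with 3 parameters on 204 months.
model_aic = aic(-100.0, 3)
model_bic = bic(-100.0, 3, 204)
```

A large Q relative to the chi-square reference indicates residual autocorrelation, meaning the candidate model has not captured all of the dependency structure.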
To assess the predictive performance, the following measures were considered: Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE), which allow the accuracy of estimates or forecasts to be assessed. According to these criteria, the most appropriate model is the one with the lowest error values [14]. Subsequently, data forecasts and trends for the quadrennium (2019 to 2022) were made. The method proposed by Box and Jenkins consists of an iterative process composed of five steps: stationarization of the time series; identification of the model and respective orders; estimation of parameters; validation of the model; and forecast of future values [12][13][14]. All analyses were performed using the RStudio version 3.5.2 software (https://rstudio.com).
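The three accuracy measures mentioned above can be written out directly. This is a minimal sketch with hypothetical observed and forecast values; note that MAPE is undefined when an observed value is zero, which matters for sparse monthly series.

```python
import math


def rmse(actual, forecast):
    """Root Mean Square Error."""
    return math.sqrt(
        sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)
    )


def mae(actual, forecast):
    """Mean Absolute Error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an observed value is zero")
    return 100 * sum(
        abs((a - f) / a) for a, f in zip(actual, forecast)
    ) / len(actual)


# Hypothetical observed and forecast detection rates.
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]

errors = {
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
    "MAPE": mape(observed, predicted),
}
```

RMSE weights large errors more heavily than MAE, so comparing the two gives a rough sense of whether a model's misses are evenly spread or dominated by a few months.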

Ethical aspects
The study was approved by the Research Ethics Committee of the University of São Paulo at Ribeirão Preto College of Nursing (EERP / USP), under protocol number 3.178.950 of 03/01/2019, following the ethical recommendations of the National Health Council, in accordance with Resolution 466/12. The study did not need subject consent as secondary data was used.

Results
In the period from 2002 to 2018, 1,620 cases of tuberculosis were identified. Of this total, an incidence of 49.7 cases per 100,000 inhabitants was observed in men and 34.0 cases per 100,000 inhabitants in women.
In the age group below 15 years of age, there was an incidence of 7.76 cases per 100,000 inhabitants, with 6.5 cases per 100,000 inhabitants in male children and 9.0 cases per 100,000 inhabitants in female children. In the 15 to 59 years age group, there was an incidence of 57.0 cases per 100,000 inhabitants for men and 34.4 cases per 100,000 inhabitants for women. The age group over 59 years of age presented an incidence of 138.0 cases per 100,000 inhabitants for men and 100.2 cases per 100,000 inhabitants for women (Table 1). Table 1 also shows that most of the tuberculosis cases in the period studied were in men in the 15 to 59 years age group. There was also a high incidence of tuberculosis cases in women aged 15 to 59 years. The lowest values for tuberculosis cases were registered in the age group below 15 years, for both genders.
Regarding the time series trends, the general incidence of tuberculosis (Figure 1A) presented a decreasing tendency. According to these analyses, peaks of cases were identified over the entire period; however, there were declines in these cases, especially in the years 2003, 2008, 2013, 2015, 2017 and 2019.
In the 15 to 59 years age group, for both genders (Figure 1B and 1D), there was a decreasing trend in tuberculosis cases over the time series; however, high detection rates of cases among women were observed.
Considering the cases of tuberculosis in the over 59 years age group, the findings reveal that in the initial years of the time series there were high rates of cases for both genders and that, in general, there was a decreasing trend in these cases (Figure 1C and 1E). It was not possible to estimate the trends in tuberculosis cases in the age group below 15 years, for either gender, as there was an excessive presence of zeros (zero-inflated series). The temporal modeling of tuberculosis detection according to gender presented a decreasing trend, revealing that the time series were not stationary. Therefore, Box-Cox transformations were performed to stabilize the variances and means, transforming the non-stationary series into stationary ones. Through the analysis of the autocorrelation functions (ACF) and partial autocorrelation functions (PACF), some candidate models were chosen and their parameters were estimated.
Specifically, Box-Cox transformations were performed to stabilize the variance and simple differencing to stabilize the mean. After verifying the significance of the model parameters and considering the lowest AIC and BIC values, the most appropriate models in terms of the ability to describe the variability of the data over time, and that also performed well in the forecasts, were: ARIMA(3,1,3)(2,0,1)[12] with drift for the mean of tuberculosis case detection in men and ARIMA(5,1,4)(0,0,2)[12] for the mean of case detection among women (Table 2). This adjustment was made for cases of tuberculosis in males and females (Figure 2).
In relation to the forecast for tuberculosis cases in men, Figure 2A indicates an expected decreasing trend in cases at the beginning of the forecast period, that is, 2019, followed by slight increases in cases in 2020 and 2021, and then a slight decreasing trend followed by stabilization in the detection rates until the end of the forecasts.
Regarding the forecasts for tuberculosis cases in women, Figure 2B indicates a slight increase in cases at the beginning of the forecast, followed by a trend of stabilization in cases from 2020 until the end of the forecasts. Both graphs show that the general expectation is a probable stabilization of tuberculosis cases in the city of Imperatriz, with some slight increases in cases in the initial forecast period, that is, 2019 and 2020, especially for women. After this period, a decrease and stabilization in cases is expected, although the incidence will remain high.

Discussion
The study aimed to describe the temporal trend of tuberculosis cases according to gender and age group and to make predictions considering the current context in an endemic municipality of northeast Brazil.
The study also showed a curious fact, that is, although tuberculosis most commonly affects men of the economically active and older adult age groups, there was a high incidence of the disease in women in the 15 to 59 years age group, an unusual phenomenon, compared with other studies carried out in Brazil and internationally [15][16][17][18][19][20].
The majority of the tuberculosis cases occurred in men, especially in the 15 to 59 years age group, that is, people classified as economically active. This may have a negative impact on their lives and those of their families, as the disease may result in removal from their workplaces, which may compromise their income and/or that of their family and contribute to the emergence or worsening of poverty [9,21,22,23]. These data indicate a warning situation, as they may signify high transmission of the disease in the population.
This can also be associated with delays in diagnosis, with social factors that make diagnosis and control difficult, and with areas lacking sufficient screening measures [24]. Furthermore, this high incidence among men may be related to behavioral factors, such as the fact that they do not frequently seek medical attention when symptoms appear, as well as operational issues related to difficulties in accessing health services in a timely manner due to incompatibility between men's working hours and those of health facilities, lack of a health policy directed toward men, and restricted access to health information [20].
In addition to these issues, socioeconomic and cultural factors have an influence, including the consumption of alcohol, tobacco and other drugs and having diabetes mellitus or lung cancer, which are known risk factors for tuberculosis and are more common in the male population [24,25].
There was also a high incidence of tuberculosis in the female population in Imperatriz, especially in the 15 to 59 years age group (Figure 1F), demonstrating a feminization of tuberculosis, a phenomenon that is particularly present in the North and Northeast regions of Brazil [26]. This phenomenon may be related not to difficulty in accessing health services, as observed in the male population, but to women being more likely to abandon treatment [27,28]. Another possible explanation for this high incidence of tuberculosis, especially in the city of Imperatriz, concerns the educational level of this population or its lack of knowledge about tuberculosis, especially in relation to the symptoms, diagnosis and treatment of the disease.
Regarding education, according to data from DATASUS referring to the female population of Imperatriz in the 15 to 59 years age group, 22.7% were classified as uneducated/incomplete 1st fundamental cycle; 20.0% had complete 1st fundamental cycle/incomplete 2nd fundamental cycle; while 57.3% reported having completed the 2nd fundamental cycle or more [28].
There is also an association between high incidences of tuberculosis in the female population and the processes of autonomy and decision-making; the domestic work burden; postponement of seeking healthcare; low levels of education; high unemployment rates; informal work; low income; and/or residing in rural areas, where distances make access to diagnostic and treatment services difficult and lead to a higher proportion abandoning treatment [27,29,30,31,32,33]. Although tuberculosis treatment is offered free of charge in Brazil through the Brazilian National Health System (Sistema Único de Saúde - SUS), financial resources are often needed to get to the health care units, in addition to expenses with food and lost working days, which can make it untenable to continue treatment [34,35]. Furthermore, women working in the informal sector need to work long hours to earn their income, leaving them no time to respond to their health needs in a timely manner and making them more likely to visit health facilities only when they are seriously ill [26].
Regarding the cases of tuberculosis in the population over 59 years of age in the city of Imperatriz, the present study showed a high incidence in both genders, especially in the initial years of the study (Figure 1D and 1G). This result is in line with other studies that found that older adults are more susceptible to falling ill, since they present a decline in immunity and have other comorbidities [16,26,33,36,37].
The results of the study also showed, especially in the trend graphs (Figure 1), pronounced decreases in tuberculosis cases in the years 2003, 2008, 2013, 2015, 2017 and 2019. This reduction in cases is consistent with the temporal trend of tuberculosis in Brazil, which shows a fall in incidence in the country's geopolitical regions [26,38,39].
The reduction in tuberculosis cases in the early 2000s in the study scenario may be a reflection of the actions implemented in the National Plan for the Control of Tuberculosis (Plano Nacional para Controle da Tuberculose - PNCT), created in 1998, in which the program coverage was extended to 100% of the municipalities, with directly observed treatment (DOT) [40,41]. In addition, this plan also aimed to integrate tuberculosis control with primary care, including the Community Health Agents Program (Programa de Agentes Comunitários de Saúde - PACS) and the Family Health Program (Programa de Saúde da Família - PSF), to ensure effective expansion of access to diagnosis and treatment [41].
In 2009, there were changes in national policies regarding active case finding, monitoring and treatment of tuberculosis in Brazil, which resulted in the reduction of new cases [39].  [42] and the implementation of Active Case Finding for Respiratory Symptomatic Patients; the National Tuberculosis Control Program in 2017 [5] and implementation of the National Plan for the End of Tuberculosis as a Public Health Problem, having, among other goals, the aim of ending tuberculosis as a public health problem by 2035 [39].
Considering the forecasts, despite the decrease in the detection of the disease, the municipality under study will continue to present cases of tuberculosis over the quadrennium 2019 to 2022, as the forecasts indicate a curve with a stationary trend. This imposes challenges for public managers with regard to more effective and efficient strategies of active case finding and directly observed treatment, as well as improvements in the laboratory network for quality diagnoses, with results delivered rapidly and treatment started as early as possible. From this perspective, the findings of this study lead to a discussion on the behavior of the disease in the context of the scenario studied, revealing that the trends found in the study period and in the forecasts indicate that the male and female populations in the 15 to 59 years age group will still be the most affected by this disease.
Regarding the temporal modeling stage, the ARIMA models selected for the total detection and for the rates according to gender presented adequate fits and captured the data dependency structure, that is, they effectively described the variability of the detection rates over time.
Limitations of the study related to the characteristic bias of ecological studies should be highlighted: the findings of this investigation cannot be inferred on a case-by-case basis and are representative only at the population level. Furthermore, the acquisition of information through secondary data can lead to errors or flaws inherent in the notification or recording of data and to possible bias in the investigation, for example underreporting, with the potential presence of ignored or incomplete data.
It should be highlighted that the pandemic scenario was not considered, which may influence the incidence of tuberculosis in the coming years; however, as in other aspects, there are no concrete affirmations that can consider the specificities of the Brazilian territory. In general, the findings showed the problem of tuberculosis in the city of Imperatriz over the years, where, prior to this study, statistical resources of temporal analysis had not been used as a tool to identify this problem.
It is believed that the evidence gathered in the study can contribute to tuberculosis control actions in the municipality, through the public agents, indicating measures to improve the population's health, as well as being an instrument for articulation with the FHS teams, guiding their health promotion actions in the municipality.
In conclusion, the study showed that men and women in the socioeconomically active age group are those most commonly affected by tuberculosis. The study demonstrated a decreasing trend in tuberculosis, although at a rate below WHO expectations, and, according to the forecasts, TB will remain at levels of hyperendemicity until 2022, which indicates that the elimination of the disease is more distant than it appears.

Declarations
Competing interest: No