Sociodemographic Factors Associated With Knowledge of Type 2 Diabetes in Rural Tamil Nadu, India: a Cross-sectional Study

Background: This study aimed to investigate the overall awareness of type 2 diabetes as well as how sociodemographic factors influence diabetes knowledge. With India having the second highest prevalence of diabetes globally, it is increasingly important to assess how diabetes can be addressed in rural Indian populations. Methods: Systematic random sampling was used to gather study participants in 17 villages within the Krishnagiri district of Tamil Nadu, India. Associations between sociodemographic factors and composite diabetes knowledge score were assessed using a multinomial logistic gllamm model. Results: A total of 753 individuals participated in the study. Overall knowledge on diabetes was low, with 66% of individuals having no knowledge regarding diabetes. Achieving a moderate knowledge score was significantly positively associated with education, wealth, participation in MGNREGA (Mahatma Gandhi National Rural Employment Guarantee Act), and business ownership as a source of income. Achieving a good knowledge score was significantly positively associated with education, wealth, participation in MGNREGA, business ownership as a source of income, and frequency of healthcare typically received. Rurality was significantly negatively associated [Relative Risk Ratio (95% CI)] with both moderate knowledge score [0.34 (0.19, 0.59)] and good knowledge score [0.43 (0.24, 0.74)]. The strongest predictor of having a good knowledge score was having a high school graduate or post-secondary education [11.07 (4.44, 27.61)]. Enrolment in MGNREGA employment was the strongest predictor of having a moderate knowledge score [3.27 (1.93, 5.54)], and was also strongly associated with having a good knowledge score [2.39 (1.31, 4.36)]. Conclusion: The low awareness of diabetes seen in this study raises serious concerns for public health in India. Public health efforts must prioritize health equity to lessen the impacts of diabetes in rural populations, where individuals face systemic barriers to receiving prevention and treatment for conditions such as diabetes.


Introduction
Type 2 diabetes mellitus (hereafter referred to as diabetes) is a disease of global health concern. The International Diabetes Federation

Ethics and Consent
The research team obtained clearance for the study from a Canadian institutional ethics review board. Permission for the study was granted by the High Commission of India in Ottawa, Canada. Upon arrival at the research site and prior to the recruitment process, researchers sought and received permission for the study from local authorities (panchayat councils, local police officials, and hospital medical staff). Informed verbal consent was obtained from all research participants prior to enrollment in the study.

Study Design and Data Collection
Specific sampling and data collection methods for this cross-sectional study are described elsewhere [9]. In brief, the research team conducted systematic random sampling to recruit adult participants (19 years and older) from 17 villages in a rural region of Krishnagiri District, Tamil Nadu. Following recruitment and informed consent, a survey was administered to participants by a trained researcher to collect information on demographics, occupation and livelihood characteristics, self-reported health, and household assets. Knowledge on diabetes was collected using a validated questionnaire developed by the Madras Diabetes Research Foundation [19].

Variable Definitions and Explanations
Although all villages included in the study were rural by the definition used by the Census of India, the rurality of each village was assessed as a predictor variable using a rurality index (RI), adapted from Weinert and Boik [20]. The two variables incorporated into the RI were distance to the primary healthcare centre in kilometers (given half a positive weighted value) and the population size of each village (given a full negative weighted value). The results were standardized to a mean of zero and standard deviation of one, with a positive score indicating a more rural residence, and a negative score reflecting a less rural residence. In this study, the 17 villages ranged in size from 30 to 1200 households per village.
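The weighting described above can be illustrated with a short sketch. It is not the authors' code: it assumes that each input is first converted to a z-score, that the two z-scores are combined with weights of +0.5 (distance) and -1.0 (population), and that the combined score is standardized again using the population standard deviation. The function names and the four example villages are hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def rurality_index(distance_km, population):
    """Combine distance (+0.5 weight) and population (-1.0 weight), then standardize.

    Assumption: both inputs are z-scored before weighting; the source does not
    state the exact order of operations.
    """
    z_dist = zscores(distance_km)
    z_pop = zscores(population)
    raw = [0.5 * d - 1.0 * p for d, p in zip(z_dist, z_pop)]
    return zscores(raw)


# Hypothetical villages: distance to the primary healthcare centre and population size.
distance_km = [2.0, 8.0, 15.0, 5.0]
population = [5000, 1300, 150, 2600]
ri = rurality_index(distance_km, population)
```

Under these assumptions, the small and remote third village receives the highest (most rural) score and the large, nearby first village the lowest.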
Data on household assets and community facilities (such as toilet facilities and water pumps, if applicable) were collected to assess each participant's SES. These data were collected using a questionnaire adapted from the second National Family Health Survey (NFHS-2), which originally consisted of 29 questions used to create a Standard of Living Index [21]. Those questions relevant to the study population were used, with a total of 13 weighted questions for a maximum score of 26. Weights of items such as type of housing, community or household water and toilet facilities, household possession of TVs/radios, and ownership of livestock, were developed by the International Institute for Population Sciences in India and based on a priori knowledge of the significance of each indicator in determining household SES position [22].
We collected additional demographic information from each participant, including religion, caste, and sources of income in the last year. Behavioural information such as time per day spent watching TV, as well as type and frequency per month or per year of healthcare usually accessed, was also collected. Religion was assessed as a binary variable (Hindu or 'other'), and caste was categorized into low caste (scheduled tribes and caste), lower-middle caste (backwards castes, most backwards castes, other backwards castes), middle-upper caste (general category), and high caste (Brahmin caste). Data on occupation were collected by asking about sources of income within the last year. These sources included local labour, agriculture, livestock, migrant work, merchant work, shop/small business, government schemes, or MGNREGA (Mahatma Gandhi National Rural Employment Guarantee Act). For ease of interpretation, livelihood information was categorized into local labour, farming (agriculture and/or livestock), migrant work, business (merchant and/or small business owner), or government funds (MGNREGA and/or other government schemes), and assessed as binary variables (yes or no for each individual source of income). Time per day spent watching TV was collapsed into four categories: less than 0.5 hours per day, between 0.5 hours and 1 hour per day, more than 1 hour up to 2 hours per day, or more than 2 hours per day.
Age was categorized into four groups, specifically: 20-34, 35-49, 50-64, and 65+, based on similar methods used by Shrivastava et al. [16] and Murugesan et al. [13]. Type of healthcare typically accessed included government, private, natural (i.e. ayurvedic and/or other alternative medicines), or none, and each was assessed as a binary variable. Frequency of healthcare visits was collapsed into three categories: once a month or more, less than once a month but more than once a year, and once a year or less. School grade achieved was used to assess education, ranging from 0 to 15, with anything above 12 indicating a post-secondary education. For ease of interpretation, this variable was categorized into 'no schooling', 'primary education' (grades 1-8), 'secondary education' (grades 9-11), and 'graduate or post-secondary education' (grades 12+), on the basis of similar methods used previously among Indian populations [13,19].
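As an illustration of the education recoding described above, the following minimal sketch maps a school grade (0 to 15) to the four categories. The function name is hypothetical, and the sketch assumes grades are recorded as whole numbers.

```python
def education_category(grade):
    """Map highest school grade achieved (0-15) to the four study categories."""
    if not 0 <= grade <= 15:
        raise ValueError("grade must be between 0 and 15")
    if grade == 0:
        return "no schooling"
    if grade <= 8:
        return "primary education"      # grades 1-8
    if grade <= 11:
        return "secondary education"    # grades 9-11
    return "graduate or post-secondary education"  # grades 12+
```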
Knowledge on diabetes was collected using a validated questionnaire developed by Mohan et al. for the Chennai Urban Rural Epidemiology Study. The questionnaire and weights for each question are explained in detail elsewhere [19]. It included questions about diabetes such as risk factors, complications, and prevention, which were used to calculate a composite knowledge score ranging from 0 to 8. The first question, "Do you know what diabetes is?" acted as a screening question such that those who answered "no" were automatically given a score of 0 and did not answer the remaining questions.

Statistical Analysis
Data were cleaned using Microsoft Excel. All statistical analyses were conducted using Stata IC 16.1. Due to zero inflation and heteroscedasticity of residuals, a linear regression model was not appropriate to model the composite knowledge score as a continuous variable. The diabetes knowledge score was categorized as: a score of 0; scores of 1-4 (moderate knowledge); and scores of 5-8 (good knowledge).
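The screening rule and the three-level outcome can be sketched as follows. This is an illustrative helper rather than the authors' procedure: the function name and the 0/1/2 coding are assumptions, and composite scores are assumed to be whole numbers between 0 and 8.

```python
def knowledge_category(knows_diabetes, composite_score=0):
    """Return 0 (no knowledge), 1 (moderate, score 1-4) or 2 (good, score 5-8).

    knows_diabetes: answer to the screening question "Do you know what diabetes is?"
    composite_score: weighted questionnaire score (ignored if screening answer is no).
    """
    if not knows_diabetes:
        return 0  # screened out: automatically scored 0
    if not 0 <= composite_score <= 8:
        raise ValueError("composite score must be between 0 and 8")
    if composite_score == 0:
        return 0
    return 1 if composite_score <= 4 else 2
```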
Descriptive analyses were first conducted to establish the sociodemographic characteristics of the study population, overall diabetes knowledge scores, knowledge scores broken down by education level, and the proportion of correctly answered questions from the diabetes knowledge survey. Following this, we tested collinearity by calculating the intraclass correlation coefficient (ICC). An ICC higher than 0.8 was the cut-off point used to determine collinearity between variables. The linearity assumption was also tested against continuous predictor variables and the log-likelihood of the outcome. Since a multinomial regression was used, each predictor was tested against binary categories of the outcome using the log-likelihood (i.e., knowledge score in category 1 versus 0, 2 versus 0, and 1 versus 2). A lowess curve was first used to assess this assumption, followed by the inclusion of a quadratic term. Variables that proved to be non-linear and could not be modelled with a quadratic term were then categorized.
To control for confounding and perfect prediction of the outcome based on diabetes diagnosis, all individuals with self-reported diagnosed diabetes were omitted from the data analysis. All sociodemographic factors were then fit in univariable analyses with the outcome, using 0 as the referent outcome category. These variables included age, sex, wealth index, education, religion, caste, methods of earning income, rural index, TV exposure, and type and frequency of healthcare usually received. Only independent variables significant at a liberal p-value threshold of 0.2 were included in the initial multivariable model. A multinomial regression model was used to assess associations between sociodemographic factors and knowledge score categories, with a knowledge score of 0 as the referent. Due to clustering of the data by village, village was added to the model as a random effect variable. To incorporate the use of a random effect in a multinomial model, the Stata program gllamm was used to fit a generalized linear latent and mixed model (GLLAMM), using village as a discrete random effect, and knowledge score as a polytomous outcome with the mlogit link. Adaptive quadrature was also used to ensure the most precise estimates were given for the log-likelihood of the two-level model [23].
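For orientation, a two-level multinomial logit model of this kind can be written in a generic form. The notation below is an assumption introduced for illustration (individual $i$ in village $j$, outcome category $k$, covariate vector $\mathbf{x}_{ij}$, village-level random intercept $u_j^{(k)}$); the source does not state whether the random intercept is shared across outcome categories or what distribution it was given.

```latex
\[
\log\frac{\Pr(Y_{ij}=k)}{\Pr(Y_{ij}=0)}
  = \beta_0^{(k)} + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta}^{(k)} + u_j^{(k)},
  \qquad k \in \{1, 2\},
\]
\[
\mathrm{RRR}_p^{(k)} = \exp\bigl(\beta_p^{(k)}\bigr).
\]
```

Here category 0 (no knowledge) is the referent, and the relative risk ratio for predictor $p$ in category $k$ is the exponentiated coefficient, which is the quantity reported as RRR in the results.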
A manual backward elimination method was used to fit the final model. Each variable was removed independently from the full model and the partial model was then tested for significance (p < 0.05), using likelihood-ratio tests and Wald tests. As each variable was removed from the model, it was simultaneously tested for confounding by investigating if any coefficients of interest changed by 20% or more. A causal diagram created before analysis was used to identify which variables could act as confounders in the full model. All plausible interactions were then generated and assessed for significance using a p-value of < 0.05. Once the multivariable model was finalized, we performed diagnostics for gllamm using the gllapred command [23]. Upper residuals (empirical Bayes predictions) were produced, alongside Pearson and deviance residuals. Upper residuals were tested for homoscedasticity and normality, and Pearson and deviance residuals were used to assess outliers and their impact on the model.
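The 20% change-in-estimate rule can be expressed as a small check. This is a sketch only: the function names are hypothetical, and it assumes the change is measured relative to the coefficient in the full model, which the source does not specify.

```python
def relative_change(full_coef, reduced_coef):
    """Absolute change in a coefficient, relative to its value in the full model."""
    return abs(reduced_coef - full_coef) / abs(full_coef)


def flags_confounding(full_coef, reduced_coef, threshold=0.20):
    """True if removing a variable shifts a coefficient of interest by >= threshold."""
    return relative_change(full_coef, reduced_coef) >= threshold
```

For example, with hypothetical log-scale coefficients, a shift from 1.00 to 1.25 would be flagged as possible confounding, whereas a shift from 1.00 to 1.10 would not.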

Results
The response rate among the 812 recruited individuals was 92.7%, with 753 individuals completing the diabetes knowledge survey (341 males and 412 females). Descriptive characteristics of the study participants are presented in Table 1. The average age of participants was 47 years (SD ± 14.7 years). Over half of the population (59.1%) had no formal schooling, with only 35.1% reporting full literacy (could read and write). Only about one-third of the population reported knowing about diabetes. More men than women reported having any knowledge of diabetes (36.9% vs. 30.3%); however, this difference was not significant. Those who reported being aware of diabetes were then invited to participate in the full diabetes knowledge survey. Full results of the diabetes knowledge questionnaire are reported in Table 2. Of those who reported awareness of diabetes, 62.6% answered "yes" to the question, "Do you know if diabetes is increasing?". About half of the participants who were aware of diabetes believed that it is preventable. In total, only 16.4% and 17.0% achieved moderate and good knowledge scores, respectively. Interestingly, even among those with the highest level of education, 42.2% did not know what diabetes was. Among those with no education, only 9.0% were in the 'good knowledge' category. The breakdown of participants in each category of knowledge score by sex is available in Table 3, and by education level in Table 4. Variables that were not significant through the maximum likelihood and Wald tests were removed before fitting the final model.
The final multivariable associations are presented in Table 5. Wealth, education, rurality, frequency of healthcare visits, and business and government funds as a source of income, all had a significant association with moderate and/or good knowledge of diabetes in the final model. Notably, education was strongly associated with a good knowledge score in the final multivariable model. Specifically, those who were in the highest education category (having graduated high school or had higher post-secondary education), were over 11 times more likely to have a good knowledge score over no knowledge, compared to those having no formal education as the referent [RRR 11.1, 95% CI 4.4, 27.6]. A gradual increase in relative risk ratios (RRRs) for having a good knowledge score was seen with each increasing category of education, i.e. primary (grade 1-8), secondary (grade 9-11), and high-school graduate or post-secondary (grade 12+), respectively (RRRs = 3.2, 8.4, 11.1); however, this increasing trend was not seen for having a moderate knowledge score (RRRs = 1.8, 3.2, 1.3). Besides education, there were positive associations between moderate knowledge score and wealth, ownership of a business, and government funds as a source of income. Similarly, there were positive associations between good knowledge score and wealth, ownership of a business, and government funds as a source of income. Having a good knowledge score was also positively associated with frequency of healthcare received, meaning those with less frequent healthcare visits had a lower relative risk of having a good knowledge score. Rurality was negatively associated with any knowledge of diabetes, indicating that those participants living in more rural locations had lower relative risk of having a moderate or good knowledge score.
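To make the link between model coefficients and the reported RRRs concrete, the sketch below exponentiates a log-scale coefficient and its Wald interval, and recovers the log-scale coefficient and standard error from a reported interval. It assumes symmetric Wald intervals on the log scale with z = 1.96; the function names are hypothetical, and the input interval (4.44, 27.61) is the one reported in the abstract for the highest education category.

```python
import math


def rrr_with_ci(coef, se, z=1.96):
    """Exponentiate a log-scale coefficient and its Wald confidence limits."""
    return math.exp(coef), math.exp(coef - z * se), math.exp(coef + z * se)


def log_scale_from_ci(lower, upper, z=1.96):
    """Recover the log-scale coefficient and standard error from an RRR interval."""
    coef = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return coef, se


# Interval reported in the abstract for graduate or post-secondary education.
coef, se = log_scale_from_ci(4.44, 27.61)
rrr, ci_low, ci_high = rrr_with_ci(coef, se)
```

Under this assumption the recovered point estimate is the geometric mean of the interval limits, which agrees with the reported RRR of 11.07 to within rounding.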

Discussion
This cross-sectional survey of adults residing in rural areas of Tamil Nadu shows low knowledge of diabetes among the population. Of particular concern, even those sub-groups of participants with higher wealth and education demonstrated a lack of knowledge of diabetes. Among those in the highest education category, over 40% had a knowledge score of zero, meaning they answered 'no' to the question, "Do you know what diabetes is?". Such results indicate that even among the most educated individuals in the study population, general knowledge of diabetes is low. Although these findings are consistent with other studies [12,14,17,19,24,25], this rural region of Tamil Nadu demonstrated one of the lowest levels of knowledge in India reported to date, with 66% of participants being unaware of diabetes altogether. A study conducted in another area of rural Tamil Nadu assessed knowledge and self-care practices among patients with diabetes and showed that even among those with diabetes, knowledge of the disease was low. One result from this study was that 49% of participants with diagnosed diabetes thought diabetes was curable [16]. A similar qualitative study looking deeper into themes regarding diabetes knowledge among individuals with diabetes showed that 96% of participants answered, "I don't know" when asked "what happens inside your body when you get sick with diabetes?" [26]. The low awareness of diabetes shown in these studies constitutes a serious public health concern in South India.
Although the trend of low diabetes knowledge levels is consistent across most studies in rural India, awareness appears to vary across regions in South Asia. A study conducted with adults from both urban and rural regions of Punjab, Pakistan, found that 86% of respondents had heard of diabetes [27]. The higher levels of education and SES of this population could be an explanation for the heightened knowledge scores as compared to the present study. Another study conducted on a lower-middle class urban population in Chennai, the capital of the state of Tamil Nadu, showed that over 90% of the general population knew of diabetes [13]. Such findings suggest that diabetes awareness and knowledge may be better in urban regions, where information, messaging, and resources for diabetes may be more accessible. The association between higher wealth, education, and lower rurality and diabetes knowledge is consistent with other studies in India [12,13,17], and other low- and middle-income countries, including Jordan [28], Bangladesh [29], Southeast Ethiopia [30], Oman [31], and Pakistan [27].
Doctor-patient interactions in healthcare settings are important opportunities for patient education, as shown by the strong association between frequency of healthcare visits and diabetes knowledge. This finding corresponds with previous research in the study region showing that patients with diabetes perceived doctors as the most important source of knowledge on diabetes [26]. However, this study adds to the body of evidence suggesting that healthcare professionals in rural regions of South India do not adequately educate patients on diabetes; indeed, previous research in the study site found that both public and private healthcare practitioners often failed to provide sufficient education and support to patients with NCDs [26].
An interesting result was the association between source of income and knowledge score. We found that individuals working as a merchant or shop owner were more knowledgeable about diabetes, perhaps due to increased income and SES status, allowing for more exposure and access to accurate health information when compared to farmers or labourers. Individuals of higher SES may also be at increased risk of cardiometabolic diseases such as diabetes, perhaps increasing the likelihood that a healthcare professional would educate them on such topics, or that they would have gained more information from peers of similar status and risk of diabetes [32]. The nature of being a merchant and/or shop owner also creates opportunities for interaction with community members. Social networking could therefore offer opportunities for exchanging information on diabetes for these individuals, perhaps more so than for farmers or labourers. Similarly, rurality is associated with a lower risk of diabetes [9], meaning those in a more rural setting are less likely to be exposed or have peers and community members with diabetes, which could be a plausible explanation for increased rurality being associated with lower diabetes knowledge.
Obtaining government funds as a source of income (MGNREGA and similar government schemes) was strongly associated with increased knowledge of diabetes. To our knowledge, this association has not yet been established in any previous studies. The MGNREGA provides employment security for adults (over age 18) who apply to the program and reside in rural households throughout all districts of India [33]. The main goal of the MGNREGA program is to provide employment opportunities to applicants within a radius of five kilometers of their home for at least 100 days in a year. Most jobs involve manual unskilled labour and pay minimum wage; however, some higher-skill jobs (e.g., project supervisor) are available with higher compensation [34]. MGNREGA creates work projects for members of rural communities involving different forms of labour, sometimes including the construction of permanent assets in participating communities, such as wells, roads, and bridges [33,35]. While critiques of MGNREGA exist [34,36,37,38,39,40], involvement in this program throughout rural communities has the potential to foster local economic and employment activity, improve household income security, and improve quality of life [34,35,41].
While this relationship between diabetes knowledge and MGNREGA involvement has not been previously explored, this finding raises a number of plausible hypotheses that bear further exploration. Household income security through MGNREGA participation has been associated with increased household expenditure on education and healthcare [34,35,42,43,44], possibly increasing healthcare access, and therefore exposure to diabetes knowledge. However, MGNREGA wages are often delayed and unpredictable, and may be insufficient on their own to sustain households [35,36,38,40,41,43]. Additionally, involvement in government programs such as MGNREGA may improve participation in other government or social welfare programs, although research in this area is lacking. One study in particular found that women involved in MGNREGA had a high awareness of other existing government schemes, with some even expressing concern for over-dependency on government benefits and programs [35]. Engagement in several government or social welfare programs may have the potential to foster trust and improve uptake in other government services and sources of information.
In our case, it is possible that those participants involved in MGNREGA may be more likely to trust and seek information regarding diabetes. Lastly, MGNREGA can improve community cohesion and bonding among those involved in the program [35,41]. Involvement in MGNREGA work may provide a platform to discuss common issues and interests [35], possibly leading to community members discussing health issues of peers, such as diabetes. Overall, the relationship between MGNREGA and diabetes knowledge is unique and possible pathways of association must be explored further.
Many studies highlight understandings and perceptions of diabetes among Indian populations that occasionally conflict with biomedical models of diabetes. A common perception in India is that consuming excess sugar is a direct cause of diabetes [14,16,17,19,24]. Additionally, 'tension' or mental stress is often cited as a direct cause of diabetes [14,17], and herbal or religious remedies are often recognized as effective treatments for diabetes [12,14]. Such patterns are consistent with the present study, as the most common perceived risk factor of diabetes was consuming sweets (16.5% of the study population). Correspondingly, the local colloquial term for diabetes was translated as 'sugar disease' [26]. Mental stress was also reported as a risk factor by 5% of those who knew of diabetes, the same proportion who reported obesity as a risk factor. This exemplifies how cultural and local understandings of health and disease (such as 'tension') may influence perceptions of diabetes causation [14,26]. Evidence also indicates that for many individuals in South Asia, family and friends are a main source of information on diabetes [45,46]. This further perpetuates localized understandings of diabetes, grounded in experiences of individuals within social networks rather than information from health authorities.
The low number of individuals in this study who reported obesity as a risk factor for diabetes (1.7% of the total population) is particularly concerning, considering obesity is one of the strongest predictors of type 2 diabetes [4,8,9,10]. Such findings correspond with a similar study investigating diabetes knowledge in a rural northeast Indian population, which found that only 40% of those who were overweight knew they had an increased risk of diabetes [15]. The views and information held by this population and other rural Indian populations may be influenced by a variety of societal and systemic factors. Some studies suggest that overweight and obesity are perceived as 'healthy' in some sub-populations in rural India, especially among low-SES individuals, since overweight can be a sign of wealth and food security [45].
Further, the fragmented healthcare system that is currently in place in India, along with poor investments in public health initiatives and health education, limits access to reputable and relevant information regarding health and disease, especially for rural populations.
Many studies highlight the difficulty of receiving care for simple health issues, often citing the unavailability of doctors, long wait times, high costs, and lack of healthcare coordination [47,48]. More specifically, a previous study within the study site described barriers to accessing both public and private healthcare, for example, corruption and poor quality of care in public services and prohibitively high costs in private services [48]. Despite the greater expense, private healthcare was preferred over public healthcare for major health problems such as diabetes [48]. Regardless, both public and private healthcare facilities are likely inadequate in appropriately disseminating important information regarding diabetes. Along with poor infrastructure, accessing healthcare in rural India is affected by broader issues associated with poverty. Healthcare centres are often located in urban cities (thus requiring transportation), are focused on tertiary care, and are only affordable to the urban affluent, with rural poor individuals being faced with limited healthcare options [49]. Many individuals in rural areas face financial hardships and use their income to sustain daily living, often avoiding seeking healthcare unless for life-threatening conditions [47].
India is currently grappling with an epidemiological transition that is driving an increasing burden of NCDs such as diabetes [2]. As of yet, efforts towards diabetes prevention have been found to be unsatisfactory in India, especially in rural areas [49,50]. It is therefore crucial and timely to improve efforts and allocate resources to alleviate the burden of diabetes. The associations of sociodemographic factors with diabetes knowledge in this study highlight priority areas for targeting initial public health efforts in Tamil Nadu.
Specifically, efforts should emphasize the dissemination of accurate knowledge of diabetes signs, symptoms, prevention, and treatment to rural and isolated regions where high proportions of the population lack formal education and seldom interact with healthcare systems. Importantly, improved knowledge on diabetes has been associated with positive attitudes and better self-care practices towards diabetes treatment and prevention [16,28,30,31]. Thus, investing in stronger public health efforts to improve healthcare access, quality, and focus on non-communicable disease prevention and treatment presents a crucial tool for lessening the severity and impacts of the diabetes epidemic in India. However, it should be noted that structural factors grounded in economic and political realities (for example, food environments, access to sustainable livelihoods, and availability of recreational opportunities) are also crucial components of preventing and managing non-communicable diseases and must be incorporated into any regional or national strategy to prevent burdens of diabetes [51,52].
The use of systematic random sampling and the polytomous outcome used in modelling are strengths of this study. This study also examined rurality on a continuous scale instead of using a binary outcome to assess urban and rural residence, allowing for increased granularity in examining the relationship between rurality and diabetes knowledge. Using a culturally appropriate index to examine SES that takes into account many common assets for a rural Indian population, the wealth index allowed for a nuanced and accurate assessment of the relationship between SES and knowledge. Despite these strengths, this study had some limitations. Importantly, cross-sectional surveys are unable to establish causation. Since most sociodemographic data and knowledge on diabetes were self-reported, some data may be influenced by misreporting or social desirability bias. The weaknesses of the diabetes knowledge questionnaire have been well documented [19], and include open-ended questions being inhibited by memory and recall bias, and close-ended questions possibly encouraging respondents to provide guesses instead of informed answers.

Conclusions
Our study contributes to the large body of research regarding diabetes knowledge in India and South Asia. Specifically, this study sought to investigate the overall knowledge of diabetes in this rural population, as well as to examine the influence of different sociodemographic factors on knowledge of diabetes. Overall, this study highlighted the low levels of knowledge regarding diabetes in this rural population in Tamil Nadu. We identified positive associations between knowledge score and wealth, education, MGNREGA and business ownership as sources of income, and frequency of healthcare received. Increasing rurality was negatively associated with knowledge score. The association between rural residency and lower knowledge is an important starting point for targeting initial public health efforts for diabetes. Similarly, the association between wealth and education can be used to target those who fail to receive basic knowledge through higher education or privileges of higher social status.
Given the high prevalence of diabetes as well as prediabetes in this specific rural population [9], as well as in the Tamil Nadu state in general [7], this lack of general knowledge presents a major public health concern. Additionally, educational campaigns, diabetes screening, and availability of reputable sources of information, are all lacking in rural Indian areas. Moving forward, an immediate solution is needed to combat the future impacts of the limited diabetes knowledge that is apparent in this population. Identifying individuals that are at high risk for diabetes, such as those having a family history, higher BMI, or abdominal obesity, is an important starting point for targeted prevention efforts. Overall, tailored strategies for public health initiatives in rural areas should be coupled with broader methods to address structural barriers that impact diabetes prevention and treatment. Improving the infrastructure and accessibility of healthcare, availability of healthy food, and sustainability of steady livelihoods, is crucial to increase diabetes prevention and diminish future impacts of NCDs in India.

Upon arrival at the research site, and prior to the recruitment process, we approached local authorities (panchayat councils, local police officials, and hospital medical staff) and sought and obtained written permission to carry out the study. Informed verbal consent was obtained from all research participants prior to enrollment and throughout the study. Verbal consent was sought in lieu of written consent due to the low literacy rate of research participants, and this consent process was approved by the University of Guelph research ethics board.

Consent for publication
No individual data are presented in the manuscript.

Availability of Data and materials
The datasets generated and analysed during the current study are not publicly available due to research participant privacy/consent agreements. Any request for raw data will be reviewed by the corresponding author.

Competing interests
The authors declare no competing interests.

Authors' contributions
HM conducted data analysis and wrote substantive portions of the manuscript. AP provided supervisory support to HM, co-developed the data analysis plan, and provided feedback on manuscript drafts. WD contributed to study design and provided feedback on manuscript drafts. CD and SH provided feedback on manuscript drafts and contributed to study design, including survey tools and sampling methods, provided oversight of data collection and analysis, and provided substantive edits to the manuscript. KP provided oversight of data collection and maintained partnerships with local community organizations. ML conceptualized the project, developed the study design, managed the research team, conducted data collection, provided supervisory support to HM, and wrote and edited substantive portions of the manuscript.