Unveiling the Causal Mechanisms Within Multidimensional Poverty

Despite improvements in the design of development interventions from the perspective of the Sustainable Development Goals (SDGs), there is still a lack of evaluation methods able to estimate the impact of these interventions on multiple and interrelated outcomes. This paper proposes a methodological framework for complex causal inference in international development that combines machine learning and econometric designs for causal inference. As a case study, the relationship between multidimensional poverty and violence in Colombia is evaluated following this framework. First, Bayesian networks (BN) are used to create a directed acyclic graph (DAG) able to predict how multidimensional poverty components are interrelated and affected by a violence indicator. Second, the DAG output is used to identify instrumental variables (IV) in order to test the effect of multidimensional poverty on a household’s likelihood of being a victim of violence. Minimum living standards (measured in terms of access to water, connection to the sewage system, and the quality of walls and floors) are strong predictors of the education and health dimensions of poverty. Using 2SLS, the results show that having an illiterate person within a household increases the household’s likelihood of being a victim of violence by 0.4%. BNs have the potential to predict complex causal patterns, helping to understand the effect of development interventions on multidimensional outcomes such as poverty. Quasi-experimental econometric designs can then be used to test some of these predicted causal connections.


Introduction
The UN Sustainable Development Goals (SDGs) created a new framework to rethink economic and social development around the world. This framework helped policymakers to design more comprehensive development interventions, taking into consideration multiple dimensions such as human health, the environment, and global peace. Even though the SDGs have inspired a new generation of development practitioners, current impact evaluation methods are still limited in their ability to capture the causal effect of policies and programs on multiple outcomes.
Aligned with the SDGs is the Global Multidimensional Poverty Index (MPI) introduced in 2010 by the UN Human Development Report. This index provides an approximation to understanding poverty as a multidimensional phenomenon. However, evaluating the impact of development interventions on multidimensional poverty is not a straightforward endeavor. Variations in this outcome can be driven by one dimension, multiple dimensions, or the interaction between them.
Complexity science is an emergent field of interest within program evaluation that responds precisely to the need for new methods able to approximate non-linear causality, where an outcome is explained by the interaction between multiple variables. Several authors have embraced this approach (e.g., Bamberger et al., 2015; Patton, 2011; Pawson, 2013), but there is still no consensus on how to conduct impact evaluations that examine this type of complex causal relationship.
This paper responds to the above gaps in the literature by introducing a methodological framework to study complex causality in impact evaluations and, in particular, in the field of development economics. This framework integrates machine learning with econometrics to unveil the causal mechanisms within multidimensional poverty and to test the causal relationships between multidimensional poverty indicators.
Machine learning (ML) is still a relatively unknown toolkit in the impact evaluation world, currently dominated by the use of experimental and quasi-experimental designs for causal inference. Other fields, such as computer science, embrace ML as an effective prediction strategy. Even though prediction and causality are different things, the two are potentially complementary when it comes to identifying and testing complex causal patterns.
The methodological framework in this paper proposes a way to inform quasi-experimental econometric designs using the output of a Bayesian network (BN). BNs are probabilistic graphical models that use ML principles to predict causal connections between multiple variables. As a case study, the empirical strategy uses BNs to predict the causal relationships between MPI indicators and violence in Colombia, and then uses this output to identify instrumental variables that could help to test the causality between violence and multidimensional poverty.
The contribution of this paper is twofold. First, it introduces a methodological framework for complex causal inference in impact evaluation with applications in development economics. This framework is part of an emergent effort to bridge econometric designs for causal inference with machine learning techniques used for prediction (e.g., Athey, 2019; Mullainathan & Spiess, 2017). Second, it provides new insights into the causal mechanisms within multidimensional poverty in Colombia. These insights have the potential to inform policymakers about the key drivers of multidimensional poverty and how to target them. Furthermore, they help to identify the key dimensions of poverty that explain the relationship between violence and socioeconomic development in Colombia.
Another contribution is in improving our understanding of how to select MPI indicators. Evidence worldwide shows that these indicators are often correlated, raising questions on whether they are complements or substitutes (e.g., Alkire et al., 2015; Alkire & Fang, 2019; Dotter & Klasen, 2017). This paper sheds light in that direction, suggesting that these correlations might be driven by causal connections between poverty indicators, and that these connections vary depending on the context. For example, the results show that in Colombia, the relationship between MPI indicators differs between urban and rural households.
This paper is divided into six sections including this introduction. The Multidimensional Poverty section provides background on the concept of multidimensional poverty. The Data section describes the data including the definition of variables, sample restrictions, and descriptive statistics. The Empirical Strategy: Machine Learning Approach section contains the empirical strategy and results of the machine learning approach. This includes background on Bayesian networks (BN) and the analysis of MPI indicators. The Study Case: Econometric Design and Results section corresponds to the econometric approach, including the definition of the IV design based on the BN findings and a summary of the results. Finally, the Conclusions and Policy Recommendations section concludes with a discussion of policy recommendations.

Multidimensional Poverty
The concept of multidimensional poverty is influenced by Amartya Sen's theoretical framework. According to Sen, poverty should be understood as a deprivation of capabilities that are related to "[...] our ability to achieve various combinations of functionings that we can compare and judge against each other in terms of what we have reason to value" (Sen, 2009, p. 233). These functionings can be described as individual goals, and the capabilities as the actual possibilities to achieve those goals. This definition makes a distinction between suspected means for development, like money, and actual means such as education, health, and living standards. Therefore, education is understood as a poverty dimension itself and as a criterion for poverty measurement.
Poverty dimensions are considered human development areas, and they can be measured according to specific poverty indicators that are proxies to capture the degree of development in those areas. For example, indicators such as school achievement or school attendance could be good proxies to capture the degree of development in the education dimension of poverty.

Multidimensional Poverty Measurement
The most influential poverty measurement inspired by Sen's framework is perhaps the Global Multidimensional Poverty Index (MPI), based on the Alkire-Foster (AF) Counting Methodology, also known as M_α (Alkire & Foster, 2011). The Global MPI is calculated every year by the United Nations Development Programme (UNDP) taking three dimensions into consideration: education, health, and standard of living. A similar definition also based on the AF methodology is the Colombian MPI, which considers five dimensions: education, health, housing, work, and childhood and youth. Table 1 provides a comparison between both definitions including the different indicators within each poverty dimension, cut-off lines, and weights assigned to each indicator. This analysis uses an approximation to the Colombian MPI definition based on census data from 2018, which takes eleven of the fifteen indicators listed in Table 1.
The multidimensional approach follows the same logic as income-based poverty measures. In the same way in which simpler measures classify people as poor or non-poor based on one indicator (e.g., income) and one cut-off line (e.g., $1.90 per day), the multidimensional approach uses multiple indicators and multiple cut-offs to make the same classification. For example, based on the information in Table 1, the Global MPI states that a household is considered deprived of education if "no household member aged 10 years or older has completed 6 years of schooling" (UNDP, 2019). After assessing a household's deprivations based on all the defined indicators, a weight is assigned to all the cases in which the household is below the minimum standards.¹ Then, all the weights are added up in an MPI score that is compared against a cross-dimensional cut-off line. Both the Global and the Colombian MPI definitions use a conventional cross-dimensional cut-off line of 1/3, meaning that a household is considered multidimensionally deprived (or poor) if its total MPI score is greater than 1/3.
Although the MPI seems to be a less straightforward method for poverty measurement compared to conventional unidimensional approaches, it represents a more accurate approximation to picture human development. Empirical results show a mismatch between income-based and multidimensional poverty measures, indicating that multidimensionally poor people are not necessarily monetarily poor. This means that people living on more than $1.90 per day might not have access to adequate living standards, health, and education, undermining in that way the value of income as a proxy for development. The empirical evidence for this mismatch is available for countries such as Ethiopia, India, Peru, Vietnam (Kim, 2019; Roelen, 2017), Rwanda (Salecker et al., 2020), Germany (Suppa, 2016), and China (Alkire & Shen, 2017).
The policy implications of this mismatch are reflected in the type of government initiatives to reduce poverty based on different evaluation criteria. For example, in the case of Vietnam, where less than fifty percent of the monetarily poor people are also multidimensionally poor, GDP growth easily translates into monetary poverty reduction but not necessarily into MPI reduction (Tran et al., 2015). Having the MPI as an evaluation criterion incentivizes governments to implement programs and policies intended to promote the overall levels of education, health, and living standards. Recent papers have attempted to use the MPI in impact evaluations (e.g., Malaeb & Uzor, 2017; Mitchell & Macció, 2021; Seth & Tutor, 2018; Song & Imai, 2019; Vaz et al., 2019). However, there is still no consensus on how to evaluate the impact of development interventions on the multiple outcomes within MPI.
Understanding the causal mechanisms within multidimensional poverty can also help to explain the mismatch between monetary and multidimensional poverty. This new insight can also help to improve the design of cost-effective policy interventions to reduce multidimensional poverty.

Multidimensional Poverty and Violence
As a case study to put the empirical strategy into practice, this paper explores the relationship between multidimensional poverty and violence, which is an underexplored area in the literature. Alkire (2007), for example, talks about "physical safety" as one of the missing dimensions in multidimensional poverty. By this dimension, the author refers to violence as the main threat affecting a person's security. The exclusion of this dimension is in part due to the lack of comparable data across countries; however, Alkire indicates that the missing dimensions are often implicitly measured in their causal connections with the existing ones. For instance, the author notes that "the lowest ranking countries in terms of the HDI² are countries in or emerging from violent conflict" (p. 350). Violence, therefore, might have a direct effect on multidimensional poverty by decreasing living standards and access to social services such as health and education. This could be the case if people are afraid to leave their homes, travel to work or school, or obtain health care under threat of violence in their community. In such a scenario, it makes sense to study the extent to which changes in levels of violence affect multidimensional poverty components.
Mahadevan and Jayasinghe (2019) conduct the first empirical attempt to understand the relationship between violence and multidimensional poverty in the case of Sri Lanka. The authors study the transition from war to peace after 30 years of ethnic war in the country and find a reduction in MPI and its components during this transition. Nevertheless, this evidence is not conclusive because the authors do not control for the problem of endogeneity in the data.
In the case of monetary poverty measures, there is evidence about the relationship between poverty and violence. This relationship has often been described as a vicious cycle in which violence produces poverty and poverty produces violence (Justino, 2012). Evidence from different contexts around the world also suggests that monetary poverty increases the risk for armed conflict (Blattman & Miguel, 2010), and this evidence is also available for Colombia (Cotte-Poveda, 2011).
Even though there are no direct studies about the relationship between MPI and violence in Colombia, there are studies looking at the causal effect of violence on different MPI components during the last five decades of conflict in the country. In the case of living standard indicators, some of these studies have used instrumental variables to estimate negative outcomes in terms of human displacement and the loss of assets (Ibáñez & Moya, 2006, 2010; Ibáñez, 2008). In the case of education, some of these studies have used OLS with fixed effects and difference-in-difference regressions to estimate a lower degree of educational attainment in conflict zones (Wharton & Uwaifo Oyelere, 2011), a decrease in public expenditure on education (to compensate for higher policing expenses), physical obstacles to attending schools, and the death of family members forcing children to abandon school at an early age (Fergusson et al., 2020). This paper provides new evidence on the ways in which violence can affect multidimensional poverty in Colombia.

Data
This study uses data from the 2018 Colombian Census conducted by the National Department of Statistics (DANE). This is a cross-sectional dataset at the household level that includes most of the variables required to calculate the official Colombian MPI. Furthermore, a proxy variable for victims of violence is calculated. This proxy indicates those households with a deceased man between 15 and 50 years old who passed away during the last year. Interpersonal violence is the main cause of death among males in that age range in Colombia, accounting for approximately 55% of all these deaths (WHO, 2018). Even though not perfect, this proxy represents the best available guess to indicate which multidimensionally poor households are likely to have been victims of violence in the country.
Besides the violence proxy, this paper uses eleven binary (1/0) poverty indicators in the analysis. Five of them are living standards variables indicating whether a household is overcrowded or deprived in access to water, sewage system connection, and the quality of floors and walls. Overcrowded households are those where the number of people per bedroom is greater than two. Water and sewage deprivation indicates that the household does not have a connection to the local water supply and sewage system, respectively. Deprivation of floors indicates that the main material of the household's floors is dirt, and deprivation of walls indicates that the household has walls made from raw wood, vegetable material, zinc, cardboard, or no walls at all.
Three variables are classified within the education dimension of poverty: illiteracy (literacy_d), low school achievement (school_achievement_d), and low school attendance (school_attendance_d). Illiteracy indicates those households with at least one member who does not know how to read or write. Low school achievement indicates those households where there is at least one person older than 15 years with less than 9 years of schooling. Low school attendance indicates those households where there is at least one individual between 6 and 16 years old who is not attending school.
Two variables are classified within the work dimension: child labor and unemployment. Child labor (child_labor_d) indicates those households where at least one person is younger than 16 and worked at least 1 hour during the last week. This work includes paid and non-paid work at a business place. Unemployment (employment_d) indicates those households with at least one person older than 15 who searched for jobs during the last week.
Finally, one variable is classified within the health dimension. Access to health care (sick_d) indicates those households where at least one person was sick during the last 3 weeks and did not receive adequate medical treatment. For example, this includes the case of people treated by a local indigenous healer.
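The indicator definitions above translate directly into household-level dummies. The following is a minimal sketch of that construction for three of the indicators, assuming person-level records are available. The function name and the record keys (age, years_schooling, attends_school) are hypothetical and do not correspond to actual census field names, and the 6 to 16 age bounds are assumed to be inclusive.

```python
def household_flags(members, bedrooms):
    """Derive household-level deprivation dummies (1 = deprived) from member records.

    `members` is a list of dicts with hypothetical keys: age, years_schooling,
    attends_school. `bedrooms` is the number of bedrooms (assumed > 0).
    """
    return {
        # More than two people per bedroom.
        "overcrowding": int(len(members) / bedrooms > 2),
        # At least one person older than 15 with less than 9 years of schooling.
        "school_achievement_d": int(
            any(m["age"] > 15 and m["years_schooling"] < 9 for m in members)
        ),
        # At least one person aged 6 to 16 (bounds assumed inclusive) not in school.
        "school_attendance_d": int(
            any(6 <= m["age"] <= 16 and not m["attends_school"] for m in members)
        ),
    }


# Illustrative household (invented values): three members sharing one bedroom.
example_members = [
    {"age": 40, "years_schooling": 5, "attends_school": False},
    {"age": 12, "years_schooling": 6, "attends_school": True},
    {"age": 8, "years_schooling": 2, "attends_school": True},
]
example_flags = household_flags(example_members, bedrooms=1)
```

In this invented example the household is flagged as overcrowded and deprived in school achievement, but not in school attendance, because both children attend school.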
The total sample size is approximately 8 million observations at the household level excluding missing values. Some sample restrictions are applied to study the relationship between MPI indicators. First, only poor households, defined as those with a multidimensional poverty index greater than 1/3, are taken into consideration. This restricts the analysis to variations in multidimensional poverty indicators among already poor households. Otherwise, the results would be comparing poverty indicators between poor and non-poor households. Second, the data are divided into two samples, one for rural areas and another for urban areas. The main rationale for this separation is to acknowledge that the nature of urban poverty is different from rural poverty.
Table 2 shows descriptive statistics on the key variables included in the analysis. Approximately 0.2% of the households have child labor and 12% have some sort of unemployment. Within the living standards dimension, approximately 9% of households do not have access to water, 18% have no connection to the sewage system, 5% do not have adequate floors, 4% do not have adequate walls, and 6% are overcrowded. Within the education dimension, 10% of the households are deprived in literacy, 58% are deprived in school achievement, and 1% are deprived in school attendance. Only 3% of the households are deprived in access to health care and 2% seem to be victims of violence according to the violence proxy. Finally, 13% of the households are in rural areas and the combination of all the above indicators suggests that 5% of them are multidimensionally poor.

Empirical Strategy: Machine Learning Approach
The guiding research question in this paper is: to what extent do multidimensional poverty indicators interact with each other? This is an exploratory question intended to approximate the complex causal pathways within MPI. Once those connections are identified, a secondary research question is: which MPI indicators mediate the effect of being multidimensionally poor on the likelihood of being a victim of violence? The main hypothesis is that, conditional on all the MPI indicators, access to education is the main mechanism explaining the relationship between violence and multidimensional poverty. Furthermore, this paper explores MPI differences between urban and rural areas when answering these questions.
The empirical strategy to test this hypothesis integrates a machine learning approach to predict causal links with an econometrics approach to test the causality between those links. In the first step, Bayesian networks (BN) are used to predict the causal connections between MPI indicators. In the second step, the BN output is used to identify instrumental variables (IV) that can help me to test the causal effect of MPI indicators on violence.

Background on Bayesian Networks
BNs are probabilistic graphical models that describe causal patterns between variables in complex systems via a directed acyclic graph, or DAG (Rebane & Pearl, 1987). This analytical tool developed by Pearl (1982) played a central role in the emergence of machine learning and is a useful way to predict complex causal behavior (Pearl & Mackenzie, 2018). Two main assumptions characterize BNs. The first is a probabilistic theory of causality, where causal effects are described as the probability of an event occurring and not in terms of counterfactuals (Hitchcock, 2018). The second is the Markov condition, according to which each variable (or node) in the network is conditionally independent of its non-descendants given its parents (Cartwright, 2007). In some sense, BNs are analogous to structural equation models, or SEM (Druzdzel & Simon, 1993). The main difference is that while SEMs allow estimating parameters with the size of a causal effect, BNs are non-parametric structural models that predict the direction of a causal relationship. The main advantage of BNs is that there is almost no limit to the level of complexity in the analysis. Using machine learning, it is possible to predict complex causal connections without reducing the number of meaningful variables in the analysis. Then, these meaningful connections can be tested using SEM or other econometric approaches to causal inference.
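As a point of reference, the Markov condition implies that the joint distribution of the nodes in a DAG factorizes into one conditional distribution per node. A generic statement of this factorization is sketched below, where X_1, ..., X_n denote the nodes and Pa(X_i) the parents of node X_i in the graph. This notation is an assumption introduced here for illustration and does not appear in the original text.

```latex
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)
```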
BNs are common analytical tools in computer science but virtually unexplored in economics. In an attempt to understand this gap, the recent Nobel Prize winner in economics, Guido Imbens, compares the effectiveness of DAGs against conventional econometric approaches to causal inference (Imbens, 2020). First, the author points out that the conditional probabilities used in DAGs are analogous to OLS regressions with multiple controls. Second, he indicates that a clear advantage of DAGs is that they allow approaching complex causality and the causality of non-manipulable variables (e.g., gender or parents' education). Third, he identifies that whereas structural equation models in econometrics are based on theory, in DAGs they are based on machine learning. Fourth, he states that another important difference is that DAGs focus on identification rather than on inference. Fifth, another difference is that DAGs are not cyclical while economics tends to be cyclical (e.g., when supply goes up, prices go down). Finally, the author concludes that the lack of value of DAGs in economics comes from the lack of clear examples on how to use them.
The study of multidimensional poverty represents a good example of how to use DAGs in economics and of their complementarity with econometrics. Following Imbens, this complementarity lies precisely in understanding that DAGs are good for identification while econometrics is good for inference. DAGs are not necessarily better than, but good complements to, quasi-experimental designs for causal inference such as regression discontinuity, synthetic controls, and instrumental variables (IV). An IV approach is used in this paper considering that BNs are helpful tools for the identification of IVs to address the problem of endogeneity in the data.

Calculation of Bayesian Networks
BNs are calculated following a four-step procedure. First, I predict the causal structure within a set of meaningful variables using a hill-climbing learning algorithm. This set of variables is chosen according to a theoretical framework that corresponds to the MPI definition used in this study and a proxy to indicate which households are victims of violence. Second, I calculate conditional probabilities to describe the relationship between variables within the BN. Third, I choose a probability threshold to define which of the identified connections should be considered as meaningful links (or edges) in the network; this probability threshold is also known as the arc strength. And fourth, I plot the BN (or DAG) based on a joint probability distribution. I use the "bnlearn" library in R to perform these four steps. The example below helps to illustrate this four-step procedure.
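The first step of this procedure can be illustrated with a single move of a score-based search. The sketch below is written in plain Python rather than R and is not the bnlearn implementation. It assumes a BIC score and evaluates whether adding one arc to an empty graph improves that score. The household counts are invented, and all function names are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(rows, child, parents):
    """Maximised log-likelihood of a binary `child` node given its `parents`."""
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in rows)
    parent_totals = Counter(tuple(r[p] for p in parents) for r in rows)
    return sum(n * math.log(n / parent_totals[cfg]) for (cfg, _), n in joint.items())


def bic(rows, child, parents):
    """BIC node score: log-likelihood minus a penalty on the number of parameters."""
    n_params = 2 ** len(parents)  # one free probability per parent configuration
    return log_likelihood(rows, child, parents) - 0.5 * n_params * math.log(len(rows))


def gain_from_arc(rows, parent, child):
    """Change in score when the arc parent -> child is added to an empty graph."""
    return bic(rows, child, [parent]) - bic(rows, child, [])


def make_rows(n11, n10, n01, n00):
    """Build rows from counts of (walls, school_achievement_d) combinations."""
    cells = {(1, 1): n11, (1, 0): n10, (0, 1): n01, (0, 0): n00}
    return [
        {"walls": w, "school_achievement_d": s}
        for (w, s), n in cells.items()
        for _ in range(n)
    ]


associated = make_rows(40, 10, 10, 40)  # invented counts, strong association
unrelated = make_rows(25, 25, 25, 25)   # invented counts, no association

gain_associated = gain_from_arc(associated, "walls", "school_achievement_d")
gain_unrelated = gain_from_arc(unrelated, "walls", "school_achievement_d")
```

With the associated counts the arc raises the score, so a greedy search would keep it. With the unrelated counts the penalty outweighs the fit and the arc is rejected. A hill-climbing search repeats this kind of evaluation over candidate additions, deletions, and reversals until no single move improves the score.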

Example on How to Calculate Bayesian Networks
Bayes' theorem is the main reference point to calculate conditional probabilities in the BN. Suppose there are two dummy variables (or nodes) in the network: deprivation in access to healthcare (sick_d) and deprivation in school achievement (school_achievement_d). The probability of one conditional on the other is then:
$$P(\text{sick\_d} \mid \text{school\_achievement\_d}) = \frac{P(\text{school\_achievement\_d} \mid \text{sick\_d}) \, P(\text{sick\_d})}{P(\text{school\_achievement\_d})}$$
In the context of multiple variables and big data, it is virtually impossible to manually calculate all the possible conditional probabilities in a dataset to predict which patterns repeat the most, suggesting a causal trend. A causal trend would be suggested if, for example, we find that most of the households that are deprived in school achievement are also deprived of access to healthcare but not the other way around, which suggests a causal effect of school achievement on healthcare. However, this assessment becomes more difficult if we add other variables, such as deprivation in a living standard indicator like the quality of walls (walls).
Machine learning becomes handy when using multiple variables in the analysis. Using a hill-climbing learning algorithm, I can predict the causal direction between multiple MPI indicators. This algorithm predicts which patterns repeat the most and then the BN is calculated based on the following joint probability distribution:
$$P(\text{walls}, \text{school\_achievement\_d}, \text{sick\_d}) = P(\text{walls}) \, P(\text{school\_achievement\_d} \mid \text{walls}) \, P(\text{sick\_d} \mid \text{walls})$$
According to this joint probability distribution, the BN DAG has two arcs: walls → school_achievement_d and walls → sick_d. This DAG represents the predicted causal structure in which deprivation in walls (walls) directly affects deprivation in school achievement (school_achievement_d) and deprivation in access to healthcare (sick_d). Plotting this graph also requires deciding on the arc strength, which is a minimum probability threshold indicating which "predicted causal" connections should be considered as meaningful. This probability threshold goes from 0 to 1 and it is analogous to a significance threshold. There is no consensus on what represents a reliable minimum threshold when calculating BNs, but I use an arc strength of 0.99 in this study to decide which connections between MPI indicators are meaningful.
Now, to illustrate how BNs can help to uncover complex causality, let us suppose that violence is another meaningful variable in the network, and that the resulting network suggests that multidimensional poverty has a direct effect on violence via school_achievement_d. If this is the only connecting point between the system of MPI indicators and violence, then the variable walls would meet the validity assumptions to be treated as an IV. These assumptions indicate that walls has no direct effect on violence, and that it affects violence only through its effect on school_achievement_d.
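The conditional-probability reasoning in this example can be checked numerically. The sketch below uses invented household counts for school_achievement_d and sick_d (not census figures) and a hypothetical helper named prob. It computes both conditional probabilities and confirms that Bayes' theorem links them.

```python
from fractions import Fraction

# Invented counts of households by (school_achievement_d, sick_d).
counts = {
    (1, 1): 40,
    (1, 0): 10,
    (0, 1): 40,
    (0, 0): 10,
}
total = sum(counts.values())


def prob(school=None, sick=None):
    """Probability that the given indicators take the given values (None = any)."""
    matching = sum(
        n
        for (s, k), n in counts.items()
        if (school is None or s == school) and (sick is None or k == sick)
    )
    return Fraction(matching, total)


# Conditional probabilities computed directly from the counts.
p_sick_given_school = prob(school=1, sick=1) / prob(school=1)
p_school_given_sick = prob(school=1, sick=1) / prob(sick=1)

# The same quantity obtained through Bayes' theorem.
p_sick_given_school_bayes = p_school_given_sick * prob(sick=1) / prob(school=1)
```

In these invented counts, 4 out of 5 households deprived in school achievement are also deprived in access to healthcare, while only 1 out of 2 households deprived in healthcare are deprived in school achievement. This is the kind of asymmetry described in the text; the sketch only reproduces the arithmetic and does not by itself establish a causal direction.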

Multidimensional Poverty Index
To keep consistency in the analysis of multidimensional poverty, I only include multidimensionally poor households in the calculation of Bayesian networks. This helps me to study the relationship between MPI indicators only within already poor households. Otherwise, I would be including MPI indicators of non-poor households, which might have a different relationship between variables (e.g., a non-poor household can still have floors made from raw wood).
I calculate a Multidimensional Poverty Index (MPI) to identify poor households using the Alkire-Foster (AF) methodology (Alkire & Foster, 2011) and according to the official Colombian multidimensional poverty definition.³ The multidimensional poverty score assigns equal weights to each of the four dimensions of poverty: education, work, health, and living standards.
Poor households are classified as those with an MPI score greater than 1/3. In the second part of the empirical strategy, I calculate the IV regressions using the sample without restrictions (including poor and non-poor observations), using separate samples for rural and urban households, and using separate samples for poor households located in rural and urban areas.
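The scoring rule can be sketched as follows. The equal weights across the four dimensions and the 1/3 cut-off come from the text, whereas the equal split of each dimension's weight among its indicators is an assumption made here for illustration. The function names and the short labels for the living standards indicators are hypothetical.

```python
from fractions import Fraction

# Indicators grouped by dimension, following the variable definitions in the Data section.
DIMENSIONS = {
    "education": ["literacy_d", "school_achievement_d", "school_attendance_d"],
    "work": ["child_labor_d", "employment_d"],
    "health": ["sick_d"],
    "living_standards": ["overcrowding", "water", "sewage", "floors", "walls"],
}


def indicator_weights(dimensions):
    """Equal weight per dimension, split equally among its indicators (assumption)."""
    weights = {}
    for indicators in dimensions.values():
        for name in indicators:
            weights[name] = Fraction(1, len(dimensions) * len(indicators))
    return weights


def mpi_score(household, weights):
    """Sum of the weights of the indicators in which the household is deprived."""
    return sum(w for name, w in weights.items() if household.get(name, 0) == 1)


def is_poor(household, weights, cutoff=Fraction(1, 3)):
    """A household is poor if its score is strictly greater than the cut-off."""
    return mpi_score(household, weights) > cutoff


WEIGHTS = indicator_weights(DIMENSIONS)

# Invented households: keys not listed are treated as not deprived.
household_a = {"literacy_d": 1, "school_achievement_d": 1, "sick_d": 1}
household_b = {"water": 1, "sewage": 1, "employment_d": 1}
score_a = mpi_score(household_a, WEIGHTS)
score_b = mpi_score(household_b, WEIGHTS)
```

Under these assumed weights, household_a scores 5/12 and is classified as poor, while household_b scores 9/40 and is not.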

Results
Figure 1 provides the first attempt to model the causal structure within multidimensional poverty in urban areas of Colombia using a Directed Acyclic Graph (DAG).What is novel about this DAG, is that it is calculated using machine learning instead of a theoretical approach.Using an arc strength of 0.99 (analogous to a significance level of 0.01), the Bayesian network results suggest that living standard indicators-such as access to water, sewage system, quality of the walls and floors-are the main drivers of multidimensional poverty predicting other MPI indicators in the education and health dimensions.Deprivation in living standards indicators seems to predict a household's deprivation in employment, school achievement, literacy, and access to healthcare (sick_d).
An example to understand the probabilistic approach to causality in Figure 1, is that households deprived in access to water, sewage system connection, quality of floors and walls, overcrowding, and literacy, also seems to be deprived in access to healthcare but not the other way around.This means that households without access to healthcare are not necessarily likely to be deprived in living standard indicators.That is the reason why the DAG predicts a causal effect from living standards to health indicators and not in the opposite direction.
Figure 1 also shows that the presence of child labor within a household predicts school absenteeism (school_attendance_d) and that this relationship is independent of minimum living standards.The multiple arrows coming out of each node (or MPI indicator) in Figure 1 show a high degree of endogeneity among MPI indicators, and therefore, it is not possible to select a good IV candidate out of this graph.
Figure 2 shows the same results but for the specific case of rural households in Colombia.The results are very similar, but it is less clear here that living standards are the main drivers of multidimensional poverty.Deprivation in access to water, the quality of floors, and walls, seems to predict other MPI indicators such as school achievement, school attendance, and literacy.However, the presence of child labor also seems to predict school attendance and the quality of floors.There seems to be more endogeneity between MPI indicators in the case of rural households.
An interesting key difference between urban and rural MPI, is that access to healthcare (sick_d) seems to only predict unemployment in rural areas of Colombia.Households without access to healthcare are classified as those where at least one person was sick during the last 3 weeks and did not receive adequate medical treatment.Considering that employment among poor households in rural areas involves physical labor, it makes sense to think that adequate physical health is a requirement to secure a job.Like in the case of urban areas, there are no good candidates for IVs in Figure 2.
Figure 3 displays the same information in Figure 1 but including the violence proxy.The first interesting finding is that including this variable does not alter the relationship between MPI indicators described in Figure 1.The second interesting finding is that illiteracy (literacy_d) predicts a household's likelihood to be a victim of violence (violence_proxy).This is the only connection between MPI indicators and the violence proxy, suggesting that the variables only affecting literacy_d-such as overcrowding, walls, floors, unemployment, and school achievement-are good IV candidates to test the causal effect of literacy_d on violence_proxy.
Identifying instrumental variables is a challenging task, and Bayesian networks are not perfect shortcuts to overcome this challenge. A key identifying assumption is that the instrument should affect the outcome of interest only through the endogenous variable; that is, it should be uncorrelated with the error term of the outcome equation. Despite the lack of theoretical background to understand why the different instruments suggested in Figure 3 might be unrelated to violence_proxy, this machine learning approach uses conditional probabilities to support the identification of IVs. An important caveat is that these conditional probabilities are only as good as the data included in the BN. Other relevant variables that are excluded from this analysis are not considered in the assessment of this IV strategy. In the case of rural areas, Figure 4 shows that MPI indicators seem to be unrelated to violence_proxy. This means that none of the indicators are good predictors of violence at the 0.01 significance level.

Study Case: Econometric Design and Results
Using the results in Figure 3, the econometric strategy in this paper is defined by an IV approach where the first stage of the 2SLS regression corresponds to the following equation:

$$Literacy\_d_i = \alpha_1 + Z_i' \pi + X_i' \gamma_1 + \phi_i + \mu_{it}$$

The outcome is a dummy indicating which households have at least one family member who does not know how to read or write ($Literacy\_d_i$); $Z_i'$ is the vector of instrumental variables identified in Figure 3 (overcrowding, school_achievement_d, employment_d, water, sewage, and floors); $X_i'$ is a vector of control variables including other relevant MPI indicators, whether the household is located in a rural area, and the household's social stratification scale (1 = lowest income, 6 = highest income); $\phi_i$ denotes the municipality fixed effects; $\mu_{it}$ is the error term; and $\alpha_1$, $\pi$, and $\gamma_1$ are the coefficients to be estimated.
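Using the results from the first stage, the second stage is estimated based on the following equation:

$$Violence\_Proxy_i = \alpha_2 + \beta \, \widehat{Literacy\_d}_i + X_i' \gamma_2 + \phi_i + \varepsilon_i$$

where $Violence\_Proxy_i$ is a dummy indicating which households are likely to be victims of violence, $\widehat{Literacy\_d}_i$ is the fitted value of the literacy dummy from the first stage, and $\varepsilon_i$ is the error term. According to the BN in Figure 3, the hypothesis to be tested in this IV design is that having an illiterate person within a household increases the household's likelihood to be a victim of violence.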
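The two-stage procedure can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the data are simulated, the variables are continuous rather than dummies, and municipality fixed effects and clustered standard errors are omitted. It is not the estimation code behind Table 3; it only shows why fitted values from the first stage remove the bias that an unobserved confounder introduces into ordinary least squares.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Simulated data; all coefficients below are hypothetical.
z = rng.normal(size=(n, 2))   # two instruments
x = rng.normal(size=n)        # one exogenous control
u = rng.normal(size=n)        # unobserved confounder
d = z @ np.array([0.8, 0.5]) + 0.3 * x + u + rng.normal(size=n)  # endogenous regressor
true_beta = 0.5
y = true_beta * d + 0.2 * x - u + rng.normal(size=n)             # outcome


def ols(design, target):
    """Least-squares coefficients of `target` on the columns of `design`."""
    return np.linalg.lstsq(design, target, rcond=None)[0]


const = np.ones(n)

# First stage: regress the endogenous regressor on instruments and controls.
first_stage_design = np.column_stack([const, z, x])
d_hat = first_stage_design @ ols(first_stage_design, d)

# Second stage: replace the endogenous regressor with its fitted values.
beta_2sls = ols(np.column_stack([const, d_hat, x]), y)[1]

# Naive OLS for comparison; biased because d is correlated with u.
beta_ols = ols(np.column_stack([const, d, x]), y)[1]

print(f"true effect: {true_beta:.3f}")
print(f"2SLS estimate: {beta_2sls:.3f}")
print(f"OLS estimate: {beta_ols:.3f}")
```

Running the second stage by hand gives the correct point estimate but not correct standard errors, because the residuals must be computed with the observed regressor rather than its fitted values. A dedicated IV routine should be used for inference.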
Table 3 shows the results of the IV approach to test the causal effect of literacy on the violence proxy. Columns 2 and 3 show that, once controlling for municipality fixed effects and the rest of the MPI indicators, having an illiterate person within a household increases its likelihood to be a victim of violence by approximately 0.2 percentage points. This coefficient is statistically significant at the 0.001 level and is calculated using standard errors clustered at the municipality level. The first stage in column 1 also shows statistically significant coefficients, with an F-statistic of 611,400.7. Robustness checks are conducted in Tables 4 and 5. Both tables follow the same empirical strategy but use the restricted samples used to calculate the BN in Figure 3. Table 4 uses the sample of urban households and Table 5 the sample of poor urban households. Both tables replicate the main results, with a slightly higher coefficient of 0.4% in Table 5 for the effect of illiteracy on the likelihood to be a victim of violence. Even though the effect is small, the results in Tables 3, 4, and 5 are good examples of how to use Bayesian networks to support the identification of instrumental variables in a complex system.
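A first-stage F-statistic such as the one reported above tests whether the instruments jointly predict the endogenous regressor. The sketch below computes the classical (homoskedastic) version on simulated data. The function name `first_stage_f` and all data are assumptions introduced here, and the statistic in Table 3 may have been computed differently, for example with clustered errors. A common rule of thumb treats values above 10 as evidence against weak instruments.

```python
import numpy as np


def first_stage_f(d, Z, X):
    """Classical F-statistic for excluding instruments Z from a regression of d on [1, Z, X]."""
    n = len(d)
    const = np.ones((n, 1))
    restricted = np.column_stack([const, X])       # controls only
    unrestricted = np.column_stack([const, Z, X])  # controls plus instruments

    def rss(design):
        resid = d - design @ np.linalg.lstsq(design, d, rcond=None)[0]
        return resid @ resid

    q = Z.shape[1]                    # number of excluded instruments
    dof = n - unrestricted.shape[1]   # residual degrees of freedom
    rss_r, rss_u = rss(restricted), rss(unrestricted)
    return ((rss_r - rss_u) / q) / (rss_u / dof)


rng = np.random.default_rng(1)
n = 5_000
X = rng.normal(size=(n, 1))
Z_strong = rng.normal(size=(n, 2))  # drives d by construction
Z_weak = rng.normal(size=(n, 2))    # unrelated to d by construction
d = Z_strong @ np.array([0.8, 0.5]) + 0.3 * X[:, 0] + rng.normal(size=n)

f_strong = first_stage_f(d, Z_strong, X)
f_weak = first_stage_f(d, Z_weak, X)
print(f"F with relevant instruments: {f_strong:.1f}")
print(f"F with irrelevant instruments: {f_weak:.1f}")
```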

Conclusions and Policy Recommendations
Conventional experimental and quasi-experimental methods in economics are limited in their possibility to study complex causal inference. Combining machine learning and econometrics opens a new door for complex causal inference. The empirical strategy in this paper shows a way to do this by using Bayesian networks (BN) for the identification of instrumental variables (IV). The results suggest that having an illiterate person within a household increases the household's likelihood to be a victim of violence by approximately 0.4%.
Most empirical studies about the relationship between violence and education focus on the negative effect of violence on education outcomes (e.g., Diwakar, 2015; Islam et al., 2016; Grueso, 2022). This is perhaps due to the lack of data and methods to study the causal effect of education on violence. Violent events occur at a given moment and can have short- and long-term effects on education. Conversely, education takes time to build up, and it is not easy to study its impact on violence using conventional experimental or quasi-experimental methods.
ML opens a new door of possibilities to study the relationship between violence and poverty and, in particular, the relationship between violence and education, understood as one of the main poverty dimensions. Another opportunity in the use of ML is that it helps us to think about other types of theories and hypotheses that might be outside the scope of current mental frameworks. For example, the absence of studies looking at the causal effect of education on violence can be explained by the lack of theoretical models explaining that effect.
Using an ML approach, this paper explores the causal effect of illiteracy on the likelihood to be a victim of violence in Colombia. This is an example of how ML can indicate new research paths, looking into directions that are not mainstream in the current literature. In the same way in which ethnographers identify patterns in qualitative data to then suggest a theory of how the world works, ML identifies patterns in quantitative data that can suggest new theories about the world. These theories can then be tested following experimental or quasi-experimental designs for causal inference.
An interesting policy implication of this paper's results is that the effect of education on violence only seems to apply in the case of urban areas of Colombia. There seems to be a high level of endogeneity between MPI indicators in rural areas, suggesting that, once controlling for the effect of all other MPI indicators, violence and poverty seem to be unrelated phenomena. The same does not occur in urban areas, where illiteracy seems to be the main MPI indicator explaining the relationship between violence and poverty.
Even though illiteracy is the only good candidate for an IV approach in this study, it is also relevant to mention the potential policy implications of the remaining causal connections in the BN that are not tested using econometrics. In the case of urban areas, the presence of child labor predicts low school attendance, which in turn predicts illiteracy. This could be a causal path explaining how keeping children out of school could increase their risk of being victims of violence. Living standard indicators such as the quality of walls and floors, access to water, and the connection to the sewage system also seem to predict illiteracy within a household. These different living standard indicators could reflect the severity of poverty within a household, which might be associated with the lack of education. Further research is needed to explore at a deeper level the interrelationship between other MPI indicators affecting illiteracy.
In the case of rural areas, there is a higher degree of endogeneity, reflected by a higher number of connections between the different MPI indicators. It is difficult to visually dissect possible policy implications based on the predicted causal patterns. Child labor and lack of access to water seem to be the root variables explaining the behavior of the rest of the MPI indicators. However, all MPI indicators seem to be affected by many other indicators, making it difficult to disentangle individual causal pathways.
This paper provides a first step toward understanding how the lack of literacy and education might be trigger factors for violence in Colombia. The econometric results in this study are modest, and therefore more research in this area is needed. However, policymakers in this and other similar contexts might find it interesting to explore ways in which literacy can help to decrease violence as a cost-effective way to promote long-term peace. The exact mechanisms by which literacy helps to decrease the likelihood of an individual being a victim of violence are yet to be explored. For example, literacy can increase access to information via newspapers or street posters, helping in that way to predict and avoid a violent attack. Literacy could also improve individual labor market skills, decreasing in that way the likelihood of getting involved with organized crime. Understanding the ways in which education affects violence is an interesting new area of study, and ML has the potential to open new research avenues in that regard.
Finally, it is worth mentioning that the methodological framework discussed in this paper can be used to evaluate the impact of development interventions on multiple and interrelated outcomes, such as multidimensional poverty. Policymakers might find it interesting to think about the possibility of expanding evidence-based policy in the case of interventions that cannot be easily evaluated following conventional econometric approaches to causal inference.
3. I use a close approximation to the official Colombian MPI definition that can be calculated using census data. This close approximation uses 11 out of the 15 indicators used to calculate the official national MPI. I use census data instead of the annual household survey on living standards (used to calculate the Colombian MPI) because the national census has a larger number of observations, which gives me more statistical power to calculate the Bayesian network. Furthermore, the national census includes information on mortality at the household level, which allows me to calculate the proxy to identify which households are likely to be victims of violence.

Figure 1. Bayesian network of Colombian MPI: Urban areas. Note: BN calculated using a machine learning approach, with an arc strength of 0.99, and based on census data from 2018. Results only take into consideration poor students (MPI Score > 1/3).

Figure 2. Bayesian network of Colombian MPI: Rural areas. Note: BN calculated using a machine learning approach, with an arc strength of 0.99, and based on census data from 2018. Results only take into consideration poor students (MPI Score > 1/3).

Figure 3. Bayesian network of Colombian MPI + Violence: Urban areas. Note: BN calculated using a machine learning approach, with an arc strength of 0.99, and based on census data from 2018. Results only take into consideration poor students (MPI Score > 1/3).

Figure 4. Bayesian network of Colombian MPI + Violence: Rural areas. Note: BN calculated using a machine learning approach, with an arc strength of 0.99, and based on census data from 2018. Results only take into consideration poor students (MPI Score > 1/3).

Table 1. Official Colombian and UNDP MPI definitions.

Table 2. Descriptive statistics, MPI indicators in Conflict Zones of Colombia (2018). Note: Data are taken from the Colombian National Administrative Department of Statistics (DANE).
, which is equal to the product of being deprived in the i poverty indicator (MPI_Indicator_d_i) given the rest of poverty indicators affecting (i · MPI_Indicator_d_j(t)). In the case of the previous example, the conditional probability of being multidimensionally deprived could be expressed as

Table 3. Effect of illiteracy on the likelihood to be a victim of violence.

Table 4. Effect of illiteracy on the likelihood to be a victim of violence: Urban area.

Table 5. Effect of illiteracy on the likelihood to be a victim of violence: poor households in Urban areas.