Data source
The panel data on the new HIV infection rate and the control variables for each country were obtained from the open-source data website "Our World in Data" (https://ourworldindata.org/). For 172 countries from 1996 to 2019, we extracted data on the number of new HIV infections, GDP per capita, population, average total years of schooling for the adult population, Human Development Index (HDI), share of deaths due to unsafe sanitation, life expectancy at birth, urban population, human rights protection, and prevalence of drug use disorders. The new infection rate was calculated from the number of new infections and the total population in each year, and an unbalanced panel was constructed. To ensure that the data could be analyzed, we used only countries observed for more than ten years.
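The rate calculation and the ten-year filter described above can be sketched as follows. This is a minimal illustration rather than the study's actual code: the record layout, the function name `build_panel`, the toy figures, and the per-1,000 scaling (chosen to match the ‰ unit used in the tables) are all assumptions.

```python
from collections import defaultdict

def build_panel(records, min_years=10):
    """Compute new infections per 1,000 people and keep countries
    observed in more than `min_years` distinct years.

    `records` is an iterable of (country, year, new_infections, population)
    tuples. Rows with a missing count or population are skipped, which is
    what makes the resulting panel unbalanced.
    """
    by_country = defaultdict(dict)
    for country, year, new_infections, population in records:
        if new_infections is None or not population:
            continue
        by_country[country][year] = 1000.0 * new_infections / population
    return {c: rates for c, rates in by_country.items() if len(rates) > min_years}

# Toy input (hypothetical numbers, for illustration only).
records = [("A", 1996 + k, 500, 1_000_000) for k in range(24)]   # 24 years
records += [("B", 2010 + k, 200, 2_000_000) for k in range(8)]   # only 8 years
records.append(("A", 2020, None, 1_000_000))                      # missing count

panel = build_panel(records)
print(sorted(panel))      # country B is dropped by the filter
print(panel["A"][1996])   # new infections per 1,000 people in 1996
```

The filter uses a strict inequality because the text specifies "more than ten years".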
Statistical analysis
This study treats the implementation of same-sex marriage legalization policies in various countries as a quasi-natural experiment and uses the difference-in-differences (DID) method to explore the impact of these policies on the new HIV infection rate.
The DID method is often used in quasi-experiments to evaluate the treatment effect of an intervention such as the enforcement of a policy. In this paper, we divided the countries into two groups: countries that enforced the policy formed the treatment group (\({G}_{i}=1\)), while the others formed the control group (\({G}_{i}=0\)). Let t denote the treatment stage, with t = 1 before treatment and t = 2 after it, and let D = 1 if the unit was treated by the policy and D = 0 otherwise. Then \({Y}_{it}\left(D\right)\) denotes the outcome of country i at stage t given its treatment status D at that time. Evaluating the average treatment effect on the treated (ATT) amounts to estimating \(\text{E}\left\{{Y}_{i2}\left(1\right)-{Y}_{i2}\left(0\right)|{G}_{i}=1\right\}\). Under the parallel trend assumption, and after the following transformation, we get:
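The transformation itself is not reproduced in this text. A standard form of it, written in the notation above and offered as a sketch rather than as the authors' exact derivation, adds and subtracts \(\text{E}\left\{{Y}_{i1}\left(0\right)|{G}_{i}=1\right\}\) and then replaces the unobserved trend of the treatment group with the observed trend of the control group, so that every term in the last line can be estimated from the data:

$$\begin{aligned}\text{ATT}&=\text{E}\left\{{Y}_{i2}\left(1\right)-{Y}_{i2}\left(0\right)|{G}_{i}=1\right\}\\ &=\text{E}\left\{{Y}_{i2}\left(1\right)-{Y}_{i1}\left(0\right)|{G}_{i}=1\right\}-\text{E}\left\{{Y}_{i2}\left(0\right)-{Y}_{i1}\left(0\right)|{G}_{i}=1\right\}\\ &=\text{E}\left\{{Y}_{i2}\left(1\right)-{Y}_{i1}\left(0\right)|{G}_{i}=1\right\}-\text{E}\left\{{Y}_{i2}\left(0\right)-{Y}_{i1}\left(0\right)|{G}_{i}=0\right\}\end{aligned}$$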
The last equation holds because of the parallel trend assumption: had neither group been treated, the outcomes of all countries would have followed the same trend. To estimate the ATT, we can use a linear model that is equivalent to the preceding equation. A simple model is:
$${Y}_{it}={\alpha }_{0}+{\alpha }_{1}{G}_{i}+{\alpha }_{2}{T}_{t}+{\alpha }_{3}{D}_{it}+{\beta }{X}_{it}+{\epsilon }_{it}$$
where \({G}_{i}\) and \({T}_{t}\) are dummy variables, \({D}_{it}= {G}_{i}\times {T}_{t}\), and \({X}_{it}\) are covariates. In this equation, \({\alpha }_{3}\) is the treatment effect, which we estimate in the models below.
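To illustrate why \({\alpha }_{3}\) is the treatment effect, the sketch below fits the simple model without covariates on a small made-up data set and compares the coefficient on \({D}_{it}\) with the difference in differences of the four group means. The data values and variable names are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical outcomes for two groups observed in two periods.
# Columns: G (1 = treatment group), T (1 = post period), Y (outcome).
data = np.array([
    [0, 0, 1.0], [0, 0, 1.2],   # control, before
    [0, 1, 1.5], [0, 1, 1.7],   # control, after
    [1, 0, 2.0], [1, 0, 2.2],   # treated, before
    [1, 1, 2.1], [1, 1, 2.3],   # treated, after
])
G, T, Y = data[:, 0], data[:, 1], data[:, 2]
D = G * T  # interaction term: treated group in the post period

# OLS fit of Y = a0 + a1*G + a2*T + a3*D (covariates X omitted here).
X = np.column_stack([np.ones_like(Y), G, T, D])
alpha, *_ = np.linalg.lstsq(X, Y, rcond=None)

def cell_mean(g, t):
    """Mean outcome in group g at period t."""
    return Y[(G == g) & (T == t)].mean()

did_by_means = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

# The two numbers agree up to floating-point error (both are -0.4 here).
print(alpha[3], did_by_means)
```

Because the model without covariates is saturated in \({G}_{i}\) and \({T}_{t}\), the agreement is exact; once covariates are added, \({\alpha }_{3}\) becomes a regression-adjusted version of the same contrast.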
Our study divides countries into a treatment group and a control group according to the legalization of same-sex marriage. The treatment group consists of countries that implemented policies related to the legalization of same-sex marriage: countries with formal legislation protecting the rights of same-sex couples (the Netherlands, Canada, etc.), countries that recognize same-sex civil unions (Chile, Italy, etc.), and countries where court approval of same-sex marriage is not unconstitutional and same-sex civil unions have been recognized in some regions (Japan). After screening, 35 countries, including the Netherlands, Canada, Argentina, Uruguay, France, Brazil, the United Kingdom, the United States, Germany, and South Africa, were selected as the treatment group, and 137 countries, including China, South Korea, Russia, Poland, Egypt, and Tanzania, were selected as the control group. The explained variable is the new HIV infection rate of each country (region). The control variables are GDP per capita, average total years of schooling for the adult population, HDI, share of deaths due to unsafe sanitation, life expectancy at birth, urban population, human rights protection, and prevalence of drug use disorders. Table 1 reports the characteristics of the core explanatory variables and outcome variables, Table 2 reports the new HIV infection rate in the countries of the treatment group, and Table 3 reports the characteristics of the control variables.
Static DID model
We used a static DID model to identify the impact of the legalization of same-sex marriage on the new HIV infection rate. The basic assumption of the DID model is that, had the countries that legalized same-sex marriage not implemented the policy, their new HIV infection rates would have followed the same time trend as those of the countries that did not legalize it (i.e., the parallel trend assumption, which is examined in the event study). A potential threat to this assumption is that countries do not adopt policies randomly. To address this issue, we estimate the effects while controlling for country fixed effects. The model is shown in Eq. (1):
$${AIDS}_{it}=\alpha +\beta {D}_{it}+{{X}_{it}}^{{\prime }}\theta +{\mu }_{i}+{\gamma }_{t}+{\epsilon }_{it}, \left(1\right)$$
where the subscript i denotes the \(i\)th country and t denotes the year. \({AIDS}_{it}\) is the new HIV infection rate in country i in year t. \({D}_{it}\) is a dummy variable indicating whether country i has implemented the policy of legalizing same-sex marriage in year t, with \(D=1\) if the policy has been implemented. \({\mu }_{i}\) represents country fixed effects, which capture all time-invariant country characteristics that may influence the new HIV infection rate. \({\gamma }_{t}\) represents year fixed effects, which control for shocks affecting all countries in a given year. \({X}_{it}\) represents the control variables. \({\epsilon }_{it}\) is a random error term, and standard errors are clustered at the country level.
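A model of the form of Eq. (1) can be estimated by ordinary least squares with country and year dummies. The sketch below is one way to do this with NumPy alone and is not the authors' implementation: the function name `twfe_did`, the simulated panel, and the adoption years are hypothetical, the control variables \({X}_{it}\) are omitted for brevity, and the clustered standard error uses the plain sandwich formula without a small-sample correction.

```python
import numpy as np

def twfe_did(y, d, unit, time):
    """Two-way fixed-effects DID via dummy-variable OLS.

    Returns (beta, se): the coefficient on the treatment dummy `d` and its
    standard error clustered by `unit` (no small-sample correction).
    """
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    units, u_idx = np.unique(unit, return_inverse=True)
    times, t_idx = np.unique(time, return_inverse=True)

    # Design matrix: treatment dummy, intercept, unit dummies, year dummies
    # (first unit and first year dropped to avoid collinearity).
    unit_dummies = np.eye(len(units))[u_idx][:, 1:]
    time_dummies = np.eye(len(times))[t_idx][:, 1:]
    X = np.column_stack([d, np.ones_like(y), unit_dummies, time_dummies])

    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef

    # Cluster-robust sandwich variance, computed for the first coefficient
    # only: sum over clusters of (row 0 of (X'X)^-1 times X_g' u_g)^2.
    bread = np.linalg.pinv(X.T @ X)
    var_beta = 0.0
    for g in range(len(units)):
        in_cluster = u_idx == g
        score = X[in_cluster].T @ resid[in_cluster]
        var_beta += float(bread[0] @ score) ** 2
    return float(coef[0]), float(np.sqrt(var_beta))

# Simulated panel (hypothetical): 6 countries, 10 years, staggered adoption.
rng = np.random.default_rng(0)
n_units, n_years, true_effect = 6, 10, -0.3
unit = np.repeat(np.arange(n_units), n_years)
year = np.tile(np.arange(2000, 2000 + n_years), n_units)
adopt = np.array([2003, 2006, 2008, 9999, 9999, 9999])[unit]  # 9999 = never treated
d = (year >= adopt).astype(float)
y = (0.5 * unit + 0.02 * (year - 2000) + true_effect * d
     + rng.normal(scale=0.05, size=unit.size))

beta, se = twfe_did(y, d, unit, year)
print(f"beta = {beta:.3f}, clustered SE = {se:.3f}")
```

With many countries, demeaning within country and year or using a dedicated panel-regression package is usually preferable to building the full dummy matrix.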
Dynamic DID model
The dynamic DID model is specified in Eq. (2):
$${AIDS}_{it}=\alpha +\sum _{k=-{5}^{-}}^{-2}{\beta }_{k}{D}_{ik}+\sum _{j=0}^{{5}^{+}}{\beta }_{j}{D}_{ij}+{X}_{it}^{{\prime }}\theta +{\mu }_{i}+{\gamma }_{t}+{\epsilon }_{it} \left(2\right)$$
where \({AIDS}_{it}\) is the new HIV infection rate in the \(i\)th country in year \(t\), as in Eq. (1). For each country in the treatment group, \({D}_{ik}\) and \({D}_{ij}\) equal 1 when the difference between the current year t and the treatment year \({t}_{0}\) equals k or j, respectively, and 0 otherwise. The endpoints are binned, as the limits \(-{5}^{-}\) and \({5}^{+}\) indicate: \({D}_{i,-5}=1\) when the relative year is less than or equal to -5, and \({D}_{i,5}=1\) when it is greater than or equal to 5. For countries in the control group, \({D}_{ik}\) and \({D}_{ij}\) are always 0. \({X}_{it}\) denotes the control variables mentioned above, \({\mu }_{i}\) and \({\gamma }_{t}\) are the country and year fixed effects, respectively, and \({\epsilon }_{it}\) is the error term. We take period \({D}_{-1}\), namely the year immediately before the policy took effect, as the baseline period for studying the treatment effect in each period after the legalization of same-sex marriage on the new HIV infection rate.
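The construction of the relative-time dummies can be sketched as follows, assuming a five-year window with binned endpoints and relative year -1 as the omitted baseline. The function name `event_time_dummies` and its arguments are hypothetical and serve only to illustrate the coding rule.

```python
def event_time_dummies(year, treat_year, window=5, baseline=-1):
    """Return {relative_period: 0 or 1} for one country-year observation.

    Relative periods run from -window to +window. Years further from the
    treatment year than `window` are binned into the endpoint, and the
    baseline period is omitted so that it serves as the reference.
    `treat_year=None` marks a control country, whose dummies are all 0.
    """
    periods = [k for k in range(-window, window + 1) if k != baseline]
    dummies = {k: 0 for k in periods}
    if treat_year is None:
        return dummies
    rel = max(-window, min(window, year - treat_year))  # bin the tails
    if rel != baseline:
        dummies[rel] = 1
    return dummies

# A country treated in 2005, observed in 2012: 7 years after, binned into +5.
late = event_time_dummies(2012, 2005)
# The same country in 2004: the baseline year, so every dummy is 0.
base = event_time_dummies(2004, 2005)
# A control country: every dummy is 0 in every year.
ctrl = event_time_dummies(2012, None)
print(late[5], sum(base.values()), sum(ctrl.values()))
```

Each coefficient estimated on these dummies is then read relative to the omitted baseline year.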
Table 1
Core explanatory variables and outcome variables
Variable | Observations | Mean | Standard deviation |
Legalization of same-sex marriage | 3654 | 0.2137 | 0.4100 |
New HIV infection rate (‰) (policy not implemented) | 3364 | 0.9450 | 2.6365 |
New HIV infection rate (‰) (policy has been implemented) | 290 | 0.7991 | 2.6993 |
Data source: “Our World in Data”(https://ourworldindata.org/) |
Table 2
Treatment group: new HIV infection rate before and after policy treatment (‰)
Country | Year of legalization of same-sex marriage | Before policy implementation: mean | Before policy implementation: standard deviation | After policy implementation: mean | After policy implementation: standard deviation |
Netherlands | 2000 | 0.0510 | 0.0017 | 0.0410 | 0.0077 |
Spain | 2005 | 0.0329 | 0.0060 | 0.0619 | 0.0276 |
Canada | 2005 | 0.0719 | 0.0028 | 0.0750 | 0.0142 |
Portugal | 2010 | 0.1356 | 0.0613 | 0.1323 | 0.0324 |
Argentina | 2010 | 0.1922 | 0.0123 | 0.2331 | 0.0185 |
Uruguay | 2013 | 0.1566 | 0.0260 | 0.2272 | 0.0226 |
New Zealand | 2013 | 0.0178 | 0.0055 | 0.0305 | 0.0020 |
France | 2000 | 0.0535 | 0.0029 | 0.0408 | 0.0049 |
Brazil | 2013 | 0.2640 | 0.0225 | 0.3139 | 0.0065 |
United Kingdom | 2014 | 0.0845 | 0.0104 | 0.0816 | 0.0007 |
United States | 2015 | 0.1589 | 0.0137 | 0.1992 | 0.0067 |
Mexico | 2009 | 0.0945 | 0.0122 | 0.1244 | 0.0076 |
Germany | 2000 | 0.0332 | 0.0048 | 0.0256 | 0.0020 |
Australia | 2017 | 0.0530 | 0.0078 | 0.0585 | 0.0010 |
South Africa | 2000 | 14.4938 | 0.8845 | 10.3493 | 2.3629 |
Italy | 2016 | 0.0452 | 0.0073 | 0.0618 | 0.0034 |
Croatia | 2014 | 0.0102 | 0.0024 | 0.0107 | 0.0009 |
Chile | 2015 | 0.1272 | 0.0260 | 0.1963 | 0.0162 |
Cyprus | 2015 | 0.0078 | 0.0015 | 0.0103 | 0.0001 |
Czech Republic | 2006 | 0.0063 | 0.0010 | 0.0093 | 0.0005 |
Estonia | 2014 | 0.1156 | 0.0575 | 0.1380 | 0.0058 |
Greece | 2015 | 0.0147 | 0.0057 | 0.0148 | 0.0003 |
Luxembourg | 2014 | 0.0226 | 0.0071 | 0.0318 | 0.0030 |
Denmark | 2012 | 0.0424 | 0.0029 | 0.0434 | 0.0025 |
Iceland | 2010 | 0.0180 | 0.0053 | 0.0202 | 0.0025 |
Austria | 2010 | 0.0695 | 0.0136 | 0.0771 | 0.0061 |
Norway | 2008 | 0.0358 | 0.0025 | 0.0381 | 0.0038 |
Belgium | 2003 | 0.0959 | 0.0076 | 0.0949 | 0.0133 |
Finland | 2014 | 0.0090 | 0.0016 | 0.0106 | 0.0004 |
Colombia | 2016 | 0.1204 | 0.0290 | 0.1727 | 0.0032 |
Slovenia | 2017 | 0.0032 | 0.0009 | 0.0037 | - |
Mali | 2014 | 0.0351 | 0.0112 | 0.0547 | 0.0018 |
Ireland | 2015 | 0.0149 | 0.0025 | 0.0248 | 0.0015 |
Sweden | 2009 | 0.0119 | 0.0025 | 0.0220 | 0.0024 |
Data source: “Our World in Data”(https://ourworldindata.org/) |
Table 3
Descriptive statistics of control variables
Control variable | Observations | Mean | Standard deviation | Min | Max |
GDP per capita (1000 dollars) | 3,654 | 17.56 | 18.99 | 0.51 | 114.89 |
Average total years of schooling for adult population (years) | 3,654 | 7.75 | 3.17 | 0.90 | 14.20 |
HDI | 3,654 | 0.67 | 0.16 | 0.24 | 0.95 |
Share of deaths due to unsafe sanitation (%) | 3,654 | 1.4 | 2.1 | 0.0 | 11.1 |
Life expectancy at birth (years) | 3,654 | 69.10 | 9.51 | 35.38 | 84.36 |
Urban population (%) | 3,654 | 55.2 | 22.6 | 7.4 | 100.0 |
Human rights protection | 3,654 | 0.65 | 1.49 | -2.56 | 5.34 |
Prevalence of drug use disorders (%) | 3,654 | 0.7 | 0.4 | 0.2 | 3.7 |
Data source: “Our World in Data”(https://ourworldindata.org/) |