Considering Sex and Gender in Epidemiology: a Challenge Beyond Terminology – Methodological Research


Background: Epidemiologists need tools to measure effects of gender, a complex concept originating in the humanities and social sciences which is not easily operationalized in the discipline.
Methods: We conducted a conceptual analysis and applied causal and mediation analysis methodology to standard questions in order to propose a methodologically appropriate strategy for measuring sex and gender effects in health.
Results: We define gender as a set of norms prescribed to individuals according to their attributed-at-birth sex. Gender pressure creates a systemic gap, at population level, in behaviors, activities, experiences, etc. between men and women. A pragmatic individual measure of gender would correspond to the level at which an individual complies with a set of elements constituting femininity or masculinity in a given population, place and time. However, defining and measuring gender is not sufficient to isolate the effects of sex and gender on a health outcome. We should also think in terms of pathways to define appropriate analysis strategies. Gender could also be examined as a mechanism rather than through its realization in the individual, by considering it as an interaction between sex and environment.
Conclusions: Both analytical strategies have limitations relative to the impossibility of reducing a complex concept to a single or a few measures, and of capturing the entire effect of the phenomenon. However, these strategies could lead to more accurate and rigorous analyses of the mechanisms underlying health differences between men and women, and ultimately limit the sex and/or gender bias encountered in epidemiological and clinical research studies.


Introduction
In the medical sciences, definitions of the terms "sex" and "gender" generally follow the classic definitions given by the sociologist Oakley in the 1970s (Oakley 1972): sex refers to biological differences between women and men (Johnson, Greaves, and Repta 2009; Hammarström et al. 2014) and gender refers to observed, experienced, prescribed or favored social differences, based on attributed-at-birth sex (Johnson, Greaves, and Repta 2009; Hammarström et al. 2014; King 2010; Nowatzki and Grant 2011). For more than a decade, several authors have been working to clarify these terms and have emphasized that they should not be confused (Johnson, Greaves, and Repta 2009; Hammarström et al. 2014; King 2010; Nowatzki and Grant 2011). However, although the importance of taking these factors into account in epidemiology has been repeatedly emphasized (Doyal 2001; Greaves 2012), their true integration into epidemiological practice remains marginal and often approximate (Oertelt-Prigione et al. 2010). This observation could be due to a poor understanding of the definitions by health researchers, since these terms are often used as synonyms, or interchangeably, in scientific papers (Hammarström and Annandale 2012; King 2010; Krieger 2003; Madsen et al. 2017; Ristvedt 2014). We hypothesize that these persistent difficulties stem partly from a lack of adaptation of these concepts, which were imported from disciplines that do not have the same challenges, paradigms and methods as epidemiology. We aim to transpose considerations about sex and gender developed in the social sciences and humanities and to propose a pragmatic operationalization of them, so that they can be better implemented in our field.
The term "sex" is usually defined as "the biological differences observed between women and men" (Hammarström et al. 2014). Some authors go further. Krieger proposes: "Sex is a biological construct premised upon biological characteristics enabling sexual reproduction" (Krieger 2003). Johnson et al. propose: "Sex is a multidimensional biological construct that encompasses anatomy, physiology, genes, and hormones, which together affect how we are labelled and treated in the world" (Johnson, Greaves, and Repta 2009). According to these definitions, the "sex" classification can be understood as a social construct (Laqueur 1992; Fausto-Sterling 1993), premised upon a set of biological characteristics of different natures (genes, hormones, anatomy, etc.), directly or indirectly connected with reproductive function, and on average strongly correlated with each other within each sex category. The term "gender", on the other hand, refers to the "social differences between men and women" (Hammarström et al. 2014). However, this minimalistic definition hides the complexity of the concept. Let us take three examples of definitions from the medical literature: "Gender is a multidimensional social construct that is culturally based and historically specific, and thus constantly changing. Gender refers to the socially prescribed and experienced dimensions of "femaleness" or "maleness" in a society, and is manifested at many levels." (Johnson, Greaves, and Repta 2009); "Social, cultural, and psychological aspects that pertain to the traits, norms, stereotypes and roles considered typical and desirable for those whom society has designated as male or female" (King 2010); "The different roles, responsibilities, and activities of women and men, which are socially constructed and result in different social expectations, opportunities, and experiences" (Nowatzki and Grant 2011).
These definitions suggest that gender is a complex phenomenon which is continuous (femininity ↔ masculinity) but strongly premised upon the binary classification of sex. Gender is also multidimensional (traits, norms, stereotypes, roles, responsibilities, activities, etc.), multi-level (experienced by individuals and prescribed by society, at different structural levels, and possibly heterogeneously), intersecting (with age, ethnicity, class, etc.), highly contextual, evolving over the life course and across generations, and highly diffuse (in society, family, work... in relations, in expectations, in perceptions, in actions, etc.).
Epidemiology sometimes investigates differences in health between men and women: as such, sex and/or gender can be regarded as exposures of interest. In this case, the central question posed is: why are there differences in health between men and women? The phenomena at stake in these differences can be social and/or biological and therefore involve the concepts of sex and gender in a deeply intertwined way. For example, why are little boys more likely to have allergic diseases than little girls? Are there biological predispositions related to biological sex, or do socially defined gendered exposures explain the difference? To answer these questions, epidemiologists need tools to capture the sex and gender phenomena, and these tools must be compatible with the discipline's methods. The first issue is the need to contain the complex concept of gender in one or several individual variables to be used in epidemiological analyses. But questioning the influence of sex and gender on health requires not only correct measures but also analytical approaches that aim to capture the phenomena of sex and gender in their complexity and in isolation from one another. Here, we propose to summarize potential analytical approaches.

Gender, From A Population-level Concept To An Individual Variable
Gender is first and foremost a differential social construct: it is a set of norms of different kinds (behaviors, activities, experiences) prescribed to individuals according to their attributed-at-birth sex.
Gender can therefore be conceptualized as a gender pressure that creates a systematic and systemic difference between men and women in society, observed at population level. This theorization is consistent with a sociological approach which refers to gender as a process of social division: "gender is the system of hierarchical division of humanity into two unequal halves" (Delphy 1998). The gender system divides humanity according to the typology known as "sex" and associates values, objects and properties of different kinds with each of these categories. For example, physical strength, the color blue and mathematics might be considered as masculine objects, in a given place and time. We suggest that, through socialization, this process of division is realized in individuals by a sex-differentiated normative pressure: "gender pressure". This hypothesis follows Durkheim's definition of social fact: "these types of conduct or thoughts are external to the individual but, they possess an imperative and coercive power by virtue of which they are imposed on them, whether they like it or not" (Durkheim 1895). This normative pressure will, in our example, encourage men to develop their strength, to wear blue clothes, to love mathematics. Even if they do not adopt all valued attributes, a difference will be observed in the distribution of these attributes among individuals categorized by sex at the population level. Conversely, if gentleness is a feminine attribute in a given population, there will be a difference in the proportion of people described as gentle by sex, even though not all women will be described as gentle, and some men will be. Gender is performed by a gender pressure and measured at the population level by a distributional difference between the two sex categories.
Since gender is socially performed, through the collective pressure of a specific group, we consider it as a population-level concept. In epidemiology, however, variables are typically defined at the individual level, which leads to several questions: how can this complex phenomenon be measured? What indicator(s) can be implemented to measure gender?
Gender is often associated with gender identity. Gender identity is defined by Johnson et al. as "how an individual sees themselves on the continua of female or male (or as a "third gender" or "two-spirited")" (Johnson, Greaves, and Repta 2009). In fact, it would be more correct to speak of "continua of socially-prescribed femininity or masculinity", because this perception and experience of self is highly dependent upon social prescriptions. For example, a born-male individual may feel closer to what their society prescribes as desirable for a woman and therefore see, feel, and experience themselves as feminine.
Gender identity is thus associated with gender pressure and also with other social determinants (e.g., capacity for self-determination), but as a measure it has limitations. It can only be measured through self-report and may be reductive if an individual feels like a woman but does not perform this femininity through their behaviors, activities, etc. Beyond self-experienced identity, the gender of an individual is realized in what they are, what they do and what they experience. For example, an individual may be said to be masculine if they display behaviors, characteristics, positions or activities that their group considers to be masculine, thus performing a gender role (Johnson, Greaves, and Repta 2009). In addition, if an individual is perceived as a man, other individuals will act in a certain way towards them, certain opportunities will be more or less frequently presented to them, and they will be exposed to their environment differently than if they had been considered a woman (what Johnson et al. refer to as "gender relationship" and "institutional gender"). These different dimensions can be observed, measured and used to define a resulting expression of gender pressure in an individual. However, it is worth repeating that what is perceived as masculine or feminine is contextual and therefore it would not be appropriate to define a universal and fixed measure of individual gender.
A composite individual indicator can be constructed, based on the presence or absence of several gendered dimensions, defined from the sex-differential distribution of these dimensions in the population. This indicator corresponds to a measure of the level at which an individual complies with a set of elements constituting femininity or masculinity in a given population, place and time. The "gender diagnosis" methods meet this objective. They are defined as follows: "gender diagnostic (GD) refers to the Bayesian probability that an individual in a given population is male or female based on sets of gender-related indicators (such as occupational preference ratings)" (Lippa, Martin, and Friedman 2000). In practice, this corresponds to the probability of being "predicted male" from dimensions associated with masculinity, or "predicted female" from dimensions associated with femininity. This means considering an individual as being more or less masculine because they have a greater or smaller number of masculine characteristics, considered as such because these characteristics are more frequent or of a higher value among men within the population studied. This is a common method of measuring gender in gender-sensitive studies (Smith and Koehoorn 2016; Pelletier, Ditto, and Pilote 2015). We have summarized the definitions and measures in Table 1.
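To make the idea of a gender diagnostic more concrete, the following minimal sketch computes a probability of being "predicted male" from binary gendered indicators. It is only an illustration under stated assumptions: it uses a naive Bayes rule (indicators treated as conditionally independent given sex), which is not necessarily the estimation method used by the cited authors, and the function name, the toy sample and the two indicators are hypothetical.

```python
from math import prod

def gender_diagnostic(sample, person):
    """Return P(male | indicators) under a naive Bayes assumption.

    sample: list of (sex, indicators) pairs, with sex in {"m", "f"} and
            indicators a tuple of 0/1 values (hypothetical gendered dimensions).
    person: tuple of 0/1 indicators for the individual to score.
    """
    def likelihood(sex):
        rows = [ind for s, ind in sample if s == sex]
        # Laplace-smoothed estimate of P(indicator_j = 1 | sex)
        p = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
             for j in range(len(person))]
        # Probability of the observed indicator pattern given sex
        return prod(pj if x else 1 - pj for pj, x in zip(p, person))

    prior_m = sum(1 for s, _ in sample if s == "m") / len(sample)
    num = prior_m * likelihood("m")
    den = num + (1 - prior_m) * likelihood("f")
    return num / den

# Toy population (assumption): two indicators, both more frequent among men.
toy_sample = [
    ("m", (1, 1)), ("m", (1, 0)), ("m", (1, 1)), ("m", (0, 1)),
    ("f", (0, 0)), ("f", (0, 1)), ("f", (0, 0)), ("f", (1, 0)),
]

score_masculine = gender_diagnostic(toy_sample, (1, 1))
score_feminine = gender_diagnostic(toy_sample, (0, 0))
score_mixed = gender_diagnostic(toy_sample, (1, 0))
```

Because the probabilities are estimated from the study sample itself, the resulting score is local to that population, in line with the contextual definition of gender used here.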
Gender-diagnostic methods have some limitations. Seeking to capture gender from the presence or absence of one or more gendered dimension(s) in an individual requires two precautions. Firstly, this phenomenon is not necessarily consistent at the individual level. For example, a man may have a job considered to be feminine, such as caregiving or midwifery, but display characteristics considered to be masculine in his family environment (measured from the burden of domestic tasks for example).
Secondly, the presence of one or more gendered dimensions in an individual is not necessarily due to gender pressure alone. For example, if, in a given group, smoking is much more common among men than among women, this behavior will be considered as masculine. But the fact that an individual smokes is not determined solely by this mechanism. It is not only a marker of a person's gender but also a marker of other social determinants such as socio-economic position, social network, etc. We therefore cannot consider a gendered dimension as a "pure and perfect" proxy of individual gender. This limitation should be kept in mind when trying to isolate the gender effect on an outcome by using gendered variables as gender markers.
In summary, a unique, universal and fixed individual gender variable cannot be defined, because it would not be compatible with the concept of gender as a normative system, which is by definition heterogeneous, multidimensional, multi-level, contextual, intersectional and evolving. Therefore, we propose to define a local variable of individual gender, as a gender performance resulting from the gender pressure imposed on an individual in a specific population, because it would be defined based on the specific norms of that population. Despite its limitations, this "gender diagnostic method" is a pragmatic tool. To take into account the heterogeneity of gender performance in an individual, several variables related to different ways of characterizing groups and contexts could also be defined, for example "occupational gender" and "domestic gender", or "gender role" and "gender relationship" (categories depending on available data and constraints of the research question). In any case, this variable (or these variables) must be used with care, bearing in mind that they will never be able to capture all the dimensions of such a diffuse phenomenon and that they will also capture social phenomena that are not due to gender.

Attributed-at-birth sex, in almost all cases (>98%) based on anatomical characteristics observed in the newborn.

Population-level gender
Definition: Population-level gender is the fact that some individual dimensions (traits, roles, responsibilities, activities, etc.) are differentially prescribed/observed according to sex in a social group.
By extension, gender pressure: differential and social prescription of one or several dimension(s) according to sex in a group. The stronger the gender pressure is, the more the group is said to be "gendered".
Measure: The magnitude and frequency of the observed differences in several social dimensions in a group.

Gendered dimension
Definition: Dimension differentially prescribed according to sex in a group. The stronger the difference is, the more the dimension is said to be "gendered". E.g., a dimension more prescribed/observed for men than for women can be said to be "masculine".
Measure: For a given dimension, the magnitude and the direction of the distributional difference observed between individuals according to their attributed-at-birth sex in a group.

Individual gender
Definition: The individual gender is the individual realization of the population-level gender. We make a distinction between:

Gender identity
Definition: How an individual sees themselves on the continua of femininity and masculinity (or outside these continua) defined by their social group.
Measure: Self-reported experience.

Gender diagnostic
Definition: How much an individual shares one or several gendered dimension(s) of a given population, place and time. E.g., an individual who performs one or several "feminine" dimensions can be said to be "feminine" for this or these dimension(s).
Measure: Predicted sex, or probability of being of one sex or the other, computed from one or several gendered dimensions.

Gender transgressivity
Definition: How far one or several dimensions of an individual are from the masculinity defined in their population if they are male, or from the femininity if they are female (see Supplementary data).
Measure: The fact of not having the same predicted sex as the attributed-at-birth sex, or the probability of being of a different sex than the one attributed at birth.

It Is Not Just About Sex And Gender, It Is About Pathways
Even if the concepts of sex and gender have been well defined, differentiated and measured, with all the limitations mentioned, this is still not sufficient to isolate and analyze the effects of sex and gender on a health outcome, denoted Y. This is mainly because individual gender, defined as the result of gender pressure on an individual, is strongly associated with sex: if a newborn baby is defined as male, he will be socialized as a boy, whereas if defined as female, she will be socialized as a girl. The gendered characteristics that each child will have, even if modulated by other social and individual factors, thus strongly depend on their sex. Therefore, an association between an individual score of gender and an outcome Y cannot be interpreted as proof that gender pressure explains part of Y, because sex could be a confounder in this association. The reverse interpretation would be equally flawed: we cannot conclude from an association between birth-sex and an outcome Y that biological mechanisms alone, and not gender pressure, explain this association, because the effect of sex on Y can be mediated by gender. It is therefore insufficient simply to avoid confusing sex with gender and the variables that measure them; we also have to avoid confusing the mechanisms that relate one to the other. To grasp these issues, we propose to think in terms of causal paths to clearly identify the mechanisms of interest and, on this basis, define our analysis strategy. Figure 1 allows us to visualize the sequence of causes and therefore the causal structure. This graph represents a very general scenario of a sequence of two exposures X1 and X2 that cause an outcome Y, each node representing a variable or set of variables. Fundamental and independent determinants of X1, X2 and Y are innate factors, including sex, and environmental factors.
Strictly speaking, the "effect of sex" on Y corresponds to all directed paths that begin at the Sex node and end at Y (double arrows in Figure 2.a). However, it is sometimes implicitly suggested that when we talk about the "effect of sex", we are only talking about biological mechanisms and that we are therefore only referring to paths that would not pass through social factors. In this case, the biological effect of sex would comprise the direct effect (double continuous arrow in Figure 2.b) and the indirect effects which pass through exposures not linked to the environment (double dashed arrows, under the hypothesis of independence between Env and X1), assuming that no mediators with environmental determinants have been omitted.
When we observe an association between Sex and an outcome Y, this finding corresponds to the total effect of Sex on Y (Figure 2.a). Again, we cannot know if this total effect is in fact explained by biological or social mechanisms, even if we have defined the Sex variable as a biological phenomenon. It is therefore important to determine if this is really the effect of interest. By using these graphic representations, we can also highlight the complexity of isolating the biological effect of sex, which would require us first to make the strong assumption of independence between the environment and all the intermediate factors (as for X1 in our example), and second to "close" all other paths to identify the direct effect.
We can also focus on gendered exposure(s). For example, if the probability of playing football X differs according to birth-sex and to the place of residence, we will say that this activity is socially determined and gendered. Suppose we want to identify the risk factors for a rupture of the anterior cruciate ligament Y, assuming that there is no direct effect of sex on the probability of this pathology occurring but that there is an effect of playing football X (see Figure 2.c). Since playing football is a risk factor for the disease and a gendered activity, we would find here a "sex effect", i.e., a statistical association and even a causal pathway (mediated by football X) between the variable Sex and Y. In this example, X is a gendered activity, but we could have used another gendered dimension, gender identity, a set of gendered variables or a gender diagnosis, as described above and as represented in Figure 2.d. This figure allows us to visualize the potential confounding effect of sex and environment when we look at the effect of this kind of gender marker on Y. If we wanted to identify and measure the specific effect of individual gender, we would have to make sure that we could control for all these confounders.
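The football example can be illustrated numerically. The sketch below uses exact probabilities rather than data; all values are hypothetical and chosen only for illustration, and the environment (place of residence) is deliberately left out to keep the example minimal. It shows that a "sex effect" on Y appears in the crude comparison even though, by construction, sex has no direct effect once football X is fixed.

```python
# Hypothetical probabilities, chosen only for illustration.
P_FOOTBALL = {"m": 0.40, "f": 0.10}   # P(X=1 | Sex): football is gendered
P_RUPTURE = {1: 0.05, 0: 0.01}        # P(Y=1 | X): identical for both sexes

def risk_by_sex(sex):
    """Marginal P(Y=1 | Sex), obtained by averaging over football X."""
    p_x = P_FOOTBALL[sex]
    return p_x * P_RUPTURE[1] + (1 - p_x) * P_RUPTURE[0]

def risk_by_sex_and_football(sex, x):
    """P(Y=1 | Sex, X): by construction it does not depend on sex."""
    return P_RUPTURE[x]

# Crude risk difference between sexes: the total effect of Sex on Y,
# here entirely mediated by the gendered activity X.
total_effect = risk_by_sex("m") - risk_by_sex("f")

# Risk difference between sexes within each stratum of X:
# no remaining (direct) effect of sex.
direct_effect = {
    x: risk_by_sex_and_football("m", x) - risk_by_sex_and_football("f", x)
    for x in (0, 1)
}

print(f"crude risk difference: {total_effect:.3f}")
print(f"risk difference within strata of X: {direct_effect}")
```

In real data, the stratified comparison would only isolate the direct effect if the confounders of the X → Y relation, such as the environment, were also controlled for.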
These examples demonstrate that it is necessary to ensure that the assumptions regarding mechanisms and pathways of interest are clearly defined a priori.
When we want to understand the health effects of sex and gender, i.e., describe them, distinguish them and explore their mechanisms, different questions can be addressed that do not involve the same analytical strategies. If the question is: "Are the differences in health observed between men and women explained, at least partially, by social mechanisms?", then our focus will be on the pathways of a Sex → Y effect that operate through social dimensions, i.e., on the socially-mediated indirect effect of sex. If the question is "Do one or more gendered dimensions, such as a gender diagnosis, have an effect on health?", then our focus will be on the total effect of a gendered exposure, as described in Figure 2.c. It is therefore important to distinguish, name, and define the multiple pathways that link sex and gender to the outcome. To achieve this, we propose a typology of several pathways of interest in Table 2, with corresponding examples of counterfactual formulation ("if the situation had not been as it is"). We will denote Y_{S=s} or Y_{S=s, E=e} the potential outcome had a subject been exposed respectively to the counterfactual interventions S=s or {S=s and E=e}. In this table, the gender variable is described as a binary variable G={f;m} in order to simplify the presentation rather than for a conceptual reason. Ideally in this typology, G should represent "being / acting / living / etc. as a man" (or "as a woman"), i.e., everything that socially makes a man (or a woman) in a given time, place and population. In this case, the direct effect, RES (what does not pass through G), would correspond to the non-socially mediated or biological effect of sex in this time, place and population. But, as we said before, gender is so diffuse that it is impossible to think that we can capture all its dimensions in one or a few variables.
With this analytical strategy, at most we can: (1) verify the hypothesis that social pathways (SMIES) explain, at least in part, a sex effect (TES), and (2) obtain an order of magnitude of the biological effect of sex on a phenomenon Y, depending on the conceptual extensivity of G. But the RES cannot be said to be the pure biological effect of sex, even if we have considered many gendered dimensions in G.
Strategies to take into account sex and gender when these phenomena are not exposures of interest are detailed in Supplementary Data.

An Alternative Strategy: Gender As A Sex-environment Interaction
Rather than using an individual variable as a proxy for a population-level phenomenon, gender could be examined as a mechanism. This mechanism can be considered as an interaction between sex and environment. Indeed, we said above that a dimension is gendered when it is a descendant of both sex and environment, but there are cases where a dimension is a descendant of both sex and environment and is not gendered. For example, let us imagine that the head circumference of newborns is on average different depending on sex at birth. In a given society, pregnant women of one caste eat differently from others and this diet has an effect on the head circumference of newborns. In this society, however, the sex of the child before birth is not known. In this case, the newborn's head circumference is a descendant of sex and the environment, but it is not gendered. It would be if the sex of the unborn child was known and if the pregnant women also ate differently according to this knowledge. So, a dimension is said to be gendered not only when it differs according to sex and to the environment, but also when the environmental cause varies according to sex. This is the definition of an interaction phenomenon. This is why we can refer to gender as a "differential distributor" of exposures. In terms of social and biological explanations, a sex-difference that exists whatever the social group and the culture is likely to be biological, but if its effect varies greatly according to social classes or cultures, it may be mediated by social mechanisms. This echoes anthropologist Margaret Mead's conclusions that temperaments (gentleness, violence, etc.) attributed to men or women did not stem from biological sex but were socially constructed because they varied from one society to another (Mead 1963).
We can start from this definition to define a strategy that will consider gender and environment in a more intertwined way. We denote the social environment E, with E=0 if the social environment is non-gendered (or less gendered in practice, as a non-gendered environment generally does not exist) and E=1 if gendered. We denote S the attributed-at-birth sex. We want to distinguish the effect of sex, which would occur even in a non-gendered environment; the effect of the environment, not related to a gender effect; and the effect of gender, as a sex-differentiated effect of the environment, or a socially varying effect of sex (equivalent formulation). The way of identifying effects is totally different from what we have proposed above, so we named them differently (Table 3).

Table 3. Synthetic typology of sex S, social environment E and gender G effects on a health outcome Y

As with the previous strategy, there are some limitations. Firstly, we simply shift the problem of measuring a complex concept with one or several variables from the realm of gender to that of the social environment. Secondly, even if the WOGTES were large and the TEG null, we could not conclude that Y is not at all influenced by gender, because the measured effect would depend on the category of the environment chosen as the reference group (E=0), and a perfectly non-gendered environment (E=0) is not realistic (except in some special cases, such as a sex-blind in-utero environment). Thirdly, an important condition for the successful use of this strategy would be to have a very socially heterogeneous population in order to estimate the variability of the gender effect across social groups through this TEG. On the other hand, the interest of this strategy is that it is more compatible with the population-level nature of gender, considered as a sex-environment interaction, or equivalently taking sex into account as a modifier of the effect of the environment.
This approach makes it possible not to define a measure of gender which, even if defined in a study in a contextual way and with all precautions, always runs the risk of being generalized and essentialized afterwards.
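As an illustration of the interaction reasoning, the following minimal sketch computes, on the additive scale, the sex contrast in the reference environment (E=0), the environment contrast in the reference sex (S=0), and the interaction contrast (how much the sex difference changes between E=0 and E=1). It is a sketch under stated assumptions: the data are invented, the function names and the 0/1 coding of S are hypothetical, no confounding is handled, and these contrasts only mirror the logic of the typology in Table 3 without reproducing its exact definitions.

```python
from statistics import mean

def cell_means(records):
    """Mean outcome Y in each (S, E) cell; records are (s, e, y) tuples."""
    cells = {}
    for s, e, y in records:
        cells.setdefault((s, e), []).append(y)
    return {cell: mean(values) for cell, values in cells.items()}

def additive_contrasts(records):
    """Return (sex contrast at E=0, environment contrast at S=0, interaction)."""
    m = cell_means(records)
    sex_in_reference_env = m[(1, 0)] - m[(0, 0)]
    env_in_reference_sex = m[(0, 1)] - m[(0, 0)]
    # Difference between the sex contrast at E=1 and the sex contrast at E=0
    interaction = (m[(1, 1)] - m[(0, 1)]) - (m[(1, 0)] - m[(0, 0)])
    return sex_in_reference_env, env_in_reference_sex, interaction

# Invented records (s, e, y): S is coded 0/1 arbitrarily, E=0 denotes the
# less gendered environment and E=1 the more gendered one.
toy_records = [
    (0, 0, 0.5), (0, 0, 1.5),
    (1, 0, 1.0), (1, 0, 2.0),
    (0, 1, 2.0), (0, 1, 2.0),
    (1, 1, 3.0), (1, 1, 5.0),
]

sex_effect, env_effect, interaction_effect = additive_contrasts(toy_records)
```

A non-zero interaction contrast means that the sex difference in Y varies with the social environment, which is the signature of gender in this second strategy. Its size depends on which environment is taken as the reference category.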

Main Recommendations
In summary, we can make some recommendations for improving analytical strategies when exploring the mechanisms of health differences between men and women.
1. Choose the most suitable analysis strategy between: (a) defining and measuring gender as the individual result of gender normative pressure or (b) defining and measuring gender as a sex-environment interaction. A fundamental criterion is the type of study population: strategy (a) is more appropriate if socio-cultural characteristics of the population, and so its gender normative pressure, are rather homogeneous; and strategy (b) is more appropriate if socio-cultural characteristics of the population are heterogeneous.
2. According to the research question, specify the chosen strategy and, with the help of tools such as directed acyclic graphs, define the specific effect to be estimated: is it the total effect of sex, the part of the sex effect that is mediated by social mechanisms, or a sex-controlled gender effect?
3. Define required variables: If strategy (a) is chosen, use available data and mechanistic assumptions to select (see Supplementary Data) the variables used to build a gender score, based on the distribution of these variables according to sex in the study population (gender diagnosis). If possible, define several gender variables for several dimensions (e.g., professional/domestic, etc.).
If strategy (b) is chosen, define a summary variable of the social environment in the sample and globally assess the variability of differences between men and women according to this variable in the study population (e.g., income gap, employment access gap, age at first child, etc.). Define a reference category with the least gendered groups.
4. Discuss the limitations of the chosen method and interpret the results with these precautions.
If strategy (a) is chosen, evaluate and discuss the share of the gender phenomenon captured by the individual gender variable(s) and the share of this variable(s) that captures non-gender related phenomena (variability of the measurement depending on other social characteristics for example).
If strategy (b) is chosen, evaluate and discuss the share of the gender phenomenon captured: is the population sufficiently heterogeneous? Has it been possible to characterize the different sociocultural groups with the "social environment" variable? To what extent is the reference group still gendered?
These strategies must be understood within the defined perspective of understanding the mechanisms of gender differences in health. Studies that focus specifically on intersex, transgender, transsexual populations would require other approaches that are not described here. These strategies are also suited only to a comprehensive and exploratory approach to the issue: they seek to explore the nature (biological or social) of observed differences between men and women when it does not seem trivial. The construction of a gender score seeks to capture a latent phenomenon but does not necessarily imply that the variables used for this score are the risk factors for the health outcome. No a priori assumptions are therefore made on the type of specific exposures involved, which could be modified from a public health perspective. This objective could come in a second step, once the involvement of social mechanisms has been identified.
Since the gender phenomenon is, as we have reiterated, contextual, the score constructed in a study on a specific population cannot be directly transposed to another population. The use of this kind of score can lead to the conclusion that a difference in health between men and women is, at least partly, explained by social mechanisms. However, the estimated size of the effect of these mechanisms cannot be generalized to other populations, since the ways in which gender pressure is performed in these other populations are likely to differ and therefore to influence health outcomes differently. The alternative approach, based on the study of sex-environment interaction, might more easily avoid the risk of essentializing differences, since it is precisely based on the variability of the processes involved. It may also make it possible to capture population phenomena by characterizing groups from the level of gender inequalities observed within them. These phenomena are otherwise difficult to capture with epidemiological methods based on individual-level variables, although they are central to understanding the gender process, which is based not only on the prescription of sex-differentiated norms but also on their interrelated hierarchical relations.

Limitations
We argue that causal analysis strategies can help us refine our objectives and assumptions and conduct more rigorous analyses. It is from these methods that our approach has been built. However, this method has some drawbacks, including the need to define, for each factor, its "counterfactual". Firstly, this can give the impression that a binary categorization is unavoidable or that we are reinforcing it: if I was not born a female, it is because I was born a male; if I am not socialized as a woman, it is because I am socialized as a man. In practice, this may correspond to the way in which, in a gendered society based on a bipolarization of classifications, people are actually exposed or not to a kind of socialization. But this binarity must not be essentialized. A continuous masculinity (or femininity) score could also be constructed; its counterfactual formulation would be: "being socialized in a very masculine way" and "being socialized in a less masculine way", without considering feminine socialization as the exact symmetry of this measure. Secondly and most importantly, the definition of counterfactuals is constrained by reality, since models are estimated from what is observed. So, it will usually not be possible to compare sex health differences observed in a given population with sex health differences "that would be observed in a population that would not know the gender phenomenon", even though only this comparison would really meet our objective. Similarly, the effect of a sex-independent masculine socialization will not be optimally measured, because, in our societies, moving from a female-socialization to a male-socialization for a female individual will never be equivalent to moving from a female-socialization to a male-socialization for a male individual. Perfect counterfactuals do not exist where gender is concerned.
Despite these limitations, approaches supported by causal analysis methods could lead to more accurate and rigorous analyses of the mechanisms underlying health differences between men and women, and may ultimately limit the gender bias encountered in epidemiological and clinical research studies.

Figure 2
Effects of Sex and Gender

Supplementary Files
Supplementary files associated with this preprint: Supplementarydata.docx