Note that the terms "male" and "female" are used when referring to sex assigned at birth, and the terms "man", "woman", and "gender diverse" are used when referring to gender identity. For the sake of clarity, the terms "cisgender man" and "cisgender woman" are used interchangeably with "man" and "woman".
2.1 Design and participants
This cross-sectional, quasi-experimental study recruited N = 222 cisgender men, cisgender women, and gender diverse people (e.g., non-binary, gender fluid, genderqueer, transgender people) between the ages of 18 and 69 (M = 27.92; SE ± 8.97). Because sex hormones are a major component of this project and vary across the lifespan, we recruited adults from a wide age range (18 years and older) to help us investigate endocrine effects on cognitive abilities. Apart from age, this study had no exclusion criteria. We chose liberal criteria to maximize the representation of sexually and gender diverse people, who often experience more stigma and stress that can exacerbate health conditions (73). Accordingly, factors that could have served as exclusion criteria were treated as potential confounders. Participants lived in the greater Montreal area and had to be fluent in either French or English.
Three virtual recruitment posters were developed to recruit from three populations: cisgender heterosexual individuals, sexually diverse (non-heterosexual) individuals, and gender diverse people. Recruitment was primarily done via Facebook posts in lesbian, gay, bisexual, transgender, and queer (LGBTQ+) community groups and university community groups, and through partnerships with LGBTQ+ organizations.
Sample descriptive statistics are summarized in Table 1 as a function of gender identity, covering demographics, sociocultural gender variables, lifestyle behaviors, contraception and menstruation, and general health. Prior to conducting our study, we engaged in a participatory practice (74) with gender diverse communities: our team conducted semi-structured qualitative interviews with 33 gender diverse people. These interviews allowed us to identify the health and wellness needs of this underrepresented community and to verify that our research methodology and variables addressed its concerns.
(INSERT TABLE 1)
Table 1: Descriptive Statistics and Group Differences
| Characteristics | Sample | Cismen | Ciswomen | Gender diverse | p |
| --- | --- | --- | --- | --- | --- |
| N | 222 | 82 | 74 | 66 | |
| Demographic | | | | | |
| Age, M (SE)a | 27.92 (8.98) | 29.88 (10.88)c | 26.49 (8.49)b | 27.11 (6.13) | 0.041 |
| Race/ethnicity | | | | | 0.053 |
| White, n (%) | 175 (78.8) | 56 (68.3)e | 66 (89.2)f | 53 (80.3)e,f | - |
| Black, n (%) | 8 (3.6) | 6 (7.3)e | 2 (2.7)e | 0 (0)e | - |
| Asian, n (%) | 6 (2.7) | 2 (2.4)e | 1 (1.4)e | 3 (4.5)e | - |
| Mixed, n (%) | 15 (6.8) | 9 (11.0)e | 3 (4.1)e | 3 (4.5)e | - |
| Maghrebian, n (%) | 12 (5.4) | 7 (8.5)e | 2 (2.7)e | 3 (4.5)e | - |
| Hispanic, n (%) | 5 (2.3) | 2 (2.4)e | 0 (0)e | 3 (4.5)e | - |
| Indigenous, n (%) | 1 (0.5) | 0 (0)e | 0 (0)e | 1 (1.5)e | - |
| Mother tongue | | | | | 0.022 |
| French, n (%) | 169 (76.1) | 66 (80.5)e,f | 61 (82.4)f | 42 (63.6)e | - |
| English, n (%) | 22 (9.9) | 5 (6.1)e | 3 (4.1)e | 14 (21.2)f | - |
| Bilingual, including French, n (%) | 10 (4.5) | 3 (3.7)e | 3 (4.1)e | 4 (6.1)e | - |
| Others, n (%) | 21 (9.5) | 8 (9.8)e | 7 (9.5)e | 6 (9.1)e | - |
| Occupational status | | | | | 0.034 |
| Workers, n (%) | 106 (47.7) | 45 (54.9)e | 34 (45.9)e | 27 (40.9)e | - |
| Students, n (%) | 92 (41.4) | 28 (34.1)e | 37 (50.0)e | 27 (40.9)e | - |
| Neither workers nor students, n (%) | 24 (10.8) | 9 (11.0)e,f | 3 (4.1)f | 12 (18.2)e | - |
| Working hours/week, M (SE) (only for workers) | 30.57 (15.55) | 30.98 (16.39) | 31.79 (16.49) | 28.33 (13.01) | 0.674 |
| Studying hours/week, M (SE) (only for students) | 24.38 (13.66) | 21.75 (14.73) | 24.14 (13.47) | 27.44 (12.64) | 0.303 |
| Ratio of men/women at work/school | 1.24 (1.52) | 1.64 (1.70)c | 0.93 (1.00)b | 1.12 (1.69) | 0.013 |
| Gender diversity at work/school, % (SE) | 7.43 (7.83) | 5.66 (6.96)d | 6.81 (7.21)d | 10.34 (8.79)c,d | 0.001 |
| Socioeconomics | | | | | |
| Education, years in school, M (SE) | 16.44 (2.71) | 16.61 (3.00) | 15.89 (2.40) | 16.83 (2.59) | 0.092 |
| Civil status | | | | | 0.293 |
| Single, n (%) | 117 (52.7) | 35 (42.7)e | 44 (59.5)e | 38 (57.6)e | - |
| In a relationship, n (%) | 82 (36.9) | 37 (45.1)e | 26 (35.1)e | 19 (28.8)e | - |
| Married, n (%) | 11 (5.0) | 5 (6.1)e | 2 (2.7)e | 4 (6.1)e | - |
| Divorced, n (%) | 10 (4.5) | 5 (6.1)e | 2 (2.7)e | 3 (4.5)e | - |
| Missing, n (%) | 2 (0.9) | 0 (0.0) | 0 (0.0) | 2 (3.0) | - |
| Relationship preference | | | | | <0.001 |
| Monoamorous, n (%) | 133 (59.9) | 60 (73.2)e | 50 (67.6)e | 23 (34.8)f | - |
| Polyamorous, n (%) | 48 (21.6) | 16 (19.5)e | 5 (6.8)e | 27 (40.9)f | - |
| Missing, n (%) | 41 (18.5) | 6 (7.3) | 19 (25.7) | 16 (24.2) | - |
| Sex and Gender | | | | | |
| Birth-assigned Sex | | | | | <0.001 |
| Male, n (%) | 99 (44.6) | 82 (100.0)e | 0 (0.0)f | 17 (25.8)g | - |
| Female, n (%) | 123 (55.4) | 0 (0.0)e | 74 (100.0)f | 49 (74.2)g | - |
| Sexual orientationh | | | | | <0.001 |
| Heterosexual, n (%) | 85 (38.3) | 46 (56.1)e | 36 (48.6)e | 3 (4.5)f | - |
| Non-heterosexual, n (%) | 137 (61.7) | 36 (43.9)e | 38 (51.4)e | 63 (95.5)f | - |
| Gender roles | | | | | |
| Bem masculine gender roles, M (SE)i | 4.36 (0.78) | 4.45 (0.79) | 4.38 (0.73) | 4.23 (0.83) | 0.228 |
| Bem feminine gender roles, M (SE) | 5.37 (0.70) | 5.22 (0.74)c | 5.51 (0.67)b | 5.38 (0.65) | 0.033 |
| Bem neutral gender roles, M (SE) | 4.25 (0.51) | 4.37 (0.51)d | 4.24 (0.51) | 4.10 (0.47)b | 0.005 |
| Storms’ masculinity score, M (SE)j | 2.89 (0.89) | 3.60 (0.66)c,d | 2.15 (0.70)b,d | 2.85 (0.58)b,c | <0.001 |
| Storms’ femininity score, M (SE) | 2.82 (0.92) | 2.07 (0.63)c,d | 3.63 (0.64)b,d | 2.83 (0.66)b,c | <0.001 |
| Gender-affirming and hormonal therapy | | | | | <0.001 |
| Neither, n (%) | 194 (87.4) | 79 (96.3)e | 71 (95.9)e | 44 (66.7)f | - |
| Hormonal Therapy (HT), n (%) | 20 (9.0) | 3 (3.7)e | 3 (4.1)e | 14 (21.2)f | - |
| Gender-affirming surgery and HT, n (%) | 8 (3.6) | 0 (0.0)e | 0 (0.0)e | 8 (12.1)f | - |
| Behavioral | | | | | |
| Tobacco smoking | | | | | 0.699 |
| Smokers, n (%) | 15 (6.8) | 8 (9.8)e | 4 (5.4)e | 3 (4.5)e | - |
| Social smokers, n (%) | 44 (19.8) | 16 (19.5)e | 16 (21.6)e | 12 (18.2)e | - |
| Non-smokers, n (%) | 163 (73.4) | 58 (70.7)e | 54 (73.0)e | 51 (77.3)e | - |
| Alcohol consumption, weekly | | | | | 0.010 |
| 0, n (%) | 60 (27.0) | 20 (24.4)e,f | 14 (18.9)f | 26 (39.4)e | - |
| 1-6, n (%) | 129 (58.1) | 44 (53.7)e | 49 (66.2)e | 36 (54.5)e | - |
| 7 or more, n (%) | 33 (14.9) | 18 (22.0)e | 11 (14.9)e,f | 4 (6.1)f | - |
| Cannabis consumption | | | | | 0.004 |
| None, n (%) | 134 (60.4) | 45 (54.9)e | 56 (75.7)f | 33 (50.0)e | - |
| Occasionally (monthly or annually), n (%) | 38 (17.1) | 18 (22.0)e | 10 (13.5)e | 10 (15.2)e | - |
| Regularly (daily or weekly), n (%) | 50 (22.5) | 19 (23.2)e,f | 8 (10.8)f | 23 (34.8)e | - |
| Illicit drug consumption | | | | | 0.001 |
| None, n (%) | 187 (84.2) | 59 (72.0)e | 71 (95.9)f | 57 (86.4)e,f | - |
| Occasionally (monthly or annually), n (%) | 29 (13.1) | 18 (22.0)e | 3 (4.1)f | 8 (12.1)e,f | - |
| Regularly (daily or weekly), n (%) | 6 (2.7) | 5 (6.1)e | 0 (0.0)e | 1 (1.5)e | - |
| Contraception and menstruation | | | | | |
| Postmenopausal, n (%) | 7 (3.2) | 0 (0.0)e | 2 (2.7)e,f | 5 (7.6)f | 0.031 |
| Contraceptive use | | | | | <0.001 |
| None, n (%) | 180 (81.1) | 82 (100.0)e | 45 (60.8)f | 53 (80.3)g | - |
| Contraceptive pill, n (%) | 27 (12.2) | 0 (0.0)e | 20 (27.0)f | 7 (10.6)g | - |
| Hormonal IUD, n (%) | 7 (3.2) | 0 (0.0)e | 3 (4.1)e | 4 (6.1)e | - |
| Copper IUD, n (%) | 2 (0.9) | 0 (0.0)e | 2 (2.7)e | 0 (0.0)e | - |
| Other hormonal contraceptives, n (%) | 6 (2.7) | 0 (0.0)e | 4 (5.4)e | 2 (3.0)e | - |
| General health | | | | | |
| Medication use, n (%) | 87 (39.2) | 23 (28.0)e | 23 (31.1)e | 41 (62.1)f | 0.124 |
| Neurological condition, n (%) | 13 (5.9) | 4 (4.9)e | 2 (2.7)e | 7 (10.6)e | 0.107 |
| Cardiovascular condition, n (%) | 13 (5.9) | 2 (2.4)e | 4 (5.4)e | 7 (10.6)e | 0.034 |
| General condition, n (%) | 61 (27.5) | 19 (23.2)e | 16 (21.6)e | 26 (39.4)e | <0.001 |
| Psychiatric history | | | | | <0.001 |
| None, n (%) | 64 (28.8) | 33 (40.2)e | 25 (33.8)e | 6 (9.1)f | - |
| Past or present history, n (%) | 32 (14.4) | 11 (13.4)e | 9 (12.2)e | 12 (18.2)e | - |
| Family history, n (%) | 46 (20.7) | 20 (24.4)e | 19 (25.7)e | 7 (10.6)e | - |
| Both past/present & family history, n (%) | 79 (35.6) | 18 (22.0)e | 21 (28.4)e | 40 (60.6)f | - |
| Missing, n (%) | 1 (0.5) | 0 (0.0) | 0 (0.0) | 1 (1.5) | - |
a M = mean; SE = standard error.
b Significantly different from cisgender men after post-hoc comparisons using the Tukey HSD test.
c Significantly different from cisgender women after post-hoc comparisons using the Tukey HSD test.
d Significantly different from gender diverse people after post-hoc comparisons using the Tukey HSD test.
e,f,g Homogeneous subsets after chi-squared tests and comparison of column proportions, using the Bonferroni correction.
h Sexual orientation was assessed using the Kinsey Scale, including 1 (exclusively heterosexual), 2 (predominantly heterosexual, only incidentally homosexual), 3 (predominantly heterosexual, more than incidentally homosexual), 4 (bisexual or pansexual), 5 (predominantly homosexual, more than incidentally heterosexual), 6 (predominantly homosexual, only incidentally heterosexual), 7 (exclusively homosexual), and 8 (asexual spectrum). Participants who identified as 1 or 2 were classified as heterosexual; participants who identified as 3, 4, 5, 6, 7, or 8 were classified as non-heterosexual.
Participants (N = 222) were divided into three groups, two of which were further split, yielding five sub-groups (see Figure 1). The first group (n = 82) comprised two sub-groups: (1) heterosexual cisgender men (n = 46) and (2) heterosexual cisgender women (n = 36). The second group represented sexual diversity (people who do not identify as exclusively heterosexual) and also comprised two sub-groups: (1) cisgender non-heterosexual men (n = 36) and (2) cisgender non-heterosexual women (n = 38). The third and last group represented gender diversity (e.g., trans men, trans women, non-binary, gender fluid, queer, and other identities; n = 66). Figure 1 shows these groups separated according to the main variables of interest (birth-assigned sex, gender identity, and sexual orientation).
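The grouping rule described above can be sketched in a few lines of Python. This is only an illustration: the function name and the string labels are hypothetical and are not taken from the study materials.

```python
def assign_subgroup(gender_identity: str, orientation: str) -> str:
    """Return one of the five sub-groups described in the text.

    The labels below are hypothetical placeholders chosen for this sketch:
    gender_identity is "cisgender man", "cisgender woman", or "gender diverse";
    orientation is "heterosexual" or "non-heterosexual".
    """
    # Gender diverse participants form a single group regardless of orientation.
    if gender_identity == "gender diverse":
        return "gender diverse"
    if gender_identity not in ("cisgender man", "cisgender woman"):
        raise ValueError(f"unknown gender identity: {gender_identity!r}")
    if orientation not in ("heterosexual", "non-heterosexual"):
        raise ValueError(f"unknown orientation: {orientation!r}")
    # Cisgender participants are split by sexual orientation.
    return f"{orientation} {gender_identity}"


print(assign_subgroup("cisgender woman", "non-heterosexual"))
```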
2.2 Procedures
This study is based on a published protocol paper (Kheloui et al., 2021) (75). Our study was approved by the ethics committee of the Montreal Mental Health University Institute. Interested participants contacted our research team. Following a ten-minute telephone screening interview, participants scheduled an appointment at the Center on Sex*Gender, Allostasis, and Resilience (CESAR), based at the Research Center of the Montreal Mental Health University Institute. The study required one visit lasting between 110 and 150 minutes (M = 128.18, SD = 21.30), during which biopsychosocial variables were collected (see Figure 2). Visits were scheduled in the afternoon, between 12 PM and 5 PM, to control for circadian variations in basal cortisol (M = 14:47 hrs, SD = 88.16 minutes).
Participants were given all necessary information regarding the protocol at the start of the session. Trained testers reiterated that all data would be kept strictly confidential. Upon consent, a first saliva sample was obtained to measure levels of sex hormones, cortisol, and dehydroepiandrosterone (DHEA). Two more samples were obtained: one after the 5th cognitive task (mid-way) and another after the 8th and last cognitive task. Finally, participants completed questionnaires on the Qualtrics online platform. Well-validated questionnaires assessed gender identity, gender roles, and sexual orientation, as well as socioeconomic status, race/ethnicity, menstruation, contraceptive use, substance use, medications, and physical and mental health, all of which can influence performance on cognitive tasks. Participants were compensated $50 (see Figure 2).
2.3 Measures
2.3.1 Biological measures
Participants were asked to produce between 2 mL and 3 mL of saliva in a tube (Salivettes) with the help of a thick straw. A total of three saliva samples were taken at specific times: at the beginning, middle, and end of the testing session (see Figure 2). Our staff immediately transported all three samples to an industrial freezer in our Research Center, where they were kept frozen at -20°C until analysis.
Sterilized 3 mL 12.5 × 71 mm screw cap tubes (VWR®, Item No. 10018-762) were used to collect saliva. In preparation for analyses, frozen samples were thawed to room temperature and centrifuged at 1500 × g for 15 minutes. High-sensitivity enzyme immunoassays were used for cortisol (Salimetrics®, No. 1-3002, sensitivity: 0.012–3 µg/dL), estradiol (Salimetrics®, No. 1-3702, sensitivity: 1–32 pg/mL), progesterone (Salimetrics®, No. 1-1502, sensitivity: 5 pg/mL), and DHEA (Salimetrics®, No. 1-1202, sensitivity: 5 pg/mL). Testosterone was determined by expanded-range enzyme immunoassay (Salimetrics®, No. 1-2402, sensitivity: 1 pg/mL). Inter- and intra-assay coefficients of variation were determined for all five hormones. Assays were run in duplicate and averaged.
Because cortisol and testosterone concentrations declined over the course of the protocol in preliminary analyses, we included all three time points to assess baseline levels and changes over time. For estradiol, progesterone, and DHEA, however, only the second saliva sample, taken shortly after the first hour, was included. This decision followed preliminary analyses of the first ten participants, in whom time effects were observed for testosterone and cortisol but not for the other biomarkers of interest.
Even though this study does not constitute a stress paradigm, circulating cortisol and DHEA concentrations can impact cognitive abilities (76, 77). Moreover, cortisol activity can influence sex hormone secretion and should be considered in studies aiming to better understand sex hormone effects on cognition (43). For the main analyses, cortisol and DHEA were combined into a ratio, which is considered a more accurate and physiological reflection of net cortisol activity (78).
2.3.2 Biological confounders of sex hormones
Several potential confounders of our biological measures were also considered, namely hormonal contraceptive use and hormonal therapy. For analysis purposes, hormonal contraceptive use was coded as a dichotomous variable (0 = absence of hormonal contraceptive, 1 = presence of hormonal contraceptive). Hormonal therapy history, on the other hand, was coded as a three-level continuum (0 = no hormonal therapy, 1 = hormonal therapy, 2 = gender-affirming surgery & hormonal therapy). Finally, participants listed the medications they were taking so that we could control for prescriptions that could modify the secretion and synthesis of sex hormones.
2.3.3 Cognitive measures
Performance on the cognitive tasks presented next is the main dependent variable of this study. This battery of cognitive tests covers several neuropsychological functions, most of which present a sexual polymorphism in performance (Kheloui et al., 2021) (75). Of the eight tasks in this battery, three typically show better performance for men and three typically show better performance for women. The two remaining tasks present no significant sex difference in performance.
2.3.3.1 “Male/men-typed” tasks
Mental rotation skills were measured using the Shepard and Metzler Mental Rotation task (79). Twenty pairs of objects composed of three-dimensionally drawn blocks were presented; participants had to mentally rotate them and indicate whether the two objects were the same or different. Scores could range from 0 to 20, and participants were given a three-minute limit. The reaction time for each item was recorded as additional data. Sex differences in mental rotation have been well documented (80-82), with men outperforming women.
Visuospatial judgement was measured using the 30-item Benton Judgement of Line Orientation task (JLO) (30). Participants were given a booklet containing 5 practice items followed by 30 test items. Each item consisted of two unnumbered angled lines. The task was to indicate which two of the 11 numbered lines on a reference card matched them. Scores could range from 0 to 30. Better performance has been observed among men than among women (31, 32).
The Rey-Osterrieth Complex Figure test (ROCF) measures spatial memory alongside visuospatial constructional ability (83, 84). While sex differences have been reported on this task (with men outperforming women), some studies reported low effect sizes (85, 86). The task was carried out in three phases, starting with a copy of the figure, without a time limit. Once the copy was completed, the experimenter left the room, leaving the participant alone for 3 minutes. The second phase began as soon as the experimenter returned: the figure had to be redrawn from memory, without a time limit. The last phase occurred later in the protocol, about 40 minutes after the second, when the figure was redrawn from memory a second and last time. Scores varied from 0 to 36 (between 0 and 2 points were allocated for each of the 18 items, based on accuracy and location). Immediate and delayed recall scores are frequently used together to observe consolidation in long-term memory (87, 88). Nevertheless, given the strong correlation between those two measures (r = 0.960), we averaged both scores (see Table 2).
2.3.3.2 “Female/women-typed” tasks
Verbal memory is a cognitive domain for which sex differences have been observed (20). This neuropsychological function was measured using the California Verbal Learning Test, Second Edition (CVLT-II) (89). Studies report sex differences on the CVLT, with women generally performing better than men. The task took about 15 minutes and was completed in two phases. Participants were first asked to memorize and recall a list of 16 words read aloud by the experimenter. This sequence was repeated five times with the same list, giving a score ranging from 0 to 80 (five recall trials of 16 words each). Participants then memorized and recalled a second list of 16 words, giving a score from 0 to 16. Next, participants were asked to recall as many words as possible from the first list, once directly after recalling the second list and again 30 minutes later, each recall giving a score from 0 to 16. For simplicity, only the sum of trials 1 to 5 (ranging from 0 to 80) was used in analyses. This decision was based on the following premise: it is the most frequently reported measure in studies using the California Verbal Learning Test, and it provides a reliable index of verbal learning and verbal memory (90, 91).
Semantic verbal fluency was measured using the Controlled Word Association task (92). Participants were asked to generate as many words as they could from a given category. The categories were animals, fruits, and vegetables, with one minute allowed for each. The score was the total number of correct words generated across all three categories. Studies have shown that women outperform men in verbal fluency (93).
The Purdue Pegboard task measures motor skills (94). Sex differences have been observed on this task, with women performing better than men (95, 96). The material consists of a board with two parallel rows of 25 equidistant holes and several dozen pieces of three types. The task involves four different manipulations of these pieces: one with the right hand only, one with the left hand only, one with both hands at the same time, and one in which the three types of pieces are alternated to form an assembly. This cycle of four manipulations was executed three times. Scores for the first three manipulations were the number of pieces placed into the board within 30 seconds. These three scores were summed, and the sum was averaged across the three trials. Assembly scores were obtained by multiplying the number of complete structures built within 60 seconds by four, and were also averaged across the three trials. As with the Rey-Osterrieth Complex Figure, two scores were thus generated from this task. Given the high correlation between these two scores (r = 0.577), we averaged them (see Table 2).
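As an illustration of the Purdue Pegboard scoring described above, the following Python sketch computes the placement score, the assembly score, and their average. The function name and the example counts are hypothetical, and the sketch assumes that the "both hands" manipulation is entered as a single count per trial, which the text does not specify.

```python
from statistics import mean


def pegboard_composite(placement_trials, assembly_trials):
    """Composite Purdue Pegboard score, following the description in the text.

    placement_trials: one (right, left, both) tuple of piece counts per trial.
    assembly_trials: number of complete structures built in each trial.
    """
    # Sum the three placement manipulations, then average across trials.
    placement_score = mean(sum(trial) for trial in placement_trials)
    # Multiply completed structures by four, then average across trials.
    assembly_score = mean(4 * built for built in assembly_trials)
    # The two scores are averaged into a single composite.
    return mean([placement_score, assembly_score])


# Hypothetical counts for one participant over three trials.
placement = [(15, 14, 12), (16, 14, 13), (17, 15, 13)]
assembly = [9, 10, 11]
print(pegboard_composite(placement, assembly))
```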
2.3.3.3 “Neutral-typed” tasks
The last two tasks included in the protocol either showed no significant sex difference or showed a sexual polymorphism that was too inconsistent across studies. These tasks were added to the protocol as control conditions for which no sex differences were expected.
The Digit Span task was chosen as the first, “ice-breaker” task between participants and testers (97). It measures immediate memory and takes approximately 10 minutes to complete. Participants were asked to recall, in the correct order, the sequence of numbers read by the experimenter. The task started with lists of two digits and progressed to lists of ten digits. The score was the number of digits in the longest successfully recalled sequence.
The Five-Point Test was the last task of this protocol and measured figural fluency (98). Normative studies across numerous subpopulations have found no sex differences, although performance varies significantly with education level and age (99). The task lasts two minutes and is completed on a page with 35 identical squares, each containing 5 dots. Participants had to make as many unique drawings as possible, using only straight lines. Scores were integers from 0 to 35, corresponding to the number of correct and unique drawings.
2.3.4 Sociodemographic and psychosocial measures
2.3.4.1 Birth-assigned sex and gender identity
Birth-assigned sex and gender identity were measured using an adapted version of a scale developed by Bauer (100). This questionnaire measures birth-assigned sex with one item and gender identity with two items. The two gender identity items assess the gender identity that the person identifies with the most and lives as in their day-to-day life.
2.3.4.2 Characteristic gender roles
Gender roles were assessed using the Bem Sex Role Inventory – Short Form (BSRI) (101, 102). This questionnaire presents 30 gender-stereotyped traits, and participants rated how well each trait described them on a 7-point Likert scale (1 = never or almost never true, to 7 = always or almost always true). Ten items are considered masculine and ten feminine, alongside ten neutral items that measure social desirability. This short version of the BSRI correlates at 0.90 with the original version, published 7 years earlier (102, 103). Internal consistency was measured for the 10 masculine and 10 feminine items in each of our three gender identity groups: cisgender men, cisgender women, and gender diverse people. Masculinity showed acceptable Cronbach’s alphas (cisgender men: α = 0.79; cisgender women: α = 0.76; gender diverse: α = 0.79). Similarly, femininity showed acceptable Cronbach’s alphas (cisgender men: α = 0.79; cisgender women: α = 0.84; gender diverse: α = 0.74).
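For readers unfamiliar with the internal-consistency statistic reported above, the following Python sketch computes Cronbach's alpha from item-level ratings using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The ratings are invented for illustration (three items instead of the ten in each BSRI subscale); the analyses in this study were run in SPSS, not with this code.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([resp[i] for resp in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(resp) for resp in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 7-point Likert ratings: 4 respondents x 3 items.
ratings = [
    [5, 6, 5],
    [3, 3, 4],
    [6, 7, 6],
    [2, 3, 2],
]
print(round(cronbach_alpha(ratings), 2))
```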
2.3.4.3 Sexual orientation
Sexual orientation was assessed using a modified Kinsey scale (104). This classic scale provides a dimensional measure that goes beyond categorical homosexual-heterosexual responses. Scores range from 0 (exclusively heterosexual) to 6 (exclusively homosexual). To these seven scores, we added a score of 7 for people identifying on the asexuality spectrum. Moreover, people identifying as pansexual were attributed the same score as bisexual individuals. This type of measure allows analyses on a dimensional level (0 to 6) and on a categorical level (heterosexual and non-heterosexual): scores of 0 and 1 formed the "heterosexual" category, while scores from 2 to 7 formed the "non-heterosexual" category.
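The categorical recoding described above can be written as a short Python function. This is a minimal sketch that uses the 0 to 7 coding given in this section; the function name is hypothetical.

```python
def classify_orientation(kinsey_score: int) -> str:
    """Dichotomize a modified Kinsey score (0 to 7) as described in the text."""
    if kinsey_score not in range(8):
        raise ValueError("modified Kinsey score must be an integer from 0 to 7")
    # Scores 0 and 1 form the heterosexual category; 2 to 7 are non-heterosexual.
    return "heterosexual" if kinsey_score <= 1 else "non-heterosexual"


for score in (0, 1, 2, 7):
    print(score, classify_orientation(score))
```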
2.3.4.4 Drug and alcohol use
A short three-item screening questionnaire was designed to measure alcohol, illicit drug (i.e., cocaine, ecstasy, amphetamines), and cannabis consumption. Participants reported the average number of alcoholic beverages consumed per week. Three levels were defined: 1) no alcohol consumption, 2) between one and six beverages a week, and 3) seven or more beverages a week. These categories were chosen according to the Canadian Guidelines for alcohol use disorder (105). Similarly, illicit drug consumption was assessed on a three-level scale: 1) no illicit drug consumption, 2) monthly or annual consumption, and 3) daily or weekly consumption. Finally, cannabis consumption was assessed with the same three-level scale as illicit drug consumption. For statistical analysis purposes, the three consumption behaviours were combined into a single index: the sum of the scores on the three scales (alcohol, cannabis, and illicit drugs), each coded from 0 (no consumption) to 2 (regular consumption). Scores on this index therefore ranged from 0 to 6.
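A minimal Python sketch of this substance use index follows, assuming each of the three scales is coded 0 to 2 as described, so that the sum spans 0 to 6 under that coding. The function names and category labels are hypothetical, and the handling of fractional weekly averages below one drink is an assumption of this sketch.

```python
# Hypothetical labels for the three-level cannabis and illicit drug scales.
LEVELS = {"none": 0, "occasional": 1, "regular": 2}


def alcohol_level(drinks_per_week: float) -> int:
    """Code weekly alcohol use: 0 = none, 1 = up to six drinks, 2 = seven or more."""
    if drinks_per_week < 0:
        raise ValueError("drinks_per_week cannot be negative")
    if drinks_per_week == 0:
        return 0
    # Assumption: any non-zero average below seven falls in the middle level.
    return 1 if drinks_per_week < 7 else 2


def substance_use_index(drinks_per_week: float, cannabis: str, illicit: str) -> int:
    """Sum of the three 0-2 scales (alcohol, cannabis, illicit drugs)."""
    return alcohol_level(drinks_per_week) + LEVELS[cannabis] + LEVELS[illicit]


print(substance_use_index(3, "occasional", "none"))
```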
2.3.4.5 Physical and mental health
Physical general health was assessed with a screening questionnaire. Participants had to indicate which medical illness from the conditions listed applied to their profile in a three-part questionnaire: cardiovascular conditions (e.g., heart attack, hypo/hypertension, and more), neurological conditions (e.g., stroke, epilepsy, and more) and general conditions (e.g., diabetes, sexually transmitted diseases, and more).
Furthermore, mental health was assessed with a list of psychiatric conditions (e.g., depression, bipolar disorder, schizophrenia, and more); participants indicated whether each diagnosis applied to them and whether it was a past or present diagnosis. The same questions were asked about immediate family members (mother, father, brother, sister). Following an indexing approach similar to the one used for substance use, a physical & mental health index was created. This index ranged from 0 to 5 and was the sum of 5 dichotomous scores (whether the participant took medication; had a neurological, cardiovascular, or general health condition; and had a psychiatric history).
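The physical & mental health index can be sketched in Python as a simple count of five yes/no indicators. The function and parameter names are hypothetical; in particular, the text does not state how family history enters the dichotomous psychiatric-history score, so that decision is left to the caller here.

```python
def health_index(medication: bool, neurological: bool, cardiovascular: bool,
                 general: bool, psychiatric_history: bool) -> int:
    """Sum of five dichotomous indicators, giving a score from 0 to 5."""
    indicators = (medication, neurological, cardiovascular, general, psychiatric_history)
    # Each indicator that is present adds one point to the index.
    return sum(1 for present in indicators if present)


# Hypothetical participant: takes medication and has a psychiatric history.
print(health_index(True, False, False, False, True))
```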
2.4 Statistical analyses
Statistical analyses were performed using the Statistical Package for the Social Sciences (SPSS) Version 28 software. An a priori power analysis was conducted using G*Power version 3.1.9.4 to determine the minimum sample size required. To detect sex effects using 9 factors and 5 covariates, while explaining 10% of the variance (19) at 80% power, at a significance criterion of α = 0.05, a sample size of N = 196 was needed. With an addition of two a posteriori covariates, a minimum sample size of N = 207 was required. Our final sample size of N = 222 was therefore adequate.
Preliminary analyses assessed demographics, substance use, and general physical and mental health as a function of gender identity, using analyses of variance (ANOVA) for continuous variables and χ² tests for categorical variables (see Table 1). Post hoc analyses used Tukey’s test. Two correlation matrices using Pearson correlations were produced: one to describe associations between cognitive tasks (Table 2), and one to describe associations between the sex and gender variables of interest (Table 3).
Main analyses were organized in two parts according to our two hypotheses. First, analyses of variance (ANOVA) were conducted to ascertain birth-assigned sex differences on the eight cognitive tasks. Significance was set at α = 0.05, and effect sizes are reported as partial eta squared (η²p). Effect sizes can be interpreted as small (η²p ≅ 0.01), medium (η²p ≅ 0.06), or large (η²p ≅ 0.14) (106). We provide a conversion to Cohen’s d to facilitate interpretation in the discussion section because the SPC literature mainly uses this measure of effect size. Cohen’s d can be interpreted as small (d ≅ 0.2), medium (d ≅ 0.5), or large (d ≅ 0.8) (106). Second, multiple hierarchical regressions were performed, entering the five sex and gender factors previously mentioned in sequence. Hierarchical blocks were designed as follows: (1) birth-assigned sex (coded as male = 0 and female = 1); (2) sex hormones (testosterone, estradiol, and progesterone); (3) gender identity, entered as two dummy variables, one for women (women as referent = 0, those who do not identify as a woman = 1) and one for gender diverse people (gender diverse as referent = 0, cisgender = 1); (4) gender roles (masculinity and femininity scales); (5) sexual orientation (coded as heterosexual = 0 and non-heterosexual = 1); and finally (6) covariates, added last.
Covariates were selected a priori based on literature showing that age (107, 108), language (109), hormone-replacement therapy (110, 111), DHEA/cortisol ratio (112), and contraceptive use (113) affect cognitive abilities. Based on preliminary analyses of group differences, further a posteriori covariates were selected: alcohol and drug use (114, 115) and physical and mental health conditions (116, 117). The ANOVA and regression assumptions of independence of scores, normality, linearity, and homoscedasticity were met following recommendations (118). Independent variables also showed acceptable collinearity according to the variance inflation factor (VIF) test: VIF [1.079, 3.708] (119). All variables included in this model are presented in Table 3.
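To make the categorical group comparisons concrete, the following Python sketch computes a Pearson χ² statistic for the sexual orientation counts reported in Table 1 (heterosexual and non-heterosexual participants among cisgender men, cisgender women, and gender diverse people). It is an illustration only: the analyses in this study were run in SPSS, the function names are hypothetical, and the closed-form p-value used here is valid only for 2 degrees of freedom.

```python
import math


def chi_square(table):
    """Pearson chi-squared statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def p_value_df2(stat):
    """Upper-tail p-value of the chi-squared distribution with exactly 2 df."""
    return math.exp(-stat / 2)


# Rows: cisgender men, cisgender women, gender diverse (counts from Table 1).
# Columns: heterosexual, non-heterosexual.
orientation_counts = [[46, 36], [36, 38], [3, 63]]
stat, df = chi_square(orientation_counts)
print(f"chi2({df}) = {stat:.2f}, p < 0.001: {p_value_df2(stat) < 0.001}")
```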
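The text does not specify which formula was used to convert partial eta squared to Cohen's d. One common conversion for a comparison between two groups of equal size is d = 2 × √(η²p / (1 − η²p)); the Python sketch below uses it as an assumption and shows that it maps the three η²p benchmarks given above onto values close to the corresponding d benchmarks. The function name is hypothetical.

```python
import math


def eta_squared_to_d(eta_sq: float) -> float:
    """Convert (partial) eta squared to Cohen's d, assuming two equal-sized groups."""
    if not 0 <= eta_sq < 1:
        raise ValueError("eta squared must be in [0, 1)")
    # Cohen's f = sqrt(eta_sq / (1 - eta_sq)); for two equal groups, d = 2 * f.
    return 2 * math.sqrt(eta_sq / (1 - eta_sq))


for benchmark in (0.01, 0.06, 0.14):
    print(benchmark, round(eta_squared_to_d(benchmark), 2))
```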