Development of the Paternal Brain in Humans throughout Pregnancy

Abstract Previous studies have demonstrated that paternal caregiving behaviors are reliant on neural pathways similar to those supporting maternal care. Interestingly, a greater variability exists in parental phenotypes in men than in women among individuals and mammalian species. However, less is known about when or how such variability emerges in men. We investigated the longitudinal changes in the neural, hormonal, and psychological bases of expression of paternal caregiving in humans throughout pregnancy and the first 4 months of the postnatal period. We measured oxytocin and testosterone, paternity-related psychological traits, and neural response to infant-interaction videos using fMRI in first-time fathers and childless men at three time points (early to mid-pregnancy, late pregnancy, and postnatal). We found that paternal-specific brain activity in prefrontal areas distinctly develops during middle-to-late pregnancy and is enhanced in the postnatal period. In addition, among fathers, the timing of the development of prefrontal brain activity was associated with specific parenting phenotypes.


INTRODUCTION
Humans are biparental mammals, that is, men and women typically work together to care for their offspring. Whereas the importance of maternal care is often taken for granted, the same cannot always be said about paternal care. Yet, many studies into paternal behaviors, both in animal models and in humans, have shown that the role of fathers is paramount for the optimal development of the infant: An increased involvement of fathers in childrearing augments the mental health and social adaptation of both the child and mother (Lamb, 2010;Feldman, 2000;Williams & Radin, 1999;Leventhal, 1983). With their typically "rougher" (Stgeorge & Freeman, 2017), more stimulating (Feldman, Gordon, Schneiderman, Weisman, & Zagoory-Sharon, 2010), and overall more physically oriented parental style (Braun & Champagne, 2014), fathers engage their infants in ways that are fundamentally distinct from the stimulation from maternal care (Lamb, 2010). This high-arousal, physically focused stimulation helps develop the infant's emotional and cognitive control (Wilson & Prior, 2011;Flanders, Leo, Paquette, Pihl, & Séguin, 2009;Feldman, 2003). Conversely, absence of paternal care is related to a wide range of developmental and psychological problems in the offspring, including increased developmental risks, maladaptive behavior during adolescence, and even poorer adult psychological outcomes (Braun & Champagne, 2014;Grossmann et al., 2002;Phares & Compas, 1992). Thus, involved paternal care is necessary for the proper development of the child.
However, compared with maternal care, paternal caregiving shows considerable variability throughout history and across cultures. Many fathers directly engage in childrearing activities, whereas others hardly do so (Feldman, Braun, & Champagne, 2019;Shackelford & Salmon, 2011;Kramer, 2010;Lamb, 1987). Because paternal caregiving influences the development of the offspring (Sarkadi, Kristiansson, Oberklaid, & Bremberg, 2008), understanding the neurobiological factors that modulate this variability in the parental behaviors of fathers is of great importance.
Importantly, the parental brain is not exclusive to parents. Both nonfathers and fathers display neural activation in these networks in response to infant-interaction stimuli (Diaz-Rojas et al., 2021;Seifritz et al., 2003). However, the fact that differences exist in the patterns of activation between the two groups is more important. Among fathers, the patterns of activation are influenced by active caregiving experiences; thus, primary caregiving fathers (i.e., stay-at-home fathers) exhibited increased activation of the emotional processing network compared with that of secondary caregiving fathers (Abraham et al., 2014). In addition, from an anatomical standpoint, Kim et al. (2014) found that a postnatal decrease in gray matter volume in the orbitofrontal cortex was associated with high levels of father-infant interaction among first-time fathers. Thus, plastic changes at the functional and structural levels in the human parental brain are related to caregiving experiences with infants.
However, less is known about when such paternal phenotypical and biological changes occur. Thus far, very few studies have focused on the development of the human paternal brain from early pregnancy to the postnatal period (Provenzi et al., 2021). Recently, in a cross-sectional study, Diaz-Rojas et al. (2021) found that differences already exist between expectant fathers in early to mid-pregnancy (gestational age [GA] < 30 weeks) and control nonfather men. Using multivoxel pattern analysis (MVPA), they identified subtle differences in areas of the parental brain between nonfathers and first-time fathers in response to infant-interaction stimuli. In particular, differences were found in the mentalizing areas between fathers with and without previous experience of infant caregiving (i.e., infants apart from their own). Considering these findings, it is possible that the variability in the parental brain and phenotypes begins to form during the pregnancy period, long before the father acquires active caregiving experiences with his child. Another possibility is that these different patterns of neural activation between expectant fathers in early pregnancy and childless men are already present even before the pregnancy starts (e.g., these differences could affect or be affected by men's desire for siring offspring). Because of the cross-sectional nature of Diaz-Rojas and colleagues' work, it is not possible to rule out this possibility. Further continual and longitudinal studies are necessary to uncover the changes of the parental brain as early as possible in the progress of pregnancy, and to find the factors explaining the diverse trajectories of the development of paternal phenotypes.
In addition to active caregiving experiences, other factors may modulate such differences. For example, although men are not exposed to radical biological changes compared with those of women during pregnancy and the postnatal period, evidence exists that their hormonal profiles change during this period (Edelstein et al., 2017;Saxbe, Schetter, Simon, Adam, & Shalowitz, 2017;Berg & Wynne-Edwards, 2001). Two of these hormones are oxytocin and testosterone, which play an important role in paternal behavior (Weisman, Zagoory-Sharon, & Feldman, 2014;Gordon, Zagoory-Sharon, Leckman, & Feldman, 2010). Low levels of testosterone throughout pregnancy and during the postnatal period in fathers were associated with high levels of paternal involvement in caregiving (Kuo et al., 2018;Weisman et al., 2014). Moreover, high levels of postnatal oxytocin were related to increased father-infant interaction. In this manner, these hormonal changes may modulate the variability in parental phenotypes among first-time fathers even during pregnancy.
To clarify the dynamism of paternal brain development, that is, the functional changes occurring over the pregnancy period that can be associated with emerging traits of fatherhood, and its relation to other parental factors, such as behavior, traits, and hormones, it is necessary to examine the evolution of the parental brain and these factors continually from the early period of the partner's pregnancy until the postnatal period. In this regard, the current study investigated longitudinal changes in the neural, hormonal, and psychological bases of the expression of human paternal caregiving across pregnancy and after childbirth. Using fMRI, we recorded neural activation in the parental brain in response to infant-interaction stimuli among first-time fathers. The study employed three time points, namely, early to mid-pregnancy (GA: 10-30 weeks), late pregnancy (GA: >30 weeks), and the first 4 months of the postnatal period. We also examined the abovementioned changes among childless men as the control group (CG) during the same time periods to serve as a baseline. Moreover, we observed the levels of two parenting-related hormones, namely, oxytocin and testosterone, and paternity-related psychological and behavioral traits for each time point. We first investigated changes in terms of group differences (first-time fathers and childless men) to reveal whether and how the neural networks of the parenting brain of first-time fathers change during pregnancy. Next, we compared the hormonal, psychological, and behavioral profiles of first-time fathers to investigate which of these factors are related to the development of the paternal brain. Finally, we compared differences in the timing of neural changes in expectant fathers to examine the possible factors that influence the paternal brain from various perspectives.
We demonstrate that across pregnancy, first-time fathers experience a major change in their neural response to infant-interaction stimuli, especially in the mentalizing network compared with control across the same time periods. Moreover, activations in several areas of the paternal brain among postnatal, but not prenatal, fathers are related to emotional attachment to their infants. Finally, the timing of changes in the mentalizing network is related to several phenotypes, such as a father's feeling of attachment toward his own infant and positive outlook toward parenting.

Ethical Considerations
The ethics committee of the Kyoto University Unit for Advanced Study of Mind (30-P-8) approved the experimental protocol and procedures. All participants provided written informed consent before the study and were remunerated equally for their participation after each session.

Participants
We recruited 36 first-time expectant fathers for this study ("papa" group [PG]: mean age = 33.1 ± 5.7 years; mean GA of partner = 20.9 ± 6.2 weeks). This number was based on previous studies using similar sample sizes (Provenzi et al., 2021). We also recruited an additional 36 men without children (CG: mean age = 31.1 ± 6.2 years) who reported no interest in becoming fathers in the short term (within 1 year). The purpose of this CG was to provide a baseline of the neurobiological and behavioral characteristics of men against which changes upon fatherhood could be compared. Both fathers and control men belonged to all socioeconomic status (SES) groups. Approximately 4 months after the first session, the same men were invited for a second (late-pregnancy) session, and approximately 4 months thereafter for a third (after-birth) session. The mean intervals (in days) between first and late-pregnancy sessions were 99.18 ± 36.24 and 125.56 ± 23.53 for PG and CG, respectively. The mean intervals between the first and after-birth sessions were 250.94 ± 38.25 and 250.00 ± 27.44 for PG and CG, respectively. In the PG, the mean interval between childbirth and the third session was 119.7 ± 20.3 days. Data for one PG participant at Session 1 were treated as late-pregnancy data, because they were obtained when the GA of the participant's partner was 39 weeks (i.e., the participant was missing the data for Session 1; interval between recordings = 132 days). One participant in PG did not participate in the third session (after-birth). Moreover, we excluded one PG participant from all analyses because he lacked a late-pregnancy session and attended the after-birth session approximately 9 months after birth (250 days), which was considerably later than the rest of the fathers. In the CG, one participant was excluded from all analyses because of the inability to complete the fMRI recording session. Another participant was excluded because he completed only the first recording session. Finally, data for one participant in the CG during the third session could not be recorded because of machine error. Moreover, we discarded the fMRI data of one PG participant (first session) and one CG participant (second session) because of high movement artifacts. The final sample is as follows: for the first session (early pregnancy), PG = 33, CG = 34; for the second session (late-pregnancy), PG = 35, CG = 33; and for the third session (after-birth), PG = 34, CG = 33.
For the clustering analysis (see subsection "Exploratory Analysis of the Individual Differences in the Paternal Brain" below), only participants with three data points were considered. For this analysis, the sample comprised 32 PG participants and 32 CG participants.
All participants from the PG and six from the CG were married or living with partners. Their marital status remained the same across the duration of the project. All participants were of East Asian ethnicity (Japanese: n = 70; Korean: n = 2), had normal or corrected-to-normal vision, and were all right-handed. No participants reported any major medical illnesses, major psychiatric disorders, or neurological illnesses.

Behavioral Characteristics
For each recording session, we collected the following demographic data: (a) SES, including household income, educational background and history, and number of family members in the household, and (b) weekly work/study time.
In addition, we collected the following psychological characteristics: (c) State-Trait Anxiety Index using the State-Trait Anxiety Inventory (STAI-Trait; Spielberger, 2010) and (d) depression index using Beck's Depression Inventory (BDI; Zich, Attkisson, & Greenfield, 1990). These two psychological traits have been implicated in the heterogeneity of parenting behaviors in mothers and fathers (Wee, Skouteris, Pier, Richardson, & Milgrom, 2011), as demonstrated by a decrease in positive parenting style and an increase in negative parenting style, respectively (Wilson & Durbin, 2010).
Furthermore, we considered the following behavioral data related to family and parenting: (e) positive and (f) negative attitudes toward parenting (Inori & Kato, 2011), including 12 items for a positive image (e.g., parenting involves pleasure) and 15 items for a negative image (e.g., parenting is difficult). Each item was scored using a 5-point Likert scale. Next, we measured (g) partner relationships using the Dyadic Adjustment Scale (Spanier, 1976;DAS), which was only answered by married participants and those living with romantic partners, with 32 items for assessing marital relationship quality. (h) Fetal-paternal attachment was measured only in PG participants using the Paternal Antenatal Attachment Scale (Condon, 1993), which assesses the strength of attachment using 16 items. Each item was rated using a 5-point Likert-type scale. The scales from (e) to (h) provided scores consistent with those of previous studies. Ceiling/floor effects were observed for several items on the (e), (f), and (h) scales; an effect was defined as the mean score ± 1 SD exceeding the range of possible values. Items indicating a ceiling/floor effect were excluded, and mean scores were used for analysis. To compare the prenatal fetal-paternal attachment and postnatal infant attachment, (h) scores were normalized to the 0-1 range. For the (g) scale (DAS), two items in the scale were rated as 0 or 1, which is in contrast with the rest (i.e., using a 5-, 6-, or 7-point Likert-type scale). Thus, these two items (i.e., "feeling of exhaustion when thinking about intercourse with partner" and "expressions of affection") were excluded from the overall DAS score to avoid uneven effects.
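As an illustration of the item-screening rule and the 0-1 rescaling described above, the following Python sketch flags an item when its mean ± 1 SD leaves the range of possible values. The function names, the use of the sample SD, the min-max form of the normalization, and the example responses are assumptions made for illustration; they are not taken from the original analysis.

```python
from statistics import mean, stdev


def has_ceiling_or_floor(scores, low, high):
    """Flag an item whose mean +/- 1 SD falls outside the possible range.

    Assumption: the sample SD is used; the source does not say which SD.
    """
    m, sd = mean(scores), stdev(scores)
    return (m + sd > high) or (m - sd < low)


def min_max_normalize(score, low, high):
    """Rescale a score from [low, high] to the 0-1 range (assumed min-max form)."""
    return (score - low) / (high - low)


# Hypothetical responses to two 5-point Likert items (illustrative values only).
item_a = [5, 5, 4, 5, 5, 4, 5, 5]  # clustered at the top of the scale
item_b = [2, 3, 4, 3, 2, 4, 3, 3]  # spread around the midpoint

flags = {
    "item_a": has_ceiling_or_floor(item_a, 1, 5),
    "item_b": has_ceiling_or_floor(item_b, 1, 5),
}
print(flags)
print(min_max_normalize(3, 1, 5))
```

Under this rule, an item such as the hypothetical `item_a` would be dropped before the scale mean is computed, whereas `item_b` would be retained.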
Finally, we performed additional surveys only in postnatal fathers to determine paternal behavior. We measured fathers' development and postnatal infant attachment (normalized to the 0-1 range), paternal involvement in caregiving, time together, and times at papa school. Fathers' development, which is based on Morishita (2006), measures psychological changes in men through the experiences of parenting (e.g., considering the feelings of parents with children) with 20 items rated using a 5-point Likert-type scale. The postnatal infant attachment scale (Condon, Corkindale, & Boyce, 2008) was used to measure the strength of attachment with 19 items, which were rated using a 5-point Likert-type scale. Paternal involvement in caregiving was developed based on Abraham et al. (2014), which measures the frequency of fathers' performance of 15 daily parenting activities (e.g., bathing children, changing diapers, and feeding babies). Each item was rated using a 5-point Likert scale. We also recorded the time that the participants spent alone with infants in hours per week and the number of times the participants attended parent-child classes at hospitals or municipalities during their partners' pregnancy. These frequencies were respectively designated as time together and times at papa school. Moreover, we measured the degree of household chores (e.g., cooking meals, doing the laundry, and cleaning the house) performed, which was developed based on Volling and Belsky (1992) and Edelstein et al. (2017). Eight items were scored on a 5-point Likert-type scale ranging from 1 (always the wife) to 5 (always the husband). Table 1 provides a summary of the behavioral data. Statistical differences between groups and sessions were assessed using an n-way ANOVA. Several behavioral traits displayed group differences, such as trait anxiety (F = 11.8, p = .01, df = 1), weekly worktime (F = 115.7, p < .001, df = 1), household income (F = 133.8, p < .001, df = 1), and positive attitude toward parenting (F = 24.7, p < .001, df = 1). Despite these group differences, controlling for any of the covariates did not significantly change the differences in neural activation across the abovementioned sessions. Several traits exhibited a large correlation with one another (Figure 1). To reduce the number of tests performed across the study and to avoid confusion in our conclusions because of similarity among scores, we excluded a few of the measurements. In particular, fathers' development displayed a high correlation with several other parent-related traits, such as postnatal infant attachment and positive attitude toward parenting and, to a lesser extent, time together and paternal involvement. For this reason, we excluded this trait from analyses, as its inclusion would not provide any useful insights for the study objectives. The depression score, BDI, showed a large correlation with trait anxiety. Thus, it was also excluded from the analysis. State anxiety was likewise excluded, because the measurement itself provides no relevant information regarding parenting. Instead, it reflects a number of factors, such as postrecording stress or anxiety, and may disrupt our analysis.

Data Collection and Analysis for Hormonal Levels
The participants rinsed their mouths before saliva collection and, after 10 min, chewed on an oral cotton swab (Salimetrics, State College), which was then placed sublingually for 3 min. This process was repeated to obtain a backup sample. The first samples were used for the calculation of the hormonal levels of each participant. If the first sample contained insufficient saliva specimen or if the hormonal levels of the first sample were outliers, then the value of the second sample was used as the individual value. The samples were stored at −80°C until assayed. Salivary oxytocin was assayed using a commercial kit (ADI-900-153A-0001; ENZO Co. Ltd.) following the manufacturer's protocol. In summary, 240 μL of saliva was dried using a Speedvac evaporator at room temperature for 3 hr and reconstituted in 240 μL of an assay buffer, of which 100 μL was used for the assay. Salivary testosterone was measured by ELISA following our previous reports (Aoki, Shimozuru, Kikusui, Takeuchi, & Mori, 2010;Shimozuru, Kikusui, Takeuchi, & Mori, 2008). For this assay, 25 μL of saliva was used with testosterone-3-CMO-HRP (FKA101; COSMO Bio) and a specific antitestosterone serum (FKA102-E; COSMO Bio). All intra- and inter-assay coefficients of variation were less than 15%. We could not detect oxytocin from the samples of two participants from the PG and two participants from the CG. As such, these participants were excluded from the oxytocin analysis.
For the analysis of hormonal data, we addressed three nuisance effects, namely, time of day at measurement, seasonal changes, and storage time. The first two refer to the fact that levels of oxytocin and testosterone vary across the day and across seasons. To minimize the first confounding effect, all experiments were scheduled in the morning between 8:45 a.m. and 10:45 a.m. However, minimizing the seasonal effect at the experimental level is complex. The final nuisance effect, which may influence data quality, is the time of storage of the saliva samples before processing. Although we intended to analyze the data as soon as possible, we cannot overlook the possibility that a certain level of nuisance effect was included. Thus, to remove the three nuisance effects, we created individual linear models for each hormone using the three parameters as predictors and the raw hormonal measurement as the outcome. Lastly, we obtained the residuals of the model. Across this work, the residuals were used as the values for the hormones instead of raw data. Statistical differences between groups and sessions were assessed using an n-way ANOVA (Table 1). For PG and CG from the first session to the second and third sessions, we observed a significant decline in oxytocin levels (F = 9.15, p = .002) and a marginally significant increase in testosterone levels (F = 5.21, p = .09). However, when residualizing according to the time of day at acquisition, day of the year, and time of storage before processing, these effects disappeared (session effect on testosterone: F = 2.51, p = 1) or became weaker (session effect on oxytocin: F = 7.5, p = .01).
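The residualization step can be sketched as an ordinary least-squares fit of each raw hormone value on the three nuisance variables, keeping the residuals. The sketch below uses only the Python standard library; the function names, the inclusion of an intercept, the coding of the predictors (hour of day, day of year, days in storage), and all numbers are illustrative assumptions, not the original analysis code or data.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def residualize(y, predictors):
    """Residuals of y after OLS on an intercept plus the given predictors."""
    rows = [[1.0] + list(p) for p in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]


# Hypothetical values for eight saliva samples (not study data).
time_of_day = [8.75, 9.0, 9.5, 10.0, 10.5, 10.75, 9.25, 10.25]  # hours
day_of_year = [15, 80, 150, 220, 290, 340, 45, 200]
storage_days = [30, 12, 45, 60, 20, 75, 50, 10]
oxytocin_raw = [22.1, 18.4, 25.3, 30.2, 19.8, 27.5, 21.0, 24.6]

oxytocin_resid = residualize(
    oxytocin_raw, [time_of_day, day_of_year, storage_days]
)
print([round(r, 3) for r in oxytocin_resid])
```

Because the model includes an intercept, the residuals are centered on zero, so they carry between-sample differences but not absolute hormone concentrations.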

Task Protocol
We followed the protocol outlined in Diaz-Rojas et al. (2021) for fMRI data acquisition. In summary, inside the MRI scanner, participants were shown silent videos of a male model performing actions from the first-person perspective with a duration of 30 sec (Figure 2) and were instructed to observe the movements of the model's hands. The videos were two infant-interaction videos (S1: playing with an infant; S2: changing diapers) and two control videos without infant interaction (C1: opening a box and removing a tripod from it; C2: wrapping a box with plastic). The male model and the infant were a father-child dyad of East Asian ethnicity and were unrelated to any of the participants. The control videos were performed by the same male model and were designed to roughly match the movements in the corresponding infant-interaction videos. In the infant-interaction videos, the frame was set to exclude the infant's eyes, because previous studies demonstrated that the facial features of infants may evoke different responses in men (i.e., own vs. other child [Mascaro, Hackett, & Rilling, 2013;Kuo, Carp, Light, & Grewen, 2012], with an increased response owing to facial resemblance [Platek, Keenan, & Mohamed, 2005;Platek et al., 2004]). We therefore opted to minimize these confounding effects in our participants. The control videos were then framed accordingly. The experiment followed a block design, where each video was presented only once per run followed by 30 sec of rest (static gray screen). The order of presentation was pseudorandomized, with the caveat that each infant-interaction video and its respective control video were always presented consecutively (e.g., one run would be S1-C1-S2-C2, whereas the other was S2-C2-C1-S1). Each participant completed two runs of this task. Inside the scanner, the participants also performed an auditory task immediately after the two video runs. Data for this task were used for another experiment and were, thus, excluded from the present study.
Figure 1 (caption fragment). Top triangular elements of each matrix refer to data from the CG, whereas the lower triangular elements refer to data from the father's group (PG). p Values for each coefficient were calculated, and those with p > .001 were zeroed out.
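The ordering constraint of the task protocol (each infant-interaction video adjacent to its matched control) can be illustrated with a short Python sketch. The exact pseudorandomization procedure is not described in the text, so the scheme below, which shuffles the two pairs and then the order within each pair, is only one assumed way to satisfy the constraint; the function names are hypothetical.

```python
import random

# Each infant-interaction video with its matched control video.
PAIRS = [("S1", "C1"), ("S2", "C2")]


def make_run_order(rng):
    """Return one run order in which matched videos are always adjacent."""
    pairs = [list(p) for p in PAIRS]
    rng.shuffle(pairs)          # which pair comes first
    order = []
    for pair in pairs:
        rng.shuffle(pair)       # stimulus-first or control-first within a pair
        order.extend(pair)
    return order


def pairs_are_adjacent(order):
    """Check that every stimulus sits next to its own control video."""
    return all(abs(order.index(s) - order.index(c)) == 1 for s, c in PAIRS)


rng = random.Random(2021)       # fixed seed so the example is reproducible
runs = [make_run_order(rng) for _ in range(2)]
for run in runs:
    print("-".join(run), pairs_are_adjacent(run))
```

Both example orders given in the text (S1-C1-S2-C2 and S2-C2-C1-S1) are reachable under this scheme.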

Emotional Rating of Stimulus
At the end of Session 3, all participants rated each stimulus using a 7-point Likert-type scale for two attributes, namely, emotional valence and arousal. The items for emotional valence range from strong feelings of pleasantness to strong feelings of unpleasantness, whereas those for arousal range from very excited to very calm. There were no differences in the emotional valence or arousal rating between groups (Figure 3). N-way ANOVA revealed a significant effect in the valence rating because of the type of stimulus (infant > control video, F = 140.19, p < .0001), and a smaller effect because of stimulus context (S1 or C1 vs. S2 or C2, F = 5.25, p = .01). For arousal ratings, n-way ANOVA showed a significant group effect (papa vs. control, F = 5.15, p = .02), and an effect because of the type of stimulus (infant vs. control video, F = 25, p = .0004).
Figure 2. fMRI experimental design: An example of a typical run inside the MRI scanner. The participants were instructed to view the alternating videos for infant interaction or matched control. Videos were played for 30 sec with 30 sec of rest. The order of presentation was pseudorandom.
Figure 3. Emotional valence and arousal of the video stimuli used. After the third session of recording, participants saw once again each of the stimulus videos used in the protocol, and rated them in two dimensions of emotion: valence (positive-negative; left graph) and arousal (high-low; right graph). N-way ANOVA revealed a significant effect in the valence rating because of the type of stimulus (infant or control video, F = 140.19, p < .0001), and an effect because of stimulus context (S1 or C1 vs. S2 or C2, F = 5.25, p = .01). For arousal ratings, N-way ANOVA showed a significant group effect (papa or control, F = 5.15, p = .02), and an effect because of the type of stimulus (infant or control video, F = 25, p = .0004). S1 = infant video (interaction on knees); S2 = infant video (changing diaper); C1 = control video (opening box); C2 = control video (wrapping box). Bars denote mean; error bars denote SEM.
Preprocessing of fMRI data was conducted using MATLAB R2018b (MathWorks) and the SPM software package (SPM12 v7487, https://www.fil.ion.ucl.ac.uk/spm/). The first eight volumes for each run were discarded to allow for signal stabilization. Functional images were corrected for acquisition time, resliced, realigned, and normalized to match the Montreal Neurological Institute (MNI) template brain and were spatially smoothed using a Gaussian kernel with a FWHM of 4 mm. Participants with movement artifacts greater than 3 mm within the runs were discarded (n = 2; one PG participant for the first session and one CG participant for the second session).
We modeled the response of each participant to the stimuli using a general linear model in which the stimulus blocks were defined as predictors and convolved with the standardized model of hemodynamic response function, and the head motion parameters as nuisance factors. For analysis, we defined three contrasts, namely, S1-C1, S2-C2, and the overall S-C (a combination of S1 and S2 and C1 and C2). However, to avoid the unnecessary increase in the number of tests, we focused mainly on the S-C contrast and used the two other contrasts for exploratory analyses.
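To make the construction of the block predictors concrete, the sketch below builds a boxcar for one 30-sec video block and convolves it with a double-gamma hemodynamic response function. This is a coarse illustration rather than the SPM implementation: the repetition time (2 sec), the block onsets, the run length, sampling the response only once per scan, and the function names are all assumptions, because the acquisition parameters are not given in this section.

```python
import math


def double_gamma_hrf(t):
    """Double-gamma HRF at time t (sec): gamma(shape 6) minus gamma(shape 16) / 6."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0


def block_regressor(onsets, duration, n_scans, tr):
    """Boxcar for the given block onsets convolved with the HRF, one value per scan."""
    boxcar = [
        1.0 if any(on <= i * tr < on + duration for on in onsets) else 0.0
        for i in range(n_scans)
    ]
    kernel = [double_gamma_hrf(j * tr) for j in range(int(32 / tr) + 1)]
    return [
        sum(boxcar[i - j] * kernel[j] for j in range(min(i + 1, len(kernel))))
        for i in range(n_scans)
    ]


TR = 2.0        # assumed repetition time in seconds (not stated in the text)
N_SCANS = 120   # assumed: 4 videos x (30 sec video + 30 sec rest) / TR

# Hypothetical run order S1-C1-S2-C2, one block every 60 sec.
s1 = block_regressor([0.0], 30.0, N_SCANS, TR)
c1 = block_regressor([60.0], 30.0, N_SCANS, TR)

# The S1-C1 contrast is then the difference of the fitted weights of these
# two predictors, e.g. the contrast vector [1, -1] over (beta_S1, beta_C1).
print(round(max(s1), 3), round(max(c1), 3))
```

In a full model, the four video predictors and the head motion parameters would enter the same design matrix, and the S-C contrast would weight both infant-interaction predictors positively and both control predictors negatively.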

fMRI Whole-brain Analysis
Whole-brain analysis of the interaction between group and session effects was performed using a single 2 × 3 flexible factorial model (Group × Session), in which each participant served as a random effect. Model parameters: six conditions, zero covariates/nuisance, 69 blocks (34 CG participants, 35 PG participants), having 73 effective degrees of freedom, 129 degrees of freedom left from 202 images. Data were drawn from the contrast map for S-C for each session and for all participants. To verify the cross-sectional and longitudinal effects as well as intra- and inter-group effects, we performed the following analyses using individual t and F tests.

Cross-sectional Analysis
For each session (Session 2 and 3), we conducted t tests for the null hypothesis that the differences observed in a given voxel cluster between the PG and CG would be negligible. Two t tests were conducted for each session, one for the possibility that PG > CG and one for the opposite.

Intragroup Longitudinal Analysis
Intragroup longitudinal comparisons (i.e., session effects) were conducted using an F test for the overall session effect for each group. We also conducted t tests for each pair of sessions (e.g., PG Session 3 data vs. Session 1 data) for the null hypothesis that the differences observed in a given voxel cluster were negligible. In addition, to test the overall development from pregnancy to postnatal period, we added another t test for the Session 3 versus the average of Sessions 1 and 2, for both groups.

Intergroup Longitudinal Analysis
Intergroup longitudinal comparisons (i.e., interaction group-session effects) were conducted using an F test to determine the overall group-session effect. We also conducted t tests between the two groups for each pair of sessions (e.g., PG Session 3 data vs. Session 1 data > CG Session 3 data vs. Session 1 data) for the null hypothesis that the differences observed in a given voxel cluster were negligible. Similar to the intragroup analysis, we also tested the overall development from pregnancy to postnatal period in the PG compared with the CG by adding an additional t test for the Session 3 versus the average of Sessions 1 and 2 between the two groups.
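The cross-sectional, intragroup, and intergroup t tests described in these subsections correspond to weight vectors over the six Group × Session cells. The sketch below writes them out in Python; the cell ordering, the helper names, and the example cell means are assumptions for illustration, and the participant columns of the flexible factorial design are omitted.

```python
SESSIONS = 3
# Assumed cell order; the actual column order of the design matrix may differ.
CELLS = ["CG_S1", "CG_S2", "CG_S3", "PG_S1", "PG_S2", "PG_S3"]


def within_group(session_weights, group):
    """Session contrast applied to one group only (intragroup test)."""
    w = [float(x) for x in session_weights]
    zeros = [0.0] * SESSIONS
    return w + zeros if group == "CG" else zeros + w


def pg_minus_cg(session_weights):
    """Same session weights, PG minus CG (cross-sectional or interaction test)."""
    w = [float(x) for x in session_weights]
    return [-x for x in w] + w


def contrast_value(weights, cell_means):
    """Weighted sum of cell means for a given contrast."""
    return sum(w * m for w, m in zip(weights, cell_means))


# Cross-sectional: PG > CG at Session 2.
pg_gt_cg_s2 = pg_minus_cg([0, 1, 0])
# Intragroup: PG Session 3 vs. Session 1.
pg_s3_vs_s1 = within_group([-1, 0, 1], "PG")
# Intragroup: PG Session 3 vs. the average of Sessions 1 and 2.
pg_post_vs_preg = within_group([-0.5, -0.5, 1], "PG")
# Intergroup: the same pregnancy-to-postnatal change, PG > CG.
interaction = pg_minus_cg([-0.5, -0.5, 1])

# Hypothetical cell means in which only postnatal fathers show an increase.
example_means = [1.0, 1.0, 1.0, 1.0, 1.0, 3.0]
for name, w in [("PG>CG S2", pg_gt_cg_s2), ("PG S3-S1", pg_s3_vs_s1),
                ("PG S3-mean(S1,S2)", pg_post_vs_preg),
                ("Group x Session", interaction)]:
    print(name, w, contrast_value(w, example_means))
```

Each vector sums to zero, so a contrast is nonzero only when the cells it compares actually differ; swapping the signs gives the opposite-direction test (e.g., CG > PG).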

Whole-brain Regression Analysis
To determine whether the observed neural activation in the PG and CG was related to any of the behavioral or hormonal measurements (covariates) obtained from the participants, we used whole-brain individual regression models with the S-C contrast of each group and session against each of the covariates. Diaz-Rojas et al. (2021) presented the data for Session 1; thus, we did not report these data in the current article.

Statistics for Whole-brain Analysis
For all whole-brain analyses, statistical maps were assessed at either (a) a FWE-corrected threshold of p < .05 at the voxel level or (b) an uncorrected threshold of p < .001 at the voxel level (i.e., cluster-forming threshold), in which case clusters were considered significant if they passed a cluster-level threshold of p < .05 after FWE correction. All significant voxels and clusters are reported. Using the WFU Pick Atlas toolbox for MATLAB (Maldjian, Laurienti, & Burdette, 2004;Maldjian, Laurienti, Kraft, & Burdette, 2003) and the AAL atlas of the human brain (Tzourio-Mazoyer et al., 2002), we matched the surviving clusters to known anatomical areas.

MVPA
MVPA was implemented using libsvm (Chang & Lin, 2011). The objective of this analysis was to examine whether there were smaller changes between Sessions 1 and 2 in the fathers' group that were not reflected in the wholebrain analysis. We first extracted the nonaveraged and nonspatially smoothed voxel data for each participant for the S-C contrast from the left dorsomedial prefrontal cortex (dmPFC). We focused on this ROI, because this area displayed the most consistent changes from pregnancy to postnatal period. To reduce bias, we defined this area anatomically using the left frontal superior medial area of the AAL atlas. Because of the large number of features (i.e., voxels) in this anatomical area, we performed feature selection to improve the performance of the analysis (Mahmoudi, Takerkart, Regragui, Boussaoud, & Brovelli, 2012). Feature selection was conducted using the one-way ANOVA F values ranking of each voxel, with the target labels of the training set (group or session) as the levels, selecting the top 125 voxels (equivalent to a 5 × 5 × 5 voxels cube) with the highest F values. Using these data, we trained a support vector machine (SVM) classifier with a linear kernel to perform a binary classification of a target label. These labels were of two kinds: session (early pregnancy, late pregnancy, or postnatal), and paternity status (same as group, either PG or CG). To validate the model, we performed leave-one-out cross-validation. In other words, we trained a model using voxel data from all participants included in the analysis except for one and used the excluded participant as test data. This procedure was repeated for each participant until all participants served once as test data. For each fold, feature selection was carried out independently using only the data from the training set. Then, we averaged the prediction accuracy for each of the test data to obtain the mean prediction accuracy of the classifier for that analysis. 
We also displayed the 95% confidence interval, which was calculated using the normal approximation method for the binomial confidence interval. High accuracy implied that the model could separate multivoxel data between participants belonging to one group/session from the other. In this manner, we established a relationship between neural activation patterns and the target label. Significance was assessed with an upper-tailed binomial test for the null hypothesis that the resultant classification accuracy emerged from a binomial distribution with a mean equal to the chance level (50%). In addition, we calculated the true positive rate (TPR) and true negative rate (TNR) for each ROI-target label pair.
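The summary statistics named above are simple to compute. The following sketch, using only the Python standard library, shows the normal-approximation (Wald) confidence interval, the exact upper-tailed binomial test against chance, and the TPR/TNR. The function names and the example counts (45 correct out of 68 predictions) are hypothetical and chosen only for illustration; they are not taken from the study's data.

```python
from math import comb, sqrt


def accuracy_ci(n_correct, n_total, z=1.96):
    """Accuracy and half-width of the normal-approximation (Wald) 95% CI."""
    p_hat = n_correct / n_total
    return p_hat, z * sqrt(p_hat * (1 - p_hat) / n_total)


def upper_tail_binomial_p(n_correct, n_total, chance=0.5):
    """P(X >= n_correct) for X ~ Binomial(n_total, chance)."""
    return sum(
        comb(n_total, k) * chance ** k * (1 - chance) ** (n_total - k)
        for k in range(n_correct, n_total + 1)
    )


def tpr_tnr(y_true, y_pred, positive=1):
    """True positive rate and true negative rate for binary labels."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    n_pos = sum(t == positive for t in y_true)
    n_neg = len(y_true) - n_pos
    return tp / n_pos, tn / n_neg


if __name__ == "__main__":
    # Hypothetical example: 45 of 68 test predictions correct.
    acc, half_width = accuracy_ci(45, 68)
    p_value = upper_tail_binomial_p(45, 68)
    print(f"{100 * acc:.2f}% ± {100 * half_width:.2f}, p = {p_value:.4f}")
```

The test is one-sided because only accuracies above chance are of interest; an accuracy well below 50% would not count as evidence that the classifier separates the labels.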

Exploratory Analysis of the Individual Differences in the Paternal Brain
Clustering analysis was conducted using a k-means algorithm and the squared Euclidean distance as the distance metric. We set the number of groups as two (k = 2). We intended to group the participants based on the development of the paternal brain across pregnancy. Thus, we selected the percentage change between adjacent sessions as data for clustering (i.e., change from Session 1 to 2, and from Session 2 to 3). Next, we took the absolute values of these percentage changes. Negative or positive changes cannot be easily interpreted as improvements or regressions in the paternal brain (e.g., a negative change could be associated with decreased response because of familiarity with infant-interaction schemes or with perceived nonfamiliarity with the shown infant, which triggers own-child/other-child effects). Thus, we focused solely on change as a metric for the development of the parental brain. However, measuring the percentage change may frequently lead to extremely large values, because fMRI data may have very small values. To account for this phenomenon, we took the base 10 logarithm of the absolute percentage change and then clipped any outlier values to the outlier threshold (outliers were defined as values more than three scaled median absolute deviations away from the median). We focused on the neural data from the left dmPFC, which was the brain region that exhibited greater changes from pregnancy to the postnatal period. This analysis was conducted only with data from the PG.
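The transformation described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the percentage change is assumed to be computed relative to the earlier session, the scaling constant 1.4826 is the usual factor that makes the median absolute deviation consistent with the standard deviation of a normal distribution, and the function names are hypothetical.

```python
from math import log10
from statistics import median


def log_abs_pct_change(before, after):
    """log10 of the absolute percentage change from `before` to `after`.

    Assumes `before` is nonzero and the two values differ; otherwise the
    ratio or the logarithm is undefined.
    """
    return log10(abs((after - before) / before * 100.0))


def clip_outliers(values, n_mads=3.0, scale=1.4826):
    """Clip values lying more than n_mads scaled MADs from the median."""
    med = median(values)
    scaled_mad = scale * median(abs(v - med) for v in values)
    lo, hi = med - n_mads * scaled_mad, med + n_mads * scaled_mad
    return [min(max(v, lo), hi) for v in values]


if __name__ == "__main__":
    # Hypothetical contrast values for one participant in two sessions.
    print(log_abs_pct_change(0.02, 0.5))
    print(clip_outliers([1.0, 2.0, 3.0, 4.0, 100.0]))
```

Clipping, as opposed to discarding, keeps every participant in the clustering while limiting the pull that a single extreme value can exert on the cluster centers.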
The k-means algorithm is sensitive to its initial conditions, including the seed of the random number generator. To address this issue and validate the findings, we performed the following procedure. We excluded four random participants from the set (12.5% of the total set) and performed k-means clustering with the remaining data. We noted the cluster centers and repeated this procedure 10,000 times. Finally, we averaged the cluster centers over the iterations and performed one final clustering using these centers and the entire data set. For the final clustering, each data point was assigned to the group whose center was closest in squared Euclidean distance, without updating the positions of the cluster centers (in contrast to the standard k-means algorithm, which updates them iteratively).
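A compact sketch of this resampling procedure is given below, again using only the Python standard library. Several details are assumptions made for illustration: the text does not state how clusters were matched across repetitions, so here the centers of each run are sorted by their first coordinate before averaging; the helper names are hypothetical; and the demonstration uses made-up two-dimensional points and far fewer repetitions than the 10,000 reported.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
            for p in points]


def kmeans(points, k, rng, n_iter=100):
    """Plain Lloyd's k-means; returns centers sorted by first coordinate."""
    centers = rng.sample(points, k)
    for _ in range(n_iter):
        labels = assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers)  # assumption: align clusters across runs by sorting


def averaged_centers(points, k, n_leave_out, n_repeats, seed=0):
    """Average the k-means centers over repeated random subsamples."""
    rng = random.Random(seed)
    sums = [[0.0] * len(points[0]) for _ in range(k)]
    for _ in range(n_repeats):
        subset = rng.sample(points, len(points) - n_leave_out)
        for c, center in enumerate(kmeans(subset, k, rng)):
            for d, value in enumerate(center):
                sums[c][d] += value
    return [tuple(v / n_repeats for v in s) for s in sums]


if __name__ == "__main__":
    # Made-up (change S1->S2, change S2->S3) pairs, one tuple per participant.
    pts = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.0, 0.3), (0.3, 0.0),
           (10.0, 10.0), (10.1, 10.2), (10.2, 9.9)]
    centers = averaged_centers(pts, k=2, n_leave_out=1, n_repeats=200)
    print(centers)
    print(assign(pts, centers))  # final assignment, centers held fixed
```

The final call to `assign` corresponds to the last step in the text: group membership is decided by distance to the averaged centers alone, with no further center updates.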
After dividing the participants into two groups, we calculated the mean of each behavioral and hormonal covariate at each time point. Statistical analysis at the group and session levels was performed using an n-way ANOVA. Behavioral data that remained stable across sessions (i.e., age, years of education, weekly worktime, and household income) were taken from Session 3. We repeated the above procedures with k = 3 and k = 4 groups to confirm that our results were not affected by the arbitrary selection of the k parameter. In the case of k = 3, the results were consistent with those found with k = 2. One group was exactly the same as before (Group 2 in Figure 6); the remaining two groups showed no differences in their behavioral or hormonal traits, suggesting that the division is inconsequential and no better than the simplest division of k = 2. In the case of k = 4, one group had only four participants, making the interpretation of any results difficult because of the small sample. Using k > 4 resulted in similarly small groups.

RESULTS
We recruited 72 men: 36 men without children as the CG and 36 first-time expectant fathers in the early to mid-pregnancy (GA: 10-30 weeks) of their partners for the PG. Each participant took part in an fMRI scanning session (Session 1; early pregnancy for the PG) to record neural activation in response to videos that display infant-adult interaction scenarios. The stimulus videos consisted of four scenes: two showing male-infant social interactions (S1: playing with an infant; S2: changing diapers) and two showing control non-infant interactions (C1: opening a box and removing a tripod; C2: wrapping a box with plastic). All videos were shown from the first-person view (Figure 2). The motor movements of each control scene roughly matched those of the corresponding infant-interaction scene. Approximately 4 months later, the participants were invited again for a second fMRI measurement (Session 2; late pregnancy for the PG), and again 4 months after that for a third measurement (Session 3; postnatal for the PG). At each session, we measured various behavioral traits related to parenthood, relationship with the partner, SES, and others (see the Methods section for a complete description of all recorded traits). Saliva samples were obtained to analyze oxytocin and testosterone. Table 1 summarizes these data.

Parental Brain Activities at the Three Time Points
We were interested in the cross-sectional snapshots of the paternal brain at each time point (i.e., early and late pregnancy and postnatal) to reveal whether, and to what extent, the brain response of fathers to infant stimuli would differ from that of nonfathers. Moreover, we aimed to determine changes in the neural activities in terms of longitudinal developmental processes from pregnancy to the postnatal period.
First, we examined the paternal brain of expectant fathers and childless men using a whole-brain analysis at each time point. Diaz-Rojas et al. (2021) reported the cross-sectional results of the early-pregnancy period (Session 1, GA < 30 weeks); therefore, we excluded them from the current study. For Session 2 (late pregnancy), we examined the overall response to the infant stimuli versus the control stimuli (i.e., combination of S1 and S2 vs. combination of C1 and C2, or the S-C contrast) in expectant fathers and control men to confirm the activation of the paternal brain at this stage. We found that, when taken together, all participants displayed widespread activation across the brain (Figure 4A; see also Table 2). These activated areas correspond mainly to the human parental caregiving networks (Feldman et al., 2019; Abraham et al., 2014), that is, an emotional processing network (i.e., inferior frontal gyrus, ACC, insula, and amygdala) and a mentalizing network (i.e., the medial pFC, temporal poles, and superior temporal sulcus). A comparison between the average brain activities of the PG and the CG revealed lower activation in the right dmPFC in the PG than in the CG (Table 3). Regarding Session 3, we found similar activations in the same areas of the paternal brain (Figure 4B and Table 4), although they involved a larger portion of the brain. In addition, the PG exhibited [...] (Table 3). In summary, these results confirm the findings of Diaz-Rojas et al. (2021), that is, the parental brain is a staple of the male brain regardless of paternity status. Conversely, we observed substantial differences between the neural activities of fathers and nonfathers in response to the infant caregiving context, especially in the mentalizing areas of the human caregiving network.

Table note. Only clusters with a FWE rate of p < .05 (pFWE) at the cluster or peak level are shown. False discovery rate-corrected (FDR) p values are also shown. Areas were defined based on AAL Neuroatlas. XYZ coordinates are in MNI space. p unc. = uncorrected p values; T = t test; PG = father group; CG = control group; + = positive effect (the higher the brain activity, the higher the covariate value); − = negative effect (the higher the brain activity, the lower the covariate value).
Second, we focused on the relationships between the cross-sectional brain response toward infant stimuli and behavioral or hormonal traits related to parenting. We examined the relationships in both groups of participants in Session 2 and Session 3 and used whole-brain analysis with individual regression models for each behavioral and hormonal covariate and the S-C contrast. Table 3 summarizes the brain clusters that exhibited significant correlations with behavioral and hormonal covariates. In Session 3 (postnatal period), we noted several clusters of brain activity correlated with several behavioral and hormonal traits in the PG. Of interest was postnatal attachment, which showed a strong and widespread negative relationship with the activation from the S-C contrast, mainly across the emotional network (insula, ACC, dorsolateral pFC, and supramarginal gyrus) and to a lesser degree with the mentalizing network (precuneus). In contrast, we found that neither fetal attachment (the prenatal counterpart to postnatal attachment) nor any other behavioral trait was correlated with the S-C contrast in Session 2 (late-pregnancy period), except for household income in the CG. For the hormonal measurements, we found only a negative relationship between oxytocin and the right supplementary motor cortex in the PG in Session 2. No other brain areas in Session 2 or 3 showed any other relationship with oxytocin or testosterone in the PG and CG.

Longitudinal Development of the Paternal Brain
The question remains whether and how expectant fathers' neural activation patterns observed in response to infant stimuli begin to change across the pregnancy period to childbirth. Continuing the whole-brain approach, we compared changes in the neural response to the infant-interaction stimuli in both groups from one time point to another and between groups. For the PG, we detected a change from Session 2 to Session 3 in the left dmPFC (Table 5). Conversely, we were unable to detect any differences in the PG between pregnancy sessions (from Session 1 to 2), or in the CG between any of the time points. Comparing the two groups revealed an interaction effect between session and group in the dmPFC (left and right). A further exploration of this interaction revealed that the dmPFC in the PG displayed a greater increase in response from Session 1 to Session 3 than that in the CG. The PG exhibited no major difference in responses from Session 1 to 2; thus, we averaged them as the pregnancy session to view the overall change in the paternal brain from pregnancy to the postnatal period. We found that fathers exhibited markedly increased activation in the dmPFC from pregnancy to the postnatal period compared with brain activities observed in the CG across the same time period (Figure 5A).

MVPA for Paternal Brain Development
These results suggest that changes in the neural response to infant-interaction visual stimuli in the dmPFC in expectant fathers occur around the delivery period without an apparent change during pregnancy. However, we need to consider the possibility that the whole-brain analysis may lack sufficient statistical power to detect smaller changes in the brain of expectant fathers during this period. Indeed, Diaz-Rojas et al. (2021) found that small changes in the development of the paternal brain in early pregnancy (∼20 weeks GA) could be evidenced using MVPA. Therefore, we used MVPA to examine the multivoxel patterns of activation of the dmPFC across the two time points during pregnancy.
To conduct MVPA, we used an SVM model that performs a supervised classification of session (early or late pregnancy) using the multivoxel data from the left dmPFC, as it was the area that consistently displayed the majority of changes from pregnancy to the postnatal period. In the PG, the algorithm successfully classified the pregnancy period with a higher-than-chance accuracy (mean ± 95% confidence interval [CI]: 66.17% ± 13.25, p = .005, TNR = 0.66, TPR = 0.67; Figure 5B). In other words, the multivoxel activity patterns of the left dmPFC differ between early and late pregnancy. Conversely, the algorithm could not classify the session in the CG with a higher-than-chance accuracy (55.88% ± 13.91, p = .19, TNR = 0.54, TPR = 0.58). This finding suggests that the underlying neural encoding patterns in nonfathers remain the same between the two time points (Sessions 1 and 2). In addition, we classified group (PG or CG) in each session (1, 2, and 3) using MVPA to determine whether the activity patterns of the left dmPFC during pregnancy were specific to fathers. We found that the classification of paternal status (i.e., belonging to the PG or CG) was possible using the multivoxel data of this ROI at either Session 2 or 3 (Session 2, PG vs. CG: 64.70% ± 11.65, p = .01, TNR = 0.66, TPR = 0.64; Session 3: 65.67% ± 11.67, p = .007, TNR = 0.63, TPR = 0.66) but not from the data of Session 1 (Session 1, PG vs. CG: 55.88% ± 12.11). These results suggest that the activity patterns of the dmPFC in response to infant-related stimuli change markedly only among expectant fathers, especially during the late-pregnancy or early postnatal period (Figure 5C).

Table note. Labeling was conducted using AAL Neuroatlas. Only the top three defined areas per cluster are shown. Clusters with less than 3 voxels are not shown. The activation maps were thresholded at p < .001 (uncorrected at the voxel level) combined with p < .05 (FWE-corrected at the cluster level).

Effect of the Dynamics of the Paternal Brain Development on Parental Phenotypes
Our findings suggest that the paternal brain, especially in the medial pFC, in expectant fathers, mainly develops across the last weeks of pregnancy to the early period of childbirth. However, previous studies reported that various types of change occur in the developmental processes of paternity. Berg and Wynne-Edwards (2001) proposed that men display distinct patterns of variations in testosterone across the pregnancy period. Moreover, Diaz-Rojas et al. (2021) found differences in neural encoding patterns between expectant fathers during early pregnancy. After childbirth, Abraham et al. (2014) suggested that not all men exhibited the same patterns of development in the paternal brain. Given these findings, the current study conducted post hoc exploratory analysis to further investigate individual differences in the development of the paternal brain in postnatal fathers and explored the factors that may contribute to it. We focused on the individual differences observed with dmPFC activation, which indicated larger changes from pregnancy to postnatal period. In addition, we examined the relationship of the degree of change in activation in response to infant-interaction stimuli with psychological/behavioral characteristics and hormone concentrations in the PG.
One method for quantifying these developmental profiles is to categorize participants on the basis of their progression in dmPFC activation across the pregnancy. However, using this method may yield an excessive number of groups given the limited sample of fathers. Thus, instead of manually separating fathers into groups, we performed automatic partitioning of the sample into two groups using k-means clustering based on changes in the left dmPFC in response to infant stimuli from Session 1 to 2 and from Session 2 to 3. Consequently, data were separated into two subgroups, namely, Group 1 (relatively high levels of change from Session 1 to 2, but low levels of change from Session 2 to 3; n = 20) and Group 2 (relatively low levels of change from Session 1 to 2, but high levels of change from Session 2 to 3; n = 12; Figure 6A and B). Using a two-way ANOVA, we found an overall session effect for both groups (F = 5.60, p = .005, df = 2), but not a group (F = 1.20, p = [...]

Figure 5. Changes in the paternal brain across pregnancy and after childbirth. (A) Interaction between time (Session 3 vs. Sessions 1 and 2) and group, which displays the clusters in which fathers showed an increase in activity from Sessions 1 and 2 to Session 3 compared with control. The activation maps were thresholded at p < .001 (uncorrected at the voxel level) combined with p < .05 (FWE-corrected at the cluster level). dmPFC = dorsomedial prefrontal cortex. (B) Classification of recording session and paternal status using multivoxel data. Using an SVM classifier, we successfully discriminated between fathers in early pregnancy and those in late pregnancy from the multivoxel neural patterns in the left dorsomedial prefrontal cortex. (C) Using the same procedure as in B, we discriminated between fathers (PG) and control (CG) based on neural patterns from the second and third recording sessions. Error bars denote 95% CI. *p < .05 (upper-tailed binomial test).
Using these categories, we compared hormonal profiles and behavioral scores across pregnancy among the subgroups and observed important differences (Figure 6C; Table 6 for all statistical tests). Specifically, Group 1 showed higher positive image toward parenting and fetal/postnatal infant attachment scores than Group 2 (F = 11.68, p < .0001, and F = 8.13, p = .0002, respectively). Weekly worktime also showed marginal group differences (F = 4.68, p = .038), but after correction for multiple comparisons, this difference did not reach a significant level. Despite these behavioral differences, both subgroups had similar average levels of dmPFC activation at Session 3 (Figure 6B), which suggests that the developmental trajectories that each group undergoes during pregnancy may influence the type of father each man will become.

Figure 6. Clustering analysis of change in the left dorsomedial prefrontal cortex (dmPFC) in expectant fathers between sessions. (A) Scatter plot of the change in dmPFC activation between Sessions 1 and 2 versus between Sessions 2 and 3. Change was measured using the absolute logarithm of the percentage change between the two sessions. Squares in bold represent the cluster average centers, which are calculated from 10,000 iterations of clustering using 87.5% of the total sample of fathers. (B) Line plots showing the S > C contrast signal for the dmPFC for both groups. (C) Line and violin plots for each covariate divided by groups. *Group differences (two-way ANOVA, p < .001, uncorrected). Error bars in line plots correspond to standard error of the mean. In the violin plots, each colored dot corresponds to one participant's data; white dots represent the median of each group; the thick dark lines inside each violin plot represent the range between the first and third quartiles.

DISCUSSION
This study aimed to elucidate when and how the paternal brain develops in first-time fathers from the early-pregnancy period to 4 months postnatal. We found that parts of the mentalizing network among first-time fathers change markedly from mid-pregnancy (GA > 30 weeks) to the postnatal period. Specifically, the study yielded four findings. First, differences are evident in the activation in the dmPFC between prenatal and postnatal recordings in first-time fathers but not in control men between the same time points. Second, using MVPA, we determined different neural patterns of activation in the left dmPFC (a) between early- to mid-pregnancy and late-pregnancy recordings in fathers and (b) between fathers (late pregnancy and postnatal period) and the CG at matching time points. In other words, the activity patterns of expectant fathers in the left dmPFC for infant-related stimuli substantially changed from mid-pregnancy onward. Third, a strong relationship exists between emotional attachment toward their infants and neural activation in response to infant-interaction stimuli among postnatal but not prenatal fathers. Finally, differential developmental trajectories in the dmPFC were related to different paternal phenotypes.
Regarding the first finding, fathers (after childbirth) displayed an increased activation in the dmPFC compared with their recordings during pregnancy (before childbirth) and that of childless control men. The dmPFC is one of the main parts of the mentalizing network (Rilling & Mascaro, 2017). In infant caregiving contexts, the mentalizing network is considered to be involved in imaging or reasoning of the psychological states of infants through nonverbal signals, such as facial expressions (Feldman et al., 2019). Such active caregiving experiences with their infants in everyday life may enhance the dmPFC activation of fathers after childbirth. In fact, Abraham et al. (2014) showed that the time fathers spent in childcare is correlated with connectivity between the amygdala and the superior temporal sulcus, which are parts of the parenting brain networks. In addition, Diaz-Rojas et al. (2021) revealed that past experience with infant caregiving (e.g., taking care of a younger sibling/nephew) modulated the activity in certain areas of the mentalizing network in expectant fathers. Thus, active caregiving experience may be a strong modulator of the development of the paternal brain in humans.
Importantly, the current study provides evidence that the development of the parental brain does not start after childbirth. Instead, a certain degree of change in the dmPFC occurs from the early to the late-pregnancy period. MVPA could identify significant changes between early- to mid-pregnancy and late-pregnancy fathers using the multivoxel neural patterns of the dmPFC. This finding suggests a notable difference in the neural encoding mechanisms between the two time points. Furthermore, we could discriminate between late-pregnancy fathers and control, but, importantly, not between early- to mid-pregnancy fathers and their control. These findings support the assumption that although direct caregiving experiences with their infant may strengthen the parental brain, a certain degree of development is initiated during the pregnancy period before the actual caregiving contexts can occur. What, then, could be the mechanisms that underlie these changes? One possibility may be fluctuations in hormonal levels. A variety of hormones, such as testosterone, oxytocin (Edelstein et al., 2017; Weisman et al., 2014), estrogen, progesterone, vasopressin (Bales & Saltzman, 2016), and even cortisol (Bos, Montoya, Terburg, & van Honk, 2014), have been associated with parental behaviors in men. For this study, we measured testosterone and oxytocin at each time point. Though we could not find significant differences between early and late pregnancy hormonal levels, it remains possible that other parental-related hormones, or even slight changes in testosterone or oxytocin that were masked by the large degree of variability, underlie these changes in the parental brain during pregnancy. In turn, these hormonal changes can be triggered by internal (developing sense of fatherhood, anxiety over impending life changes, etc.) and external (seeing the partner's belly growing, hearing the fetus' heartbeat, sensing the fetus' movements, etc.) factors.
Further in-depth studies with larger samples and a focus on pregnancy-specific changes could help elucidate this matter. A second possibility is that the cognition about the partner's pregnancy and the realization of the upcoming fatherhood promote these changes. During the pregnancy, expectant fathers may become more aware of or attentive toward their partner's condition, which increases cognitive functions in areas of the mentalizing network, the dmPFC included. Likewise, knowing that they will soon become fathers may make men more sensitive to infants or infant caregiving contexts, which in turn can result in functional changes to core areas of the parental network. A previous study by our group (Diaz-Rojas et al., 2021) found a link between recent experiences of childcare and patterns of neural activation in mentalizing areas of the parental brain (STS and temporal pole) in early-pregnancy expectant fathers, supporting the possibility that the dmPFC changes observed here may be partially explained by increased experiences of interaction with infants.
Interestingly, in comparison with the nonfather group, no other brain region showed similarly strong differences in activation in response to infant-interaction stimuli throughout pregnancy. In the same manner, the study observed a lack of widespread differences in the neural response between the PG and CG. For first-time fathers and childless men, the neural response to the stimuli in other areas of the parental brain remains relatively constant across the three recording sessions. One possible reason is that expectant fathers and childless men display a well-defined activation pattern in the parental brain when exposed to infant stimuli (Diaz-Rojas et al., 2021; Figure 4A). Thus, the strengthening of the paternal brain response after childbirth may effectively be limited because of a ceiling effect. Another possibility is that further development of the paternal brain is dependent on experiences of infant interaction, as argued by previous studies. Primary caregivers, regardless of gender, were found to exhibit a stronger activation of the emotional processing network when watching videos of their infant compared with fathers as secondary caregivers (Abraham et al., 2014). Accordingly, the fathers participating in this study may still have lacked the caregiving experiences necessary for the development of other areas of the parental brain, particularly those related to emotional processing. Indeed, on average, the participants in the father group had their postnatal recording session (Session 3) at approximately 4 months after childbirth. At this period, the infant mainly relies on maternal care (e.g., breast feeding) through skin-to-skin contact with the mother. Consequently, opportunities for direct father-infant interactions may be limited. Furthermore, the majority of the PG participants (77%) reported spending 7 hr per week or less alone with their infant; a third of the PG participants indicated spending less than 1 hr per week.
We were unable to establish a direct link between time spent with their infant and the development of the paternal brain. In summary, we postulate that although fathers reported limited childrearing experiences, they still had some; these experiences are just enough to trigger a "first stage" in the full development of the postnatal paternal brain. Only with subsequent time spent caring for their child will the development of their paternal brain mature; this process will depend on the quality and type of childrearing experiences. Future studies on human paternity focusing on longer periods after childbirth may help in elucidating this process.
In terms of the third finding, individual differences were also observed in the fathers' group. Postnatal infant attachment showed a negative linear relationship with the neural response to infant stimuli in several brain areas, including the mentalizing (precuneus) and emotional processing networks (left insula, anterior cingulate, and the left lateral pFC). How, then, should these results be interpreted in terms of postnatal fathers? One possibility is that fathers with increased attachment toward their infant may feel a stronger disconnect with the unfamiliar infants presented in the stimuli, which led to the reduced neural response. Previous studies indicated that parents' preference for their infants in comparison with unknown infants is reflected at the neural level (Young et al., 2017; Wittfoth-Schardt et al., 2012; Leibenluft, Gobbini, Harrison, & Haxby, 2004; Nitschke et al., 2004). The emotional processing network is considered to be involved in emotional and cognitive empathy, including the understanding of an infant's emotions and pain (Rilling, 2013; Swain, 2011). Particularly, the left insula has been shown to have a greater activation in fathers in response to images of their own children versus those of other children (Wittfoth-Schardt et al., 2012). Notably, the same relationship was not observed between prenatal fetal attachment and the neural response to infant-interaction stimuli (before childbirth). A systematic review of studies on prenatal and postnatal parent-infant attachment suggested that little conclusive evidence links prenatal fetal attachment to postnatal parent-infant relationship in fathers (Cataudella, Lampis, Busonera, Marino, & Zavattini, 2016).
Although the current study could not elucidate the actual psychobiological mechanism that underlies the feeling of attachment of fathers toward their infants, the link between the paternal brain and feeling of attachment may possibly only form at postnatal period following real caregiving experiences.
Finally, the timing of the changes in the response of the dmPFC was related to particular phenotypes in expectant fathers. Fathers who showed early changes (i.e., relatively large differences between Sessions 1 and 2) in the left dmPFC had higher scores for positive attitude toward parenting and fetal/infant attachment than those who exhibited the change at a later time (i.e., between Sessions 2 and 3). The positive attitude toward parenting questionnaire includes items related to emotional judgments toward the unborn infant and the act of parenting, as well as ratings of personal growth and satisfaction toward parenting (Inori & Kato, 2011). Postnatal attachment was correlated with the score for father development (Figure 2), which is a metric associated with the fathers' feeling of developing a sense of fatherhood and personal growth. On the prenatal side, high levels of fetal attachment were related to a more balanced representation of the unborn infant (Vreeswijk, Maas, Rijk, Braeken, & van Bakel, 2014) among first-time fathers. These metrics reflect mental processes modulated by the dmPFC, which is involved in understanding one's identity (Gusnard, Akbudak, Shulman, & Raichle, 2001), self- and other-judgments (Piva et al., 2019), and emotion regulation (Ford & Kensinger, 2019). Expectant fathers with early changes in the dmPFC may gain a head start compared with fathers with late dmPFC changes. We also found a difference in average weekly working hours between Group 1 and Group 2 fathers, albeit only marginal (i.e., before correction for multiple comparisons). In previous studies, working hours have been considered to be a factor that affects the expression of paternal nurturing behavior (Bünning, 2020; Johnson, Li, Kendall, Strazdins, & Jacoby, 2013). It may also be related to individual differences in the dmPFC activity and its developmental trajectories shown in the present study; however, further research is needed to confirm this.
It is important to highlight that the majority of the participants had high scores across the behavioral metrics related to parenting. In comparison with other studies, our sample may be positively biased and may show the differences that exist in the high end of the paternal spectrum or simply may be insufficiently varied to provide an appropriate comparison with previous findings.
Scientists propose that testosterone and oxytocin play an important role in the parenting behavior of men (Edelstein et al., 2017; Weisman et al., 2014). The current study found a transient negative correlation between oxytocin and the SMA during late pregnancy in fathers. The SMA is involved, among other things, in motor imagery (Guillot et al., 2009). In caregiving contexts, a recent review of fMRI studies of the paternal brain (Provenzi et al., 2021) summarized that the SMA also displays a response to infant stimuli. Thus, the area was proposed to be involved in the creation of time-dependent motor memories, which are crucial for the development of the imagery of self-other interactions, including parent-infant exchanges (Lindner, Schain, & Echterhoff, 2016). People with high oxytocin levels may easily imagine these exchanges when observing the father-infant interaction videos, especially in comparison with control, which leads to the reduced activation. In contrast, we were unable to identify differences in testosterone levels between the time points, or to determine any correlation between testosterone level and neural response to the infant stimuli. Berg and Wynne-Edwards (2001) demonstrated a sharp decline in testosterone levels among expectant fathers in the period around the postnatal period. Such a decrease in testosterone level is seemingly time-sensitive and limited to a 1-month time window after childbirth (Gettler, McDade, Feranil, & Kuzawa, 2011). Based on these findings, measuring testosterone levels earlier in the postnatal period may be necessary. Another possible reason for the lack of findings is the large variability in hormonal measurements, particularly given the limited sample size. Thus, further studies are required to consider the timing of hormonal measurements and the large degree of variability to properly assess their impact on the development of the parental brain.
One of the concerns arising from our findings is the ambiguity related to whether the neural activations observed in response to infant stimuli reflect a so-called "parental brain" or a more general "social brain." Because of the characteristics of the stimuli used to investigate the parenting brain (e.g., Abraham et al., 2014), it is difficult to discriminate a child-related component (i.e., parenting) from a general social component in the brain activations in response to our stimuli. A recent systematic review of the fMRI studies in fathers (Provenzi et al., 2021) has noted essential difficulties in measuring the parental brain networks in isolation, because they overlap with other social networks. Moreover, the issue of whether the parental brain is a separate, specialized network distinct from those of the social brain is one that, to our knowledge, has not been resolved. In light of this, the label of "parental brain" used in this study follows the conventions of other similar studies (Provenzi et al., 2021; Feldman et al., 2019) and refers to the areas that respond to infant stimuli. Most noteworthy is that we could identify father-specific differences in some of these brain areas in contrast with those of the control men. Even if the responses to the infant stimuli overlap with those to social stimuli, the findings clearly show that these differences were based on parental-specific experiences.
In conclusion, we found strong evidence that the development of the paternal brain begins during pregnancy, a time when expectant fathers lack experience in active caregiving with their child. Importantly, we also present evidence that the timing of the development of the paternal brain is related to different parenting phenotypes. Although we could not pinpoint the mechanism underlying these changes, we propose that an interplay between mid-pregnancy/perinatal hormonal changes and postnatal parental experience may mediate them. Finally, we suggest that these findings on the variability of paternal development may serve as a reference for the formulation of parental support programs for expectant fathers (i.e., during pregnancy), which can be applied to diverse individuals. This direction may prove particularly effective in men who struggle to establish their sense of fatherhood and who may require a different approach for easing into parenthood.
The psychobiology of fatherhood is a complex matter, and there remains much to be elucidated; future research should focus on integrating and deepening the accumulated findings (hormonal, behavioral, and neurological) in both animal and human models to provide a systematic understanding of the exact mechanisms that underlie and modulate paternal behavior.