Bipolar disorder affects an estimated 2.5% of the population, with higher prevalence for spectrum conditions1. The onset peaks in late adolescence and early adulthood2; however, delayed recognition and misdiagnosis remain a challenge. Untreated illness is associated with substantial morbidity and mortality early in the course3, so timely and accurate diagnosis of bipolar disorder is critical to facilitate prompt treatment.
Bipolar disorder runs in families, and therefore the children of bipolar parents are an identifiable high-risk group ideally suited for risk prediction studies4. Family studies have shown that the bipolar trait segregating in families includes major depressive disorder, bipolar I and II disorder, and schizoaffective bipolar disorder5. The penetrance and spectrum vary between families and according to the subtype of illness. Furthermore, longitudinal prospective studies of high-risk offspring have provided strong evidence that the illness often debuts with depressive episodes6.
While key risk factors for the development of bipolar disorder have been identified, such as parental age of onset and clinical course, early adversity, and antecedent clinically significant symptoms7,8, translatable risk prediction tools for clinicians either do not exist or are in the early stages of development.
Given the heterogeneity in age of onset, it is imperative to use survival models rather than logistic regression or classification methods. Time-varying covariates are exposure variables whose values can change over time within an individual, such as level of anxiety or antecedent symptoms. Given the importance of antecedent risk factors in the risk of bipolar disorder, together with the variable age of onset, it is important to use methods that accommodate time-varying covariates, such as the Cox model or discrete-time survival models with time-dependent covariates9. In addition, including time-varying covariates allows the model to use the most recently available information for each individual.
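As an illustration of how a discrete-time survival model can accommodate a time-varying covariate, each individual's follow-up can be expanded into one record per interval at risk (often called a person-period layout). The following is a minimal sketch only; the function name `to_person_period`, the field names and the toy covariate values are hypothetical and are not taken from the study data.

```python
def to_person_period(subject_id, n_intervals, event, covariate_by_interval):
    """Expand one subject's follow-up into one row per interval at risk.

    n_intervals: number of intervals the subject was observed (assumed >= 1)
    event: True if the event occurred in the last observed interval,
           False if the subject was censored
    covariate_by_interval: one covariate value per observed interval
    """
    rows = []
    for t in range(1, n_intervals + 1):
        rows.append({
            "id": subject_id,
            "interval": t,
            # the covariate value recorded for this specific interval
            "covariate": covariate_by_interval[t - 1],
            # the outcome is 1 only in the interval where the event occurs
            "event": int(event and t == n_intervals),
        })
    return rows


# Hypothetical subject observed for 3 intervals, with the event in interval 3
rows = to_person_period("A", 3, True, [0.2, 0.5, 0.9])
```

Each row then acts as one binary observation, so a logistic model fitted to such rows estimates the conditional probability of the event in an interval given that it has not yet occurred.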
Risk prediction tools attempt to incorporate multiple risk factors into a single model to estimate the probability that an individual will develop an outcome in the future. More recently, the use of neural networks has become increasingly popular in research for risk prediction. The goal of neural networks is to learn the relationship between a set of predictors and response(s) (i.e. target outcome(s)). The building blocks of neural networks are known as nodes, which are organized into layers and connected to one another through weights. Feed-forward neural networks, a common type of neural network, often have an input layer, one or more hidden layer(s) and an output layer. The information is distributed through the neural network in one direction, beginning at the input layer and finishing at the output layer10,11. (See supplemental material: Additional Methods – Neural networks and the discrete survival model).
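To make the one-directional flow of information concrete, the sketch below computes the output of a feed-forward network with a single hidden layer. It is illustrative only: the logistic (sigmoid) activation, the function names and the weights are assumptions chosen for the example, not the architecture or parameters used in this study.

```python
import math


def sigmoid(z):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass: input layer -> one hidden layer -> single output node."""
    # Each hidden node takes a weighted sum of the inputs plus a bias
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # The output node takes a weighted sum of the hidden activations
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Hypothetical network: 2 inputs, 2 hidden nodes, 1 output
prediction = forward(
    x=[1.0, 2.0],
    hidden_weights=[[0.5, -0.25], [0.1, 0.3]],
    hidden_biases=[0.0, -0.5],
    output_weights=[1.0, -1.0],
    output_bias=0.2,
)
```

Because the output node uses a logistic activation, `prediction` lies between 0 and 1 and can be read as a probability.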
Some advantages of neural networks are that they do not rely on strict assumptions and that they can accommodate non-linear relationships in the data10,12. Recently, recurrent neural networks have been used for survival analysis applications with time-dependent covariates13. Alternatively, discrete survival analysis has been extended to the field of neural networks in order to accommodate time-dependent covariates in the prediction of survival10,14–16. Neural networks that extend discrete survival analysis are simpler to implement and interpret than more complex methods such as recurrent neural networks. That being said, the Ohno-Machado (1996)15 and Ohno-Machado (1995)16 approaches involve the use of multiple neural networks, which adds complexity and can become computationally intensive.
The purpose of this article is to explore the use of a neural network known as the Partial Logistic Artificial Neural Network (PLANN)10 to predict the time to diagnosis of bipolar spectrum disorders in the offspring of parents with confirmed bipolar disorder. PLANN is based on the logistic model for discrete survival analysis9, and in this paper we compare the two approaches. Both PLANN and the logistic model for discrete survival analysis predict the probability of an individual experiencing an event within a given time frame, conditional on the individual not yet having experienced the event, which can be useful information for clinicians. Predicting which offspring are at greater risk of bipolar disorder over time may allow for more proactive monitoring and prevention (e.g. reducing stress, improving sleep and making healthy lifestyle choices).
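The conditional probabilities described above (discrete-time hazards) can be combined into the cumulative probability of having experienced the event by the end of each interval, using the standard relationship that the probability of remaining event-free is the product of one minus each hazard. The sketch below illustrates this with hypothetical hazard values; the function name is an assumption introduced for the example.

```python
def cumulative_event_probability(hazards):
    """Convert per-interval conditional event probabilities (hazards)
    into the cumulative probability of the event by each interval."""
    event_free = 1.0  # probability of not yet having experienced the event
    cumulative = []
    for h in hazards:
        # remaining event-free requires not having the event in this interval
        event_free *= (1.0 - h)
        cumulative.append(1.0 - event_free)
    return cumulative


# Hypothetical hazards for three consecutive intervals
cumulative = cumulative_event_probability([0.5, 0.5, 0.0])
```

With these toy values, the cumulative probability rises after the first two intervals and stays flat in the third, where the hazard is zero.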