The Diagnostic Properties of the Translated Chinese Whooley Questionnaire as a Case-Finding Instrument for Depression Among Chinese Women During and After Pregnancy

Introduction: A rising prevalence of perinatal depression, much of it undetected, has been described in many countries. Reports suggest that, rather than only treating those who are already symptomatic, more effort should be targeted towards screening strategies that identify perinatal depression at an early stage. The Whooley questions are the recommended case-finding strategy to aid the identification of perinatal depression. An official Chinese version has not been validated. The aim of this study was to evaluate the diagnostic accuracy and stability of the translated Whooley questionnaire against the gold standard measurement during pregnancy (antenatal) and early after pregnancy (postnatal). Materials and method: This observational study recruited 131 pregnant women from antenatal clinics in a hospital setting in Hong Kong from September 2019 to May 2020. We translated the Whooley questionnaire into Chinese and evaluated self-reported responses against an interviewer-assessed diagnostic standard (DSM-IV criteria) among 107 women receiving antenatal care at 26-28 weeks gestation. We calculated sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio, negative likelihood ratio and diagnostic odds ratio, with DSM-IV diagnosis taken as the gold standard. Results: Antenatally, the Chinese Whooley questions had a sensitivity of 79% (95% CI 54.4-93.9), a specificity of 97% (95% CI 90.4-99.3), a positive likelihood ratio of 23.2 (95% CI 7.4-72.1) and a negative likelihood ratio of 0.2 (95% CI 0.1-0.5) in identifying perinatal depression. Conclusion: The prevalence of depression in pregnancy has increased, and early screening remains an important tool in Western countries. The translated Chinese Whooley questionnaire appears to have acceptable diagnostic accuracy and, as it requires only two yes/no questions, can be implemented in health services for the Chinese population.


Introduction
Depression is the commonest mental health problem and a major contributor to worldwide disability and the global burden of disease. Perinatal depression is defined as depression during pregnancy, around childbirth or within the first year postpartum. It affects up to 20% of pregnant women and is a major public health burden across the globe. Women are more vulnerable to depression during pregnancy. Antenatal depression was found to affect 10.7% of women in a meta-analysis of 21 studies across diverse countries and cultures. 1 In an epidemiological study of 357 Hong Kong Chinese women, the proportion reporting antenatal depression ranged from 18.9% in the second trimester to 22.1% in the third trimester. 2 Given that even small reductions in population prevalence have a greater public health benefit than treating those who are already symptomatic, more effort should be targeted towards screening strategies that identify perinatal depression at an early stage.
Perinatal depression is associated with a range of adverse outcomes. 3 Evidence suggests an association between depression experienced during pregnancy (prenatal depression) and adverse neonatal outcomes, poor self-reported health, substance and alcohol abuse, and poor usage of antenatal care services. 4 Postnatal depression has been shown to have a substantial impact on the mother and her partner, mother-baby interactions 5 , the family 6 and the longer-term emotional and cognitive development of the baby. 7 Despite the significant disease burden of perinatal depression, it often goes undetected during the antenatal stage in Hong Kong. Research suggests that women who experience perinatal depressive symptoms use more healthcare resources during pregnancy than women who are not depressed, which potentially affords providers greater opportunity to screen for and address depressive symptoms during the antenatal period. 8 Research has found that among women with postpartum depression, over 50% had depression identified either before or during the pregnancy. 9 Despite the significance of the problem, no screening tool for perinatal depression is used in Hong Kong, and guidance is taken predominantly from Western countries. Screening women at the perinatal stage for possible depression has been identified as important by both healthcare professionals and pregnant women, so that early interventions can be implemented to prevent further adverse outcomes at later postnatal stages. 10,11 In the UK, the National Institute for Health and Care Excellence (NICE) produced guidelines on antenatal and postnatal health. 12 These set out recommendations for the detection and treatment of mental health problems during pregnancy and the postnatal period.
As part of these guidelines, NICE endorsed a case-finding strategy by recommending the use of two "ultra-brief" questions to aid the identification of perinatal depression (see Appendix 1); these questions are often referred to as the Whooley questions. NICE has since updated its guidelines and continues to recommend the use of the Whooley questions during pregnancy and the postnatal period (NICE, 2014), and a validation study is currently being conducted. 3 If a woman responds positively to either question, a more comprehensive assessment is carried out to determine whether or not she is depressed. The Whooley questions have been validated previously in a small sample of 152 women during pregnancy and the early postnatal period in the UK, with a sensitivity of 100% (95% CI: 77-100%) and a specificity of 68% (95% CI: 58-76%) during pregnancy, and similar estimates during the early postnatal period. 13 Locally, in a population-based study, women in Hong Kong completed the Edinburgh Postnatal Depression Scale (EPDS) in the second and third trimesters and at 6 weeks postpartum; the study found that the EPDS can identify women at high risk of postpartum depression and that screening all pregnant women in the second trimester can be a secondary preventive measure. 14 A Chinese version of the EPDS has been tested in Hong Kong and demonstrated good reliability and validity 15 ; however, it is routinely used during the postnatal period and not during the antenatal stage. One reason for this could be that, compared with the Whooley questions, the EPDS has a larger number of items, relatively lower sensitivity (the ability to correctly identify true cases) and cut-off points that vary across populations. Diagnostic tests are regarded as providing definitive information about the presence or absence of a disease or condition.
By contrast, screening tests such as the Whooley questions place fewer demands on the health care system and are more accessible, less invasive, less expensive and less time-consuming. Given the known effects of screening during pregnancy in preventing depression, the obvious next step is to determine the extent to which these tests are able to identify the likely presence or absence of depression, so that appropriate decision making can be encouraged in the health care system in Hong Kong.
There are limited studies examining depression case-finding questions, such as the Whooley questions and the EPDS 16,17 , in Hong Kong. The primary objective of this study was to evaluate the diagnostic accuracy of the Whooley questionnaire against the gold standard measurement during pregnancy (antenatal) and early after pregnancy (postnatal). The secondary objective was to assess the stability of positive or negative depression screening results between the antenatal and postnatal stages, in order to estimate whether earlier testing optimizes depression screening.

Methods
Participants were recruited from the antenatal outpatient clinic of one Hong Kong public hospital. The inclusion criteria were (1) 18 years of age or older; (2) Cantonese speaking; (3) Hong Kong resident for more than 1 year; (4) singleton pregnancy; (5) no serious medical or obstetrical complications. Women were approached sequentially while attending the antenatal clinic at 26-28 weeks gestation for a routine antenatal talk. Convenience sampling was used: all eligible women were approached and invited to participate in this study. Women were excluded if they were less than 37 weeks gestation, did not speak Chinese, were not literate, or were planning to move or be away from Hong Kong after birth.

Index test
The Whooley questionnaire comprises two brief case-finding questions recommended in the NICE guidelines in the United Kingdom. Each question was answered "yes" or "no". A "yes" response to either question was considered a positive screen. The questionnaire was translated into Chinese by two health care professionals, with agreement reached at a consensus meeting. The consensus version was then back-translated into English and compared with the original version by KL.

Diagnostic gold standard
To confirm the presence or absence of a current depressive episode, the DSM-IV (Diagnostic and Statistical Manual of Mental Disorders, fourth edition) diagnostic criteria for major depressive disorder were administered in an interview by telephone. 18 Guidance for the administration and interpretation of the criteria was taken from the Structured Clinical Interview for DSM-IV-Clinical version. 19 Semi-structured questions to identify depressive symptoms were asked in a format suitable for verbal interview 20 and have previously been shown to be valid when used over the telephone or face to face. 21,22 The interview questions and DSM-IV criteria used in our study are available in Appendix 1.

Procedure
The study was conducted in two phases in order to validate the utility of the questions in the antenatal and postnatal periods, as recommended by the National Institute for Health and Clinical Excellence. 23 The study is reported according to the STARD (Standards for the Reporting of Diagnostic accuracy studies) statement. 24 During the antenatal phase, participants self-completed the questionnaires, which consisted of demographic data and the Whooley questionnaire, while attending the antenatal clinic. A researcher (C.C.) was available to answer questions if necessary.
During the postnatal phase, at about five to six weeks postnatally, a copy of the questionnaire with a self-addressed envelope was mailed to each participant, to be returned within seven days. Those who had not responded within seven days were contacted by telephone, either to remind them to complete the questionnaire or to complete it by telephone interview.

Statistical analysis
Baseline characteristics were reported for age, level of education, employment status, marital status, having existing children, history of mental health problems and responses to the individual Whooley questions. Data were reported using proportions, means and standard deviations. We assessed the diagnostic accuracy and utility of the Whooley questionnaire by examining its sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio, negative likelihood ratio and diagnostic odds ratio, with DSM-IV diagnosis taken as the gold standard. Each measure was accompanied by a 95% confidence interval (CI). Values of these statistics were based on a positive case being identified by a positive response to at least one of the two Whooley items. The diagnostic accuracy and utility measures were obtained separately at the antenatal and postnatal timepoints.
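As an illustration of how these measures relate to one another, the following minimal Python sketch derives them from a 2x2 cross-tabulation of the index test against the reference standard. The names `TwoByTwo`, `screen_positive` and `accuracy_measures` are hypothetical helpers introduced here, and the example counts are an assumption: they were chosen to be consistent with the antenatal point estimates reported in this paper (107 women, sensitivity 79%, specificity 97%, PPV 83.3%, NPV 95.5%) and are not taken from the study tables. Confidence intervals are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cross-tabulation of the index test against the reference standard."""
    tp: int  # screen positive, reference positive
    fp: int  # screen positive, reference negative
    fn: int  # screen negative, reference positive
    tn: int  # screen negative, reference negative


def screen_positive(q1_yes: bool, q2_yes: bool) -> bool:
    """A 'yes' to at least one of the two questions is a positive screen."""
    return q1_yes or q2_yes


def accuracy_measures(t: TwoByTwo) -> dict:
    """Point estimates only; assumes no cell of the table is zero."""
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "diagnostic_odds_ratio": (t.tp * t.tn) / (t.fp * t.fn),
    }


# Illustrative counts (an assumption, see the note above), totalling 107 women.
example = TwoByTwo(tp=15, fp=3, fn=4, tn=85)
measures = accuracy_measures(example)

for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts, the positive likelihood ratio rounds to 23.2 and the negative likelihood ratio to 0.2, in line with the reported antenatal estimates. If any cell is zero, some ratios are undefined and would need separate handling.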
Stability of positive and negative screening between the two timepoints was assessed according to the number of cases that remained stable or changed diagnosis. We anticipated that the Whooley questionnaire would show the same stability as the DSM-IV in changes of case status between the two timepoints. Stability or change in depression status between timepoints was tested with the McNemar test.
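The McNemar test depends only on the discordant pairs, that is, women whose status changed between the two timepoints. The sketch below is a minimal illustration and makes two assumptions. First, the study does not state which variant of the test was used, so the exact binomial form is shown here because it suits small numbers of discordant pairs. Second, the function name `mcnemar_exact` and the example counts are hypothetical and are not study data.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: pairs positive at the first timepoint and negative at the second
    c: pairs negative at the first timepoint and positive at the second
    Under the null hypothesis of no change, each discordant pair is equally
    likely to fall in either direction, so min(b, c) ~ Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of change
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts, for illustration only.
p_value = mcnemar_exact(b=2, c=5)
print(f"exact McNemar p = {p_value:.3f}")
```

A large p-value means the data give no evidence that the proportion of positive cases differs between the two timepoints. That matches the sense in which stability is assessed here, although it does not by itself demonstrate equivalence.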

Results
From September 2019 to May 2020, a total of 131 participants were recruited. One hundred and seven participants completed the Whooley questionnaire and DSM-IV interviews at the antenatal stage. Of those, seventy-eight participants (73%) also completed the Whooley questionnaire and DSM-IV interviews at the postnatal stage (Fig. 1). Baseline characteristics of both groups are shown in Table 1. Overall, participants who continued in the study at the postnatal follow-up were similar in characteristics to the whole sample at the antenatal baseline stage. The diagnostic accuracy measures are presented in Table 2. The McNemar test suggested that case status on both the DSM-IV and the Whooley questionnaire remained stable between the antenatal and postnatal stages, with no significant differences found (χ² = 0.05, p = 1.00 and χ² = 1.8, p = 0.26, respectively) (Table 3).

Discussion
In our study, the two translated case-finding questions endorsed by the National Institute for Health and Clinical Excellence showed promising diagnostic characteristics in a Chinese antenatal population, with strong specificity and moderate sensitivity, suggesting that a follow-up questionnaire may not be needed. The Whooley questionnaire appeared to be stable in this Chinese population and therefore acceptable for use both during pregnancy and after birth.
A recent meta-analysis of 21 studies on depression during pregnancy indicated a prevalence of antenatal depression of approximately 10.7%, ranging from 7.4% in the first trimester to 12.8% in the second trimester. 26 The rate of antenatal depression in individual studies was found to be as high as 24%. 26 Chinese researchers have published numerous studies examining the rates of depression during pregnancy; depending on the study, these rates varied from 5.5% to 23.1%. 27,28 Lee et al. found that the rates of antenatal depression ranged from 18.9% in the second trimester to 22.1% in the first trimester in a cohort of 357 women in Hong Kong. 2 Our finding was similar to that in the general population in Hong Kong. In our study, the positive predictive value of screening for depression at the antenatal stage was 83.3% and the negative predictive value was 95.5%, meaning that 95.5% of women who screened negative truly did not have the condition; these findings may therefore be generalizable to this population.
Evidence from a Cochrane review suggests that screening or case-finding instruments are a simple, quick and inexpensive method to improve detection and management of depression in non-specialist settings, such as primary care and the general hospital. 16 This approach has important implications for clinical practice. Screening or case-finding questionnaires should be considered as triage tests rather than as replacements for existing methods of assessment. 24 Triage tests are simple and noninvasive, have no waiting time and do not aim to improve the diagnostic pathway; instead, they reduce the number of patients who need further assessment. The benefit of using the two case-finding questions in clinical settings is not necessarily to diagnose perinatal depression, but to reduce by more than 50% the number of women who need extensive clinical assessment or evaluation with much longer questionnaires, such as the Edinburgh Postnatal Depression Scale or the Patient Health Questionnaire-9. 13 Overall, the translated Chinese Whooley questions showed high specificity and moderate sensitivity compared to other studies. 13,29 This is consistent with other studies and may reflect the homogeneous sample, which can lead to lower sensitivity and higher specificity. It has been suggested that using the Whooley questions in primary care or community settings may lead to lower sensitivity and higher specificity than in specialist hospital settings, because primary care deals with a more diverse population on a daily basis. 30 The specificity of the two case-finding questions in our antenatal (96.6%) and postnatal (86.2%) validation study provides further evidence of a simple approach to identifying perinatal depression.
A strength of the study was the comparison against a gold standard, which gives confidence in the integrity of the diagnosis; in addition, the follow-up enabled us to examine the stability of the Whooley questions. A limitation of the study was the small sample size: recruitment had to be stopped, unavoidably, because of unanticipated social unrest and the COVID-19 pandemic. Other limitations include the homogeneous sample, as we recruited from one study site, and the restriction to the third trimester and the first three postnatal months. Therefore, further research is warranted with a larger sample size involving diverse perinatal populations and longer postnatal follow-up. Lastly, the effect of the questions on outcomes of perinatal care warrants evaluation.

Conclusion
Overall, the translated Chinese Whooley questionnaire appears to have acceptable diagnostic accuracy and, as it requires only two yes/no questions, can be implemented in health services for the Chinese population. The next stage is to research approaches using the Whooley questions to identify, treat and/or prevent depression in Chinese pregnant women and new mothers.

Fig. 1 Participant flow chart of the study