Validation of a Food Frequency Questionnaire (FFQ) for Assessing Dietary Lignans Intake in Hong Kong Chinese Women with Early-Stage Breast Cancer

Lignans intake has been suggested to affect breast cancer outcomes, based on studies from Western countries. However, the relationship between lignans consumption and breast cancer outcomes in the Chinese population has not been investigated. A culture-specific lignan food frequency questionnaire (LFFQ) is needed to assess dietary lignans intake among Chinese breast cancer patients. This study was designed to validate a newly developed LFFQ that specifically assesses lignans intake among Chinese subjects.


Background
Breast cancer has become the most common cancer and the most common cause of cancer-related mortality among women worldwide [1]. In China, the incidence of breast cancer increased annually by 1.7-2.8% from 1976 to 2009 [2]; because of the country's large population, Chinese patients constitute a major share of the breast cancer population in Asia.
Lignans are diphenolic compounds that are widely distributed in plant-based foods such as cereals, vegetables, nuts and oilseeds, and in beverages such as coffee, tea and wine [3,4]. Secoisolariciresinol (SECO), matairesinol (MAT), pinoresinol (PIN), and lariciresinol (LAR) are the four major plant lignans precursors in foods [5]. Moreover, lignans have a number of biological activities that could affect the progression of breast cancer [6]. In vitro and in vivo studies suggested that lignans may improve the efficacy of tamoxifen [7], inhibit cancer cell growth [8], and decrease the activity of aromatase [9].
Epidemiological evidence has suggested an association of lignans consumption with reduced breast cancer risk [10][11][12][13][14] and breast cancer mortality [15][16][17]. However, these studies were conducted only in Western countries, and none has been conducted in a Chinese population. Additionally, lignans have been reported to be associated with other chronic illnesses such as cardiovascular diseases (CVD) and malignancies [18][19][20][21][22][23]. Variation in dietary lignans intake may exist between different geographical populations and ethnicities.
The food frequency questionnaire (FFQ) has been used for assessing habitual dietary intake. An FFQ determines the frequency of consumption of foods over a relatively long reference period, and it has been the most widely used method in epidemiologic studies [24]. Although two cross-sectional studies have used FFQs among Chinese populations to assess dietary lignans exposure, both used FFQs that were established for Western populations [25,26]. To assess the association between lignans intake and health outcomes, a culture-specific validated FFQ that assesses the amount of dietary lignans based on local foods in the geographical region is needed.
In this study, a newly developed FFQ to evaluate dietary lignans intake among Hong Kong Chinese women with early-stage breast cancer is being validated.

Patients and study design
Study subjects were recruited from an original cohort of 1,462 Hong Kong Chinese women with early-stage breast cancer who participated in a 5-year prospective cohort study, the Hong Kong New Territories East Cluster (NTEC)-Kowloon West Cluster (KWC) Breast Cancer Survival Study (HKNKBCSS). The HKNKBCSS was designed to investigate the relationships between phytoestrogens and other lifestyle factors and breast cancer outcomes. Briefly, 1,462 patients were recruited from Prince of Wales Hospital and Princess Margaret Hospital and were interviewed at baseline (within 12 months after diagnosis) and at the 18-month, 36-month and 60-month follow-ups after breast cancer diagnosis. Details of the prospective cohort study have been described in a previous publication [27].
Between September 2015 and September 2017, patients were individually invited to take part in a 1-year validation study. A total of 133 patients provided written consent and participated in the study. According to the study protocol, consenting patients were interviewed in person at two time-points: T1, the 36-month follow-up after breast cancer diagnosis (conducted between 30 and 42 months post-diagnosis); and T2, the 48-month follow-up after breast cancer diagnosis (conducted between 42 and 54 months post-diagnosis). Each patient also underwent twelve monthly 24-hour dietary recalls (24-h DRs) by phone interview during the 1-year period between T1 and T2 (Supplementary Fig. 1).
The lignan food frequency questionnaire (LFFQ) was constructed to assess habitual dietary lignans intake in Hong Kong Chinese women with early-stage breast cancer. Firstly, food groups/items were selected from a previously validated 109-item FFQ that assessed the overall dietary intake and soy intake of Hong Kong Chinese women [28]. This FFQ has been used for assessing overall dietary soy intake at baseline and at the 18-month follow-up. From this FFQ, plant foods consumed by 80% of our study population at the 18-month follow-up were identified; these include 13 food groups/items: rice, noodles, white-wheat bread, dark-green leafy vegetables, light-green leafy vegetables, gourd vegetables, root vegetables, citrus fruits, drupe fruits, melons, banana, cooking oil and tea. Secondly, lignan-rich foods commonly consumed by Western populations, such as nuts and seeds, oils and cereals, were identified from published lignans databases [29][30][31][32][33][34]; these include 13 food groups/items: whole-wheat bread, legumes, nuts, brown rice, red rice, asparagus, chestnut, peanut, pepitas, sunflower seed, flaxseed, sesame seed and flaxseed oil (the first three food groups/items were also included in the original 109-item FFQ).
Lastly, to ensure comprehensive coverage, foods that are major sources of lignans intake in other cohorts according to previous studies [25,26,35] were also selected; these include 11 food groups/items: radish, peas, corn, grapes, mung bean sprout, carrot, squash, sweet pepper, garlic, cherries and strawberry (the first four food groups/items were also included in the original 109-item FFQ).
As a result, the newly developed LFFQ consists of 37 food groups or food items, of which 20 were derived from the original 109-item FFQ and 17 were newly added (Table 1).
Lignans values were taken from two published lignans databases [29,30], which contain the four main plant lignans precursors (SECO, MAT, PIN and LAR) and total lignans content. Primarily, lignans contents of 13 food groups/items in the LFFQ were quantified based on the Japanese lignans database, which is a unique database consisting of 86 Asian food items [30].
Lignans contents of another 13 food groups/items were obtained from the Dutch lignans database, which included 109 food items from all plant food groups [29]. For 7 food groups, lignans content was the average lignans value of all the food items in the group. Finally, for the 4 food items with no direct lignans value in the above two databases, total lignans values were estimated by adopting values for similar foods. The lignans values of the 37 food groups/items are shown in Table 2. All lignans values of food groups/items were adopted or converted to a µg/100 g cooked weight basis according to the Chinese Food Composition Table [36] for further analysis; for example, the conversion proportion of raw noodles versus cooked noodles was 3:1, and that of raw vegetables versus cooked vegetables was 2:1.
Note: 1 lignans value was shown as µg/100 g cooked weight basis; 2 lignans value was adopted same as "brown rice"; 3 lignans value was converted to µg/100 g cooked weight basis in the analysis; 4 lignans value of the food groups were the average lignans value of all the food items in the group [29,30] and converted to cooked weight basis; 5 lignans value was adopted same as "nuts"; 6 lignans value was adopted same as "sunflower seed"; 7 lignans value was adopted same as "cooking oil".

Table 2. Column headings: No. | Food groups | Lignans values used in our study (µg/100 g cooked weight) | Japanese database (µg/100 g fresh weight) [30] | Dutch database (µg/100 g fresh weight) [29] | Food items included in the group

Administration of FFQs
Two face-to-face interviewer-administered questionnaires were used during the interviews at T1 and T2: 1) a previously validated 109-item FFQ for collecting overall dietary intake; 2) the newly developed LFFQ for assessing lignans intake. Since 20 food groups/items in the LFFQ were selected from the original 109-item FFQ, participants were interviewed on a net total of 126 food groups/items for total dietary or energy intake. Participants were asked to report the frequency of usual dietary intake during the previous year (as the number of times weekly, monthly, yearly or never consumed) and the average amount of food consumed each time. Seasonal variation of food intake was also taken into account by asking the participants to report intake frequency during both in-season and off-season periods for each item [37]. A series of food pictures with specified food portions were presented to subjects to facilitate recall of dietary intake. Different sizes of commonly used utensils such as bowls (small, medium, or large) and spoons (teaspoon or tablespoon) were also provided to participants for the estimation of portion sizes.
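The conversion from a reported frequency and portion size to a daily lignans intake can be sketched as follows. This is a minimal illustration, not the study's actual procedure: the function names, the days-per-period constants and the example values are all assumptions, and the handling of in-season versus off-season frequencies, which the text does not specify, is left out.

```python
# Hypothetical sketch: turn one FFQ response into an estimated daily
# lignans intake. Names, constants and data are illustrative assumptions.

# Assumed number of days in each reporting period.
DAYS_PER_PERIOD = {"weekly": 7.0, "monthly": 30.4, "yearly": 365.0}


def daily_lignans_ug(times, period, portion_g, lignans_ug_per_100g):
    """Estimated lignans intake (µg/day) for a single food group/item.

    times: reported number of times consumed per period
    period: "weekly", "monthly", "yearly" or "never"
    portion_g: average cooked weight consumed each time (g)
    lignans_ug_per_100g: lignans value on a µg/100 g cooked weight basis
    """
    if period == "never" or times == 0:
        return 0.0
    grams_per_day = times / DAYS_PER_PERIOD[period] * portion_g
    return grams_per_day * lignans_ug_per_100g / 100.0


def total_daily_lignans_ug(responses, lignans_table):
    """Sum the per-item estimates over all reported food groups/items."""
    return sum(
        daily_lignans_ug(
            r["times"], r["period"], r["portion_g"], lignans_table[r["food"]]
        )
        for r in responses
    )


# Made-up example values, not taken from the study's Table 2.
example_table = {"food_a": 50.0, "food_b": 200.0}
example_responses = [
    {"food": "food_a", "times": 7, "period": "weekly", "portion_g": 100.0},
    {"food": "food_b", "times": 0, "period": "never", "portion_g": 0.0},
]
example_total = total_daily_lignans_ug(example_responses, example_table)
```

In this made-up example, 100 g of a food eaten seven times a week at 50 µg/100 g contributes 50 µg/day.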

24-Hour dietary recalls
During the 12-month interval between T1 and T2, 24-h DRs were conducted on a monthly basis. Following the initial visit to the hospitals at T1, a picture of different sizes of commonly used utensils was provided to participants to help them estimate the amount of food consumed during the DR interview. The 24-h DRs were conducted by telephone, and participants were asked to recall all their food intake, including drinks and snacks, for the previous 24 hours. Detailed descriptions of each food, including cooking methods and recipes, were requested where possible. Twelve 24-h DRs were collected from each participant, one per month, to capture seasonal variation in food intake over the study period. To take into account differences in food intake between weekdays and weekends, recalls for each participant were conducted on eight randomly selected weekdays and four randomly selected weekend days [38].
Details of each reported food in every recall were reorganized and entered into the database for estimation of lignans intake according to the 37 food groups/items included in the LFFQ. Composite dishes were recorded separately by raw materials, including ingredients such as oils. Lignans intake of each participant was derived from the mean of the twelve 24-h DRs.

Validation procedures
The reproducibility of the LFFQ was assessed by calculating the reliability of dietary lignans intake collected from the LFFQ administered at T1 and at T2 (LFFQ1 and LFFQ2, respectively; same below). The validity of the LFFQ was assessed by calculating Pearson's correlation coefficient between dietary lignans intake collected from LFFQ2 and the mean value derived from twelve 24-h DRs during the same 1-year validation period.

Sample size estimation
In this study, the sample size calculation was based on a similar validation study reported previously [37]. The correlation coefficient between dietary lignans intake estimated by FFQ and by multiple monthly 24-h DRs was expected to be 0.50-0.70 [37]. Assuming a true correlation of 0.60, a 5% level of significance (α = 0.05), 80% power (β = 0.2) and an 80% completion rate, the required sample size was calculated to be 92 subjects to ensure that the lower limit of the 95% confidence interval (CI) of the observed correlation coefficient would be at least 0.40.

Statistical Analyses
Participants who did not complete all the assessments (n = 29) or reported improbable dietary intake (estimated energy intake was less than 500 or larger than 4,000 kcal/day; n = 1) were excluded, resulting in 103 patients (77%) completing 1,236 DRs in the current analysis.
Student's t-test (continuous data) and chi-squared test (categorical data) were used to compare the characteristics of participants in the original cohort with those of the participants in this study.
Median and interquartile range (IQR) of daily total lignans intake were computed from LFFQ1, from LFFQ2 and from the means of the twelve 24-h DRs. Wilcoxon signed rank tests were used to determine whether median dietary lignans intake differed significantly between LFFQ1 and LFFQ2, and between LFFQ2 and the 24-h DRs.
The Bland-Altman method was used to evaluate the agreement between the 2 dietary intake methods (LFFQ and 24-h DRs). With this method, the differences between, and the means of, the lignans intakes obtained from the 2 methods were plotted for each participant. Spearman rank correlation was also used to examine whether the differences were significantly correlated with the means of the 2 methods. The 95% limits of agreement (mean ± 1.96 SD) were established to define the range within which most of the differences between methods are expected to fall [39].
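The Bland-Altman quantities described above can be sketched in a few lines of standard-library Python. This is an illustrative sketch only: the study's analyses were run in SPSS, the function names are hypothetical, and the Spearman coefficient is computed here as Pearson's correlation on average ranks without a P value.

```python
# Hypothetical sketch of a Bland-Altman analysis with a Spearman check
# for proportional bias. Standard library only; names are illustrative.
from statistics import mean, stdev


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson's r computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def bland_altman(method_a, method_b):
    """Bias, 95% limits of agreement, and the count of points outside them."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    means = [(a + b) / 2 for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd
    outside = sum(1 for d in diffs if d < lower or d > upper)
    return {
        "bias": bias,
        "lower": lower,
        "upper": upper,
        "outside": outside,
        # A clearly non-zero value suggests proportional bias.
        "rho_diff_vs_mean": spearman(diffs, means),
    }


# Made-up log-transformed intakes for four participants.
ffq = [3.0, 3.2, 2.9, 3.1]
recalls = [2.9, 3.0, 2.9, 3.0]
result = bland_altman(ffq, recalls)
```

A significant correlation between the differences and the means corresponds to the proportional bias that the text tests for.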
To evaluate the validity of the LFFQ, Pearson's correlation coefficient was calculated between LFFQ2 and the mean values derived from the 24-h DRs. To evaluate the reproducibility between LFFQ1 and LFFQ2, an intraclass correlation coefficient (ICC) was computed using a 1-way random-effects ANOVA model for a single measurement. An ICC larger than 0.76 represents good test-retest reliability and an ICC less than 0.4 indicates poor test-retest reliability [40].
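As an illustration of the reproducibility measure, the single-measurement ICC from a one-way random-effects ANOVA can be computed from the between-subject and within-subject mean squares. The sketch below assumes this standard ICC(1,1) formulation and two administrations per participant; the function name and data are hypothetical, and it is not the SPSS procedure used in the study.

```python
# Hypothetical sketch of a one-way random-effects, single-measurement ICC.
# ICC = (MSB - MSW) / (MSB + (k - 1) * MSW), with k measurements per subject.


def icc_one_way(first, second):
    """ICC(1,1) for paired measurements (k = 2) from the same subjects."""
    n = len(first)
    k = 2
    subject_means = [(a + b) / k for a, b in zip(first, second)]
    grand_mean = sum(subject_means) / n

    # Between-subject mean square.
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)

    # Within-subject mean square.
    within_ss = sum(
        (a - m) ** 2 + (b - m) ** 2
        for a, b, m in zip(first, second, subject_means)
    )
    msw = within_ss / (n * (k - 1))

    return (msb - msw) / (msb + (k - 1) * msw)


# Made-up intakes from two administrations of the same questionnaire.
ffq1 = [1.0, 2.0, 3.0, 4.0]
ffq2 = [2.0, 1.0, 4.0, 3.0]
icc_example = icc_one_way(ffq1, ffq2)
```

For these made-up values the between-subject mean square is 8/3 and the within-subject mean square is 1/2, giving an ICC of 13/19, or about 0.68.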
Lignans values obtained from the two LFFQs and from the 24-h DRs were also divided into quartiles, derived separately for each dietary assessment method. The proportion of subjects classified into the same, adjacent, and extreme quartiles was examined.
All statistical analyses were performed using SPSS 25.0. All tests were 2-sided and P < 0.05 was considered significant.
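The quartile cross-classification can be sketched as below. This is a hedged illustration: the rank-based quartile assignment, the function names and the data are assumptions. Pairs that fall two quartiles apart are reported as a separate category here, because the text does not state how they were counted.

```python
# Hypothetical sketch of quartile cross-classification between two methods.


def quartile_labels(values):
    """Assign each value a quartile label 0-3 by its rank in the list."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for position, idx in enumerate(order):
        labels[idx] = position * 4 // n
    return labels


def cross_classify(method_a, method_b):
    """Percentage of subjects by distance between their two quartiles."""
    names = {0: "same", 1: "adjacent", 2: "two_apart", 3: "extreme"}
    counts = {name: 0 for name in names.values()}
    pairs = zip(quartile_labels(method_a), quartile_labels(method_b))
    for qa, qb in pairs:
        counts[names[abs(qa - qb)]] += 1
    n = len(method_a)
    return {name: 100.0 * c / n for name, c in counts.items()}


# Made-up data: the second method ranks subjects in exactly reverse order.
a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
agreement = cross_classify(a, b)
```

With fully reversed rankings, half of the subjects land in extreme quartiles and half in adjacent quartiles, which is the worst case this summary can show.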

Results
The demographic characteristics of the original cohort and of the study participants who completed the FFQ at T1 are shown in Table 3. The characteristics of the two groups were similar. The mean age of participants when they joined this study was 53.79 years, and the majority of them (76.5%) were married or cohabiting. The mean total energy intake was 1395 kcal/day. About 40% of the participants were in the normal weight range (BMI = 18.5-22.9 kg/m²), while over half were overweight or obese (BMI ≥ 23 kg/m²). About 60% had achieved a high school level of education.
As shown in Table 4, median lignans intake from LFFQ1 was significantly higher than that from LFFQ2 (P < 0.05; Wilcoxon signed rank test); similarly, median lignans intake measured by LFFQ2 was significantly higher than that from the 24-h DRs (P < 0.001; Wilcoxon signed rank test). Because the distributions of lignans intake were right-skewed, log transformation was performed to improve normality before analysis. However, median lignans intakes still differed significantly between the two LFFQs and between LFFQ2 and the 24-h DRs after log transformation (Table 4).
In the Bland-Altman plot of log-transformed total lignans intake from the LFFQ and the 24-h DRs, 97 individual between-method differences (94.2%) fell within the 95% limits of agreement. However, the Spearman rank correlation coefficient between the difference and the mean of log-transformed total lignans intake obtained from the 2 methods was r = 0.27 (P < 0.01). The differences between methods were therefore significantly correlated with the means of the 2 methods, which indicated proportional bias between the 2 methods for log-transformed lignans intake.
To reduce the proportional bias between the 2 methods, total lignans intake was adjusted for total energy intake. Total energy intake of each participant was calculated from the average daily consumption derived from the FFQs, which together comprised 126 food groups/items, and from the 24-h DRs, according to the Chinese Food Composition Table [36]. Energy-adjusted total lignans intake was expressed in µg/1000 kcal/day. To improve normality, energy-adjusted total lignans intake was also log-transformed before further analysis. As shown in Table 4, the median intake (IQR) of energy-adjusted total lignans after log transformation for LFFQ1, LFFQ2 and 24-h DRs were 3.08 (2.95-3.21), 3.04 (2.89-3.18) and 2.95 (2.80-3.05) µg/1000kcal/day, respectively. The median lignans intake did not differ between LFFQ1 and LFFQ2. Although median lignans intake from LFFQ2 remained significantly higher than that from the 24-h DRs (P < 0.001; Wilcoxon signed rank test), the difference was smaller than for unadjusted total lignans intake.
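The unit µg/1000 kcal/day suggests a nutrient-density style of energy adjustment. The sketch below works under that assumption, since the exact formula used in the study is not shown here. The base-10 logarithm is also an assumption (the text says only "log transformation"), and the function names and example values are hypothetical.

```python
# Hypothetical sketch of energy adjustment by the nutrient density method,
# followed by a log transformation. Formula and log base are assumptions.
import math


def energy_adjusted_density(lignans_ug_per_day, energy_kcal_per_day):
    """Lignans intake per 1000 kcal of energy intake (µg/1000 kcal/day)."""
    return lignans_ug_per_day / energy_kcal_per_day * 1000.0


def log_energy_adjusted(lignans_ug_per_day, energy_kcal_per_day):
    """Base-10 log of the energy-adjusted intake (assumed log base)."""
    density = energy_adjusted_density(lignans_ug_per_day, energy_kcal_per_day)
    return math.log10(density)


# Made-up example: 1500 µg/day of lignans at 1500 kcal/day.
example_density = energy_adjusted_density(1500.0, 1500.0)
example_log = log_energy_adjusted(1500.0, 1500.0)
```

In this made-up case the density is 1000 µg/1000 kcal/day, so the base-10 log value is 3.0.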
Furthermore, the Bland-Altman plot of log-transformed, energy-adjusted total lignans intake from LFFQ2 and the mean value derived from twelve 24-h DRs is shown in Figure 1. The 95% limits of agreement for individual differences in energy-adjusted total lignans intake between the 2 dietary methods were narrower than for unadjusted intake, and only 3 individuals (2.9%) lay outside the 95% limits of agreement. Although the Spearman rank correlation coefficient between the difference and the mean of energy-adjusted total lignans intake obtained from the 2 methods was r = 0.20 (P = 0.04), the differences between methods were less strongly correlated with the means than for unadjusted total lignans intake.
In this study, Pearson's correlation coefficient between LFFQ2 and the 24-h DRs for log-transformed, energy-adjusted lignans intake was 0.67 (P < 0.001), which indicated a significant correlation between lignans intakes estimated by the two methods (Figure 2). The ICC between lignans intakes assessed by LFFQ1 and LFFQ2 was 0.43 (P < 0.001) for log-transformed, energy-adjusted lignans intake, which showed a moderate level of reliability between the two LFFQs (Figure 2).
As shown in Table 5, when comparing log-transformed, energy-adjusted total lignans intake between LFFQ2 and the 24-h DRs, 42.7% of participants were classified into the same quartile, 56.3% into adjacent quartiles, and only 1.0% into extreme quartiles. Similarly, when comparing LFFQ1 and LFFQ2, more than 95% of participants were classified into either the same or adjacent quartiles, and only 4.9% were classified into extreme quartiles.

Discussion
This study examined the validity and reproducibility of a newly developed LFFQ for assessing the habitual total lignans intake among mid-life Hong Kong Chinese women with early-stage breast cancer.
In our study, the median intake of total lignans estimated by LFFQ1 was higher than that estimated by LFFQ2. This is consistent with a validation study of dietary nutrient and food intake by Shu et al. [41], who suggested that such a difference may be due to decreasing energy needs with increasing age. Another possible reason is that participants were more familiar with the questionnaire and could report foods and quantify portion sizes more clearly during the second interview [28]. Three studies [42][43][44] have also shown generally higher dietary nutrient or food intake from FFQ1 than from FFQ2, although the differences for the majority of food groups or nutrients were not significant. In contrast, the opposite finding was reported by Fornés et al. [45] when assessing dietary nutrient intake among Brazilian workers.
The median total lignans intake derived from the LFFQs was higher than that from the 24-h DRs, which indicated that dietary intake could be overestimated by the FFQ. Similar findings were reported in a validation study by Chan et al. [46] assessing soy intake, which also found that median dietary nutrient intake was overestimated by FFQ relative to 24-h DR. One possible reason is that the number of food items covered by the FFQ over a one-year reference period was larger than that captured by the 24-h DR [28]. Two other studies [43,47] have also described that overestimation of intake by FFQ may result from summarizing or averaging the foods consumed during a year, while the 24-h DR covers a shorter time interval and may thus cause less overestimation.
A correlation of 0.67 between lignans intake obtained by LFFQ and by 24-h DR was found; this result was similar to previous validation studies conducted in Western countries that assessed lignans intake by LFFQ using 24-h DR as the reference method [37,47,48]. The reproducibility of 0.43 between the two LFFQs administered at a 1-year interval in our study was also similar to other studies that examined the reproducibility of food consumption and nutrient intake between FFQs over the same 1-year period [28,41,43,49]; however, the correlation between LFFQs was not as high as in some previous studies [44,46], which may be due to the limited sample size of our study.
The Bland-Altman analysis of our study showed that the difference in log-transformed, energy-adjusted total lignans intake was less related to the mean of the two dietary methods: the Spearman rank correlation between the difference and the mean obtained from the two methods (LFFQ and 24-h DR) was quite low, and the 95% limits of agreement were narrower than for total lignans intake without adjustment for total energy intake. This suggests that it is useful to adjust dietary intake for total energy intake before formal analysis to improve normality and accuracy. This finding has also been reported in some previous studies [28,46].
Correct classification of individuals into groups is essential in epidemiological studies [50]. We evaluated the extent to which total lignans intake derived from the LFFQs and the 24-h DRs classified subjects into the same quartiles of the distribution. The percentage of participants classified into the same quartile was 42.7% for energy-adjusted lignans intake between the two methods (LFFQ and 24-h DR); similarly, 37.9% of participants were classified into the same quartile between the two LFFQs.
The percentage of participants classified into extreme quartiles ranged from 1.0% to 4.9%. This level of agreement indicates that the study LFFQ has a satisfactory ability to categorize subjects by lignans intake.
Our study had some limitations. First, our study did not incorporate the measurement of biomarkers.
Biomarkers of dietary intake can be used as reference instruments in validation studies, because biomarkers of diet do not rely on reporting by subjects and their measurement errors can generally be assumed to be uncorrelated with those of self-reported dietary assessment methods such as FFQ and 24-h DR [51]. In a previous study that investigated the relationship between dietary intake and urinary phytoestrogen among Mexican women, Chávez-Suárez et al. [52]  quantifying methods and the number of lignans precursors identified [25, 29-31, 33, 34, 53-56]. Moreover, the lignans values of some food groups/items in the LFFQ were adopted from similar foods because actual lignans contents were lacking in published databases; this could affect the evaluation of the LFFQ. Third, the LFFQ was designed to assess the subjects' habitual intake during the past year, whereas the 24-h DR reported the actual intake at different time points over the same 1-year period. For the 24-h DR, day-to-day variation in intake could occur because subjects may consume lignan-rich foods on one day but not on another, and they were interviewed by phone without prior notice. Additionally, for subjects with high lignans intake, an insufficient number of 24-h DRs and inaccuracy in the estimation of actual intake or portion size could lead to discrepancies between the intakes estimated by the 2 assessment methods; this suggests that within-subject bias might exist between methods when dietary intake is estimated at relatively high levels.
Despite the limitations stated above, our study is the first to validate an FFQ for assessing habitual lignans intake among Chinese breast cancer patients. This is important as it provides a fundamental tool for future epidemiological studies investigating the relationship between dietary lignans and chronic diseases among the Chinese population. Phytoestrogens such as isoflavones and lignans have been reported to have numerous effects on human health [57][58][59][60][61][62][63]; however, most studies were conducted in Western countries, and those conducted in Asian populations have reported only on isoflavones and not on lignans. Interest in lignans has been motivated by epidemiological research investigating the association between lignans and chronic diseases. For instance, some cohort studies have demonstrated that higher lignans intake was related to an 11%-29% risk reduction in CVD in Western populations [64][65][66][67]. It would also be of interest to investigate whether a similar association exists among Chinese subjects in future studies.

Conclusion
In this study, the newly developed LFFQ has been validated against 24-h DRs. It has been shown to have acceptable validity and reproducibility and thus could be regarded as a reasonable and useful instrument for assessing dietary lignans intake among Hong Kong Chinese women with early-stage breast cancer. Future studies of this cohort will be essential to investigate the relationship between lignans intake and breast cancer outcomes and other chronic diseases among Chinese survivors.

Availability of data and materials
The dataset supporting the conclusions of this article is included within the article.

Ethics approval and consent to participate
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. Written informed consent was obtained from all participants.

Consent for publication
Not applicable.

Competing interests
The authors declare that they have no competing interests.