The travel diary is one of the most important and widespread methods for collecting data critical to transportation modelling and demand analysis. Over time, travel survey methods have evolved dramatically. Initially, travel surveys were conducted on paper, using pen-and-paper and mail-back forms (Stopher, 1992). As technology progressed, surveys were administered through computer-assisted telephone interviewing (CATI) (Axhausen et al., 2002). After the internet became widely available, surveys shifted to computer-assisted web interviewing (CAWI) (Adler et al., 2002). All these methods require respondents to recall their trips and record them in a self-reported travel diary. The latest travel diary technology uses global positioning systems (GPS) and cellular-based methods, with which travel diaries are logged passively while respondents conduct their daily activities (Murakami & Wagner, 1999; Chen et al., 2010; Kelly et al., 2013; Shen & Stopher, 2014; Li et al., 2023).
Household travel surveys using self-reported methods (e.g., CATI and CAWI) suffer from recall and proxy biases, both of which lead to under-reporting of travel demand in the dataset. Recall bias occurs when survey participants forget trips they have made (Stopher & Greaves, 2007). Proxy bias arises in household travel surveys that collect responses from a single household member on behalf of the others. This method assumes that the respondent fully knows the travel activities of all household members, which is often unrealistic, and it results in the under-reporting of trips made by other household members (Morency, 2015; Weir et al., 2011). In contrast, GPS-based diary collection methods are not subject to recall and proxy biases because they collect travel behaviour data passively (Stopher & Greaves, 2007).
Thus, this study will investigate recall and proxy biases in self-reported travel surveys and propose correction procedures. The investigation and correction will be conducted under the core-satellite paradigm of urban passenger travel surveys (Goulias et al., 2013; Habib et al., 2018). The core-satellite paradigm treats a large-scale household travel survey as the core dataset and uses satellite surveys targeting specialized purposes to supplement the core. The study uses the Transportation Tomorrow Survey (TTS), a regional household travel survey collected in the Greater Toronto and Hamilton Area (GTHA), as the core dataset (Transportation Tomorrow Survey, 2024). The Google Timeline Travel Survey (GTTS), a GPS-based travel survey in the GTHA (Li et al., 2023), will be used as the satellite survey. Data collected from the self-reported TTS will be compared with the GTTS. Then, a correction procedure will be developed that identifies biases in the trip rates of the core dataset and adjusts them using variables common to both datasets.
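As a minimal illustration of the kind of adjustment described above, the following Python sketch computes mean trip rates by stratum in a core and a satellite sample and scales the core trip counts by the ratio of the two. It is an assumed, simplified ratio adjustment and not the correction procedure developed in this study. The field names (`age_group`, `trips`), the function names, and the example records are hypothetical.

```python
from collections import defaultdict


def mean_trip_rates(records, key):
    """Return the mean number of trips per person for each stratum of `key`."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        totals[rec[key]] += rec["trips"]
        counts[rec[key]] += 1
    return {stratum: totals[stratum] / counts[stratum] for stratum in totals}


def adjustment_factors(core, satellite, key):
    """Ratio of satellite to core mean trip rate for each core stratum.

    Strata that are missing from the satellite sample, or that have a zero
    core trip rate, are left unadjusted (factor 1.0). This fallback is an
    assumption made for the sketch.
    """
    core_rates = mean_trip_rates(core, key)
    satellite_rates = mean_trip_rates(satellite, key)
    factors = {}
    for stratum, core_rate in core_rates.items():
        if stratum in satellite_rates and core_rate > 0:
            factors[stratum] = satellite_rates[stratum] / core_rate
        else:
            factors[stratum] = 1.0
    return factors


def apply_factors(core, factors, key):
    """Return copies of the core records with an added `adjusted_trips` field."""
    return [
        {**rec, "adjusted_trips": rec["trips"] * factors.get(rec[key], 1.0)}
        for rec in core
    ]


# Hypothetical records: one row per person, stratified by a common variable.
core_sample = [
    {"age_group": "18-34", "trips": 2},
    {"age_group": "18-34", "trips": 2},
    {"age_group": "35-64", "trips": 3},
    {"age_group": "35-64", "trips": 1},
]
satellite_sample = [
    {"age_group": "18-34", "trips": 3},
    {"age_group": "18-34", "trips": 3},
    {"age_group": "35-64", "trips": 2},
    {"age_group": "35-64", "trips": 3},
]

factors = adjustment_factors(core_sample, satellite_sample, "age_group")
adjusted_core = apply_factors(core_sample, factors, "age_group")
```

In this sketch a factor above 1 indicates that the self-reported core sample records fewer trips per person than the passively logged satellite sample in that stratum, which is the pattern expected under recall and proxy bias.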
The remainder of the paper is organized as follows. Section 2 reviews the literature on survey biases. Section 3 reports descriptive statistics for the datasets used. Section 4 presents the results of comparing the core and satellite surveys. Section 5 presents the procedure for correcting the core survey based on the results of the empirical analysis. Finally, Section 6 concludes the study by stating key findings and plans for future work.