Critical speed, D’ and pacing in swimming: Reliability of a popular critical speed protocol applied to all four strokes

Purpose: The validity of the critical speed (CS) concept has been investigated in front crawl swimmers using protocols involving multiple performance trials. The reliability and practical feasibility of CS protocols with strong face validity remain unknown in all four swimming strokes. This study aimed to assess the reliability and practical feasibility of a widely used CS protocol in all four strokes. Methods: Thirty-two national-level swimmers (butterfly n=7, 19 ± 2 years old; backstroke n=8, 18 ± 2 years old; breaststroke n=7, 18 ± 2 years old; front crawl n=10, 17 ± 2 years old) performed three 200-m and three 400-m performance trials in their specialist stroke over a three-week period. CS and supra-CS distance capacity (D’) were modelled from the linear relationship between distance and time. At the end of the three weeks, all swimmers were asked whether they felt they could or would want to complete an 800-m performance trial as part of a CS protocol. Results: CS derived from 200-m and 400-m performance trials is reliable (typical error ≤ 0.04 m·s⁻¹; coefficient of variation < 4% for all strokes) while D’ is not (typical error between 4 and 9 m; coefficient of variation 13-45%). Response rate to the follow-up questions was 100%. Only a few butterfly swimmers would want to (14%) or felt they could (29%) complete an 800-m performance trial, with more positive responses from breaststroke (57% and 71%), backstroke (75% and 100%), and front crawl swimmers (90% and 100%). Conclusion: Using 200-m and 400-m performance trials is a reliable and practical method for determining CS in backstroke, breaststroke and front crawl swimmers. Including an 800-m performance trial would not be practical with butterfly swimmers, would be challenging with breaststroke swimmers, but would be feasible with front crawl and backstroke swimmers.


Introduction
The concept of critical power (CP) was first used to model muscular fatigue of synergistic muscle groups [1]. The concept has since been applied and validated for whole-body exercise and linear sports. In linear sports where direct measurement of power is not feasible, a critical speed (CS) has instead been used to model fatigue. In this model, CS is defined mathematically as an intensity that could be maintained indefinitely, as represented by the asymptote of the hyperbolic speed-duration relationship.
Physiologically, CS would represent the upper boundary of the heavy exercise intensity domain, separating an exercise intensity at which physiological homeostasis can be maintained from an exercise intensity at which it cannot [2]. CS is attractive as a tool for the design of training sets and physiological profiling of athletes [2], due to its robust validity and ease of determination in the aquatic field [3]. The second parameter of a two-parameter CS model, supra-CS distance capacity (D'), is defined as the work that can be done above CS [4] and is represented by the curvature constant of this hyperbolic relationship. This curvilinear relationship can be transformed into a linear relationship by plotting distance against time (d-t), where CS is equal to the slope of the regression line and D' is equal to the y-intercept [2] (Figure 1).
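As an illustration of the linear distance-time model described above (d = CS × t + D'), the following minimal sketch computes both parameters from two performance trials. The function name and the trial times are hypothetical and are not taken from the study data.

```python
def critical_speed_two_trials(d1_m, t1_s, d2_m, t2_s):
    """Return (CS in m/s, D' in m) from two performance trials.

    With exactly two points the distance-time line passes through both,
    so the slope and intercept follow directly from the differences.
    """
    if t2_s == t1_s:
        raise ValueError("trial durations must differ")
    cs = (d2_m - d1_m) / (t2_s - t1_s)  # slope of d = CS * t + D'
    d_prime = d1_m - cs * t1_s          # y-intercept of the same line
    return cs, d_prime


# Hypothetical example: 200 m swum in 120 s and 400 m swum in 250 s
cs, d_prime = critical_speed_two_trials(200, 120.0, 400, 250.0)
print(f"CS = {cs:.3f} m/s, D' = {d_prime:.1f} m")  # CS = 1.538 m/s, D' = 15.4 m
```

Because only two points are used, the line fits them exactly and no standard error of the estimate can be obtained from the fit itself.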
Traditional CS protocols involve the completion of multiple performance trial efforts performed on separate days. For the most valid estimation of CS, it is recommended that 3-7 performance trial efforts lasting ~2 to ~15 minutes are used, allowing for V̇O2max to be reached in each trial, with at least 5 minutes difference in duration between the shortest and longest trials [5,6]. To apply the CS concept as an index of endurance performance in front crawl swimming, a range of distances between 200 m and 1,500 m has been recommended [7]. However, such criteria pose practical challenges in a high-performance sport environment, as completing more and longer trials causes further fatigue and disruption to an athlete's training schedule. To make CS estimation more practically feasible, some researchers and practitioners have used only two relatively short performance trial distances, commonly a 200-m and 400-m combination [8,9]. The reliability of this procedure is currently unknown; laboratory-based and field-based test-retest design studies suggest CP to be more reliable than W' [10-12]. With regard to validity, the 200-m and 400-m combination leads to lower CS estimates than if longer trials were inserted in the d-t model [13-15]. This is in part because of the relatively short trial durations but also because, in swimming, the exponential relationship between energy cost and swimming speed [15] affects the linearity of the d-t relationship. Including trials of ≥800 m will likely produce a slower CS with greater criterion validity [9,13,14] than performing shorter trials alone, but this comes at the cost of practical feasibility.
CS testing in 'form stroke' swimming (i.e. butterfly, backstroke or breaststroke) poses further unique challenges that may affect the reliability, validity and practical feasibility of CS estimation. Unlike front crawl swimmers, who compete in distances up to 1,500 m in the pool, form stroke swimmers only race distances up to 200 m. It is also reported that swimming coaches prescribe only 22-36% of a competitive form-stroke swimmer's training volume in their specialist stroke, with most training being front crawl [20].
Furthermore, the actual energetic cost of swimming form-strokes, especially simultaneous strokes (i.e. butterfly and breaststroke), is significantly greater than that of front crawl swimming across all speeds [16].
Together these factors may mean that form stroke swimmers, particularly simultaneous stroke swimmers, are less capable and/or less motivated to complete CS protocols that include longer performance trial efforts in their competitive stroke. To the author's knowledge, only one study has investigated the use of 'traditional' CS protocols in a form-stroke [17]. In that study, a CS protocol using 50-m, 300-m and 2,000-m performance trials in breaststroke swimmers was validated against a discontinuous Maximal Lactate Steady State (MLSS) protocol (5 x 400-m). The short rest periods between 400-m efforts at MLSS may have enhanced blood lactate removal, leading to the observation of steady state blood lactate concentrations at 100% but not 102% of estimated CS. Variability in front crawl swimmers' pacing profiles is known to increase as performance distance increases [18], and variability in performance appears to be exacerbated in form-strokes [19], with a greater coefficient of variation (CoV) of split times in the pacing of 200-m breaststroke, butterfly and backstroke than 200-m front crawl. Together these factors could contribute to lower reliability of pacing when swimming a form stroke, particularly over longer distance efforts, which could in turn affect performance and consequently CS/D' reliability [20]. However, at present, the actual reliability of CS/D' estimation from protocols performed in the field is unknown in any stroke.
The aim of this study was to assess the reliability of CS, D' and pacing profiles calculated using 200-m and 400-m performance trials for national standard butterfly, backstroke, breaststroke, and front crawl swimmers in their primary stroke. The study also aimed to identify the practical feasibility of implementing even longer distance efforts in a CS protocol. We hypothesised that (1) absolute and relative reliability of CS would be good (CoV ≤5%, ICC ≥0.75) in front crawl swimmers; (2) D' would not be a reliable parameter in any stroke; (3) simultaneous stroke swimmers would be less likely than front crawl swimmers to feel that they could or would want to complete an 800-m trial.

Methods
Thirty-two competitive national standard swimmers provided written informed consent to participate in this study, which was approved by the University of Brighton SASM Research Ethics Committee, with experimental procedures conducted in accordance with the Declaration of Helsinki, except for prior registration in a database. Participant characteristics are presented in Table 1. All participants met the selection criteria of being 16 years or older and having achieved a 2018 British Swimming Championships qualifying or consideration time. Over the course of three weeks, participants were required to perform three 200-m and three 400-m performance trials, all separated by at least 48 hours, with the order assigned using block randomisation (http://www.randomization.com). Trials were completed in each swimmer's specialist stroke; individual medley swimmers chose whether to swim butterfly, backstroke, breaststroke or front crawl. To minimise circadian variations in performance [22], participants performed all their performance trials at the same time of day. Performance trials were completed in a 25-m pool from a dive start following a standardised 1-km warm up. To maximise performance, participants were permitted to eat and drink as desired in the 24 hours prior to the first trial; this intake was then standardised across all subsequent performance trials. Participants provided a urine sample before each trial for the assessment of urine osmolality (Osmocheck; Vitech Scientific, Horsham, United Kingdom). No feedback was given during or immediately after the performance trials; each participant received summary reports of their performance trials once all testing had been completed.
Prior to testing, lane ropes were fixed using 5-mm stainless steel lane rope clamps (WRST-05; S3i Group, Doncaster, United Kingdom) and calibrated using Class III Accuracy 50-m measuring tape (Surveyors Tape; Draper Tools, Chandler's Ford, United Kingdom). Each swim was recorded on a video camera (HC-X1000; Panasonic, Osaka, Japan) with analysis of lap splits performed retrospectively using proprietary analysis software [23]. Height (cm) and weight (kg) were measured using a wall-mounted stadiometer (Harpenden Stadiometer; Holtain Limited, Crymych, United Kingdom) and flat scales (876; SECA, Hamburg, Germany), respectively, by a Level 1 International Society for the Advancement of Kinanthropometry certified practitioner.
Having completed all performance trials, participants were asked to respond via e-mail to two questions with either a "Yes" or "No" answer: (1) "Could you have completed a full 800m effort in your stroke?"; (2) "Would you want to swim a 200-m, 400-m and 800-m effort over three separate days in order to have a measure of your critical speed and D'?"

Statistics
The SPSS software package (version 24, SPSS, Chicago, IL) was used for statistical analysis, with data presented as means ± SD unless otherwise stated. Outliers and normality of distribution were examined using boxplots and the Shapiro-Wilk test, respectively. Outliers were winsorized to the next highest value prior to further analysis. Data that violated normality were retained and are reported in the results section. A one-way repeated measures ANOVA was used to assess differences between the three trials. Sphericity was checked using Mauchly's test; when the assumption of sphericity was violated, significance was examined using the Greenhouse-Geisser correction. Bonferroni correction was performed for all post-hoc analysis where the assumption of sphericity was not violated. Supplementary statistical data are presented in Appendix 1.
A published spreadsheet was used to calculate typical error of measurement (TEM), a measure of absolute test-retest reliability (Hopkins, 2007). TEM was divided by the trial mean and multiplied by 100 to calculate the coefficient of variation (CoV). Smallest detectable individual change (SDC_ind) and smallest detectable group change (SDC_group) values were calculated from mean TEM to 95% probability using equations 1 and 2, respectively. The smallest worthwhile change (SWC) was calculated by multiplying between-subject SD values by 0.2.
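The absolute reliability statistics described above can be sketched as follows. Equations 1 and 2 are not reproduced in this text, so the sketch assumes commonly used forms (SDC_ind = 1.96 × √2 × TEM and SDC_group = SDC_ind / √n). It also assumes that TEM is the SD of the paired differences divided by √2 and that SWC is the conventional 0.2 × between-subject SD. All function names and the example values are hypothetical and do not reproduce the published spreadsheet.

```python
import math
import statistics


def typical_error(trial_a, trial_b):
    """Typical error of measurement: SD of paired differences / sqrt(2)."""
    diffs = [b - a for a, b in zip(trial_a, trial_b)]
    return statistics.stdev(diffs) / math.sqrt(2)


def cov_percent(tem, trial_a, trial_b):
    """TEM expressed as a percentage of the mean of both trials."""
    return 100 * tem / statistics.mean(list(trial_a) + list(trial_b))


def sdc_individual(tem, z=1.96):
    """Assumed form of equation 1: z * sqrt(2) * TEM."""
    return z * math.sqrt(2) * tem


def sdc_group(tem, n, z=1.96):
    """Assumed form of equation 2: individual SDC / sqrt(n)."""
    return sdc_individual(tem, z) / math.sqrt(n)


def smallest_worthwhile_change(values, factor=0.2):
    """Assumed SWC: 0.2 x between-subject SD."""
    return factor * statistics.stdev(values)


# Hypothetical CS values (m/s) for five swimmers on two occasions
cs_trial_1 = [1.40, 1.45, 1.50, 1.38, 1.42]
cs_trial_2 = [1.42, 1.44, 1.53, 1.37, 1.45]

tem = typical_error(cs_trial_1, cs_trial_2)
print(f"TEM = {tem:.3f} m/s, CoV = {cov_percent(tem, cs_trial_1, cs_trial_2):.2f}%")
print(f"SDC_ind = {sdc_individual(tem):.3f} m/s, "
      f"SDC_group = {sdc_group(tem, len(cs_trial_1)):.3f} m/s")
print(f"SWC = {smallest_worthwhile_change(cs_trial_1):.3f} m/s")
```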
Relative reliability was assessed through intra-class correlation coefficient (ICC) estimates and their 95% confidence intervals, calculated based on a single-rater, absolute-agreement, two-way mixed-effects model (ICC 2,1) [24]. ICC values < 0.50 were considered indicative of poor reliability, 0.50-0.74 moderate reliability, 0.75-0.89 good reliability and ≥ 0.90 excellent reliability [24]. A Fisher's exact test was performed to assess responses to follow-up questions regarding the feasibility of an 800-m performance trial.
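For readers who wish to see how a single-measure, absolute-agreement ICC can be obtained, the sketch below computes it from two-way ANOVA mean squares using the commonly given formula (MSR − MSE) / (MSR + (k − 1)·MSE + k·(MSC − MSE)/n). It is an assumption that this matches the SPSS output used in the study. The function name and the example data are hypothetical, and confidence intervals are not computed.

```python
def icc_absolute_agreement_single(scores):
    """Single-measure, absolute-agreement ICC.

    scores: list of rows, one per swimmer, each holding k trial values.
    """
    n = len(scores)       # number of swimmers
    k = len(scores[0])    # number of repeated trials
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for swimmers (rows), trials (columns) and residual
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical CS values (m/s): four swimmers, three trials each
example = [
    [1.40, 1.42, 1.41],
    [1.50, 1.53, 1.51],
    [1.38, 1.37, 1.40],
    [1.45, 1.44, 1.47],
]
print(f"ICC = {icc_absolute_agreement_single(example):.2f}")
```

A systematic shift between trials lowers this coefficient even when swimmers keep their rank order, which is the reason for choosing absolute agreement rather than consistency.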

Performance trials
There were no differences in participants' urine osmolality across their three 200-m or 400-m performance trials (p > 0.05; Table 1). Performance trial data were normally distributed and contained no outliers. There were no differences in total time over the three 200-m or 400-m performance trials for any of the four strokes tested (p > 0.05; Appendix 1). Tables 2 and 3 present TEM and CoV for both performance trials.
There were no significant differences in CS or D' across trials 1, 2 and 3 for any of the strokes (p > 0.05; Appendix 1). Reliability measures for CS and D' are presented in Table 4 and Table 5, respectively. CS CoV was between 2.35% and 3.52% for all trials except butterfly trials 1-2 (0.95%). ICC analysis showed moderate to excellent relative reliability between CS calculated over the three sets of 200-m and 400-m performance trials (ICC ≥ 0.70). ICC analysis for D' revealed poor relative reliability for backstroke, breaststroke and front crawl swimmers (ICC ≤ 0.70), but good relative reliability for butterfly swimmers (ICC = 0.76).
Figure 2 shows the percentage of swimmers who believed that they "could have" and "would have wanted to" complete an 800-m performance trial in order to assess their CS. The proportions of swimmers who stated that they could (p = 0.001) or would want to (p = 0.013) complete an 800-m performance trial were influenced at the group level by stroke. A smaller proportion of butterfly swimmers (28.6%) stated that they could have completed an 800-m performance trial compared to backstroke (100%) and front crawl (100%) swimmers (p < 0.01). Butterfly swimmers (14.3%) were also less likely to say that they would want to complete an 800-m trial as part of a protocol to estimate their CS when compared to backstroke (75%) (p = 0.041) and front crawl (90%) (p = 0.004) swimmers.
There were no significant differences in the proportion of breaststroke swimmers who said that they could or would want to complete an 800-m performance trial when compared to swimmers of any other stroke (p > 0.05).
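The pairwise comparisons of "yes" proportions between strokes can be illustrated with a two-sided Fisher's exact test on a 2 × 2 table. The sketch below is a minimal standard-library implementation, not the SPSS procedure used in the study. The counts are inferred from the reported percentages and group sizes (for example, 28.6% of 7 butterfly swimmers is 2), which is an assumption, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x "yes" answers in the first row
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probability of every table no more likely than the observed one
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Rows: butterfly vs comparison stroke; columns: "yes" vs "no" (inferred counts)
p_could_fc = fisher_exact_two_sided(2, 5, 10, 0)  # could: butterfly vs front crawl
p_want_fc = fisher_exact_two_sided(1, 6, 9, 1)    # want: butterfly vs front crawl
p_want_bk = fisher_exact_two_sided(1, 6, 6, 2)    # want: butterfly vs backstroke
print(f"{p_could_fc:.3f} {p_want_fc:.3f} {p_want_bk:.3f}")  # 0.003 0.004 0.041
```

Under these inferred counts, the unadjusted p-values agree with those reported for the butterfly comparisons (p < 0.01, p = 0.004 and p = 0.041).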

Discussion
Reliability has been defined as "the amount of error that is deemed acceptable for the effective practical use of a measurement tool" [25]. It is to this end that this study attempted to define both the absolute (i.e. degree to which repeated measurements vary for individuals) and relative reliability (i.e. degree to which individuals maintain their position in a sample with repeated measurements).
CS/D' estimation in front crawl swimming has previously been assumed to be reliable because front crawl competition distance performance trials of 200 m to 800 m are reproducible [19,21]. This may be a false assumption, as combining errors from multiple performance trials can result in a greater total error when calculating CS/D'. This is most likely when only two performance trials are used to make the protocol more practical, as one 'bad' test will have a greater impact on the CS/D' result than if data from more trials were averaged.
To the author's knowledge, the reliability of 'form stroke' performance had only been examined up to competition distances of 200 m. The present study expanded on this, demonstrating that performances over 200 m and 400 m were highly reliable, with CoV below 2% and ICC ≥ 0.90 for all strokes. Mean TEM were typically larger over 400-m performance trials (≤ 4.97 s) than 200-m trials (≤ 2.64 s), but CoV were similar. The absolute reliability of pacing in this study was good and similar to that reported in previous research, with CoV of normalised velocity typically < 2% over each 50-m split [26]. It is still unknown how reliable 'form-stroke' performances over distances > 400 m would be, as may be advised for a scenario where maximising the validity of CS estimation is prioritised over the feasibility of completing longer trials.
As could therefore be expected, CS demonstrated very good but weaker absolute and relative reliability (CoV < 3.4%; ICC ≥ 0.70) than the 200-m and 400-m trials used for its calculation (Figure 1). CoV obtained in this study were also larger across all strokes (~2-3%) than in studies examining reliability of CS estimation based on single or repeated effort all-out CS protocols (~1%) [27,28]. Using only two performance trials did not allow for the calculation of any error in CS estimation in this study, as a perfect linear d-t relationship is the only possible outcome, and increased the potential effect of one 'bad test'. Despite this, CS calculated using 200-m and 400-m performance trials is still deemed sufficiently reliable to be used in practice for all swimming strokes in this study, as the low CoV and high ICC values evidence strong absolute and moderate to excellent relative reliability of CS in a test-retest scenario. Absolute and relative reliability of D' (CoV ~13-45%; ICC -0.14 to 0.76), however, are not good enough to be of practical use in any stroke. Despite poor absolute reliability, the D' of butterfly swimmers did show good relative reliability. This may well be a function of greater between-subject variability from this sample inflating relative within-subject variability.
SWC values indicate that a CS change of 0.01-0.02 m·s⁻¹ would be practically meaningful for performance in swimmers of all four strokes, while SDC_ind values indicate that 0.07-0.12 m·s⁻¹ would be required to identify a 'true' change in an individual. This means that swimmers could experience practically meaningful changes in their CS that would not be classified as a true change to a level of 95% confidence.
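The gap between the two thresholds can be made concrete with a short sketch that labels an individual's change in CS. The function name, the labels and the example change of 0.03 m·s⁻¹ are hypothetical; the threshold values are taken from the lower ends of the ranges reported above.

```python
def classify_cs_change(change, swc, sdc_ind):
    """Label an individual's change in CS (m/s) against SWC and SDC_ind."""
    magnitude = abs(change)
    if magnitude >= sdc_ind:
        return "detectable"  # exceeds measurement error at ~95% confidence
    if magnitude >= swc:
        return "meaningful but within measurement error"
    return "trivial"


# Hypothetical change of 0.03 m/s with SWC = 0.02 and SDC_ind = 0.07 m/s
print(classify_cs_change(0.03, swc=0.02, sdc_ind=0.07))
# meaningful but within measurement error
```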
Including a third, 800-m performance trial in the calculation of CS may reduce the level of random error and provide an error estimate in CS and D' calculations; however, such a decision would have to be traded off against the practicalities of integrating a third testing session into a swimmer's training plans. The present study shows that the inclusion of an 800-m effort is impractical for butterfly swimmers and likely challenging for breaststroke swimmers. The inclusion of a 300-m or 500-m performance trial may still enhance reliability. A threat to CS validity would, however, remain, since the longest trial duration would still be much shorter than the 15-minute recommendation [29]. It would be prudent for a coach to assess their individual swimmers' willingness to do a third performance trial of a given distance to ensure buy-in and long-term compliance with such a testing protocol. Two butterfly swimmers, for instance, commented to the lead researcher before their first trial that they were unsure of being able to finish a 400-m effort, while one butterfly swimmer, a national record holder over 50 m, replied "DEFINITELY NOT" when asked whether they would swim an 800-m effort.

Limitations
This study was conducted using national and international standard swimmers. Its results may not be directly applicable to amateur swimmers or professional triathletes. Although this study is one of few to have athletes of such a high standard participate, the low total number of athletes from each stroke may have led to Type II errors as a result of underpowered analysis. For this study to be practically feasible, athletes could train in the 24 hours prior to testing sessions but were asked to keep these sessions at a low intensity. Training during these sessions was not controlled by the experimenter.
The SWC values reported in this study will be more conservative than those relevant to an applied setting of national or international racing because their calculation includes data from a mix of participant genders, race distance specialisations and relative abilities. In an actual race, these factors would likely be more homogeneous and so the resulting SWC would be smaller. It is therefore suggested that a coach calculates the SWC that is most relevant to the individual swimmer they are working with where possible.

Practical Applications
The information provided in this study through TEM and SWC allows coaches and practitioners to make inferences about the likelihood of swimmers having made practically meaningful changes in CS (Appendix 4). Coaches and practitioners may also wish to refer to Appendix 5 for reliability data contextualised into units of s·100 m⁻¹, which will be more appropriate to use when working with individual athletes. The CS protocol examined in this study represents a practical method for assessing the aerobic capacity of national standard backstroke, breaststroke and front crawl swimmers. Importantly, this protocol can be conducted with minimal need for specialist equipment or expertise, making it highly practical in a swimming club setting. It is recommended that a coach or support staff member hoping to use such a protocol first ensure they understand the theory and get the buy-in of the swimmers they wish to use it with.

Conclusion
The individual 50-m performance splits that make up 200-m and 400-m performances demonstrate very good absolute reliability of pacing. A linear, two-parameter model using these two performance trials yields reliable CS but not D' in national level swimmers, for all four strokes. Coaches and practitioners need to recognise the need for a balance between optimising the validity and the practical feasibility of a CS protocol, and how this may differ across strokes. Including ≥800-m performance trials would not be practically feasible for butterfly swimmers and might be challenging with some breaststroke swimmers.
Figure: Pacing pattern data during performance trials 1 (solid line), 2 (dashed line) and 3 (dotted line), represented as mean ± standard deviation calculated from split times relative to mean velocity, and coefficient of variation from trials 1-2 (white bars), 2-3 (grey bars) and overall (black bars) for all strokes and distances.
Figure 2: Percentage of participants who responded "yes" when asked whether they thought (A) they could have completed an 800-m trial and (B) they would have wanted to complete an 800-m trial. * denotes difference from butterfly swimmers, # denotes overall difference across strokes (p<0.05).
