Robust segmented regression: application to oxygen uptake plateau identification

Recently, segmented regression has been utilized as a “working” model for a bootstrap test to detect true oxygen uptake plateau. This approach employs an iterative procedure based on least squares to fit the model. However, it is widely acknowledged that the least squares approach is highly sensitive to outliers, often yielding inefficient estimates. This paper proposes an alternative iterative method by substituting the least squares step with a M-estimator step. Leveraging the robust features of M-estimators, the proposed method is expected to exhibit resilience against outliers. To empirically investigate the performance of the proposed method, a Monte Carlo simulation study is conducted. For comparison purposes, the classical iterative method is also considered. The results indicate that, in uncontaminated scenarios, both methods exhibit similar behavior, with the classical method presenting a slightly superior performance. However, in contaminated cases, the classical method is highly deteriorated, with significant bias and root mean squared error, while the proposed method demonstrates much better performance. Both methods are employed to fit the “working” 3-segment regression model and perform a plateau test on apparently contaminated oxygen uptake data. The results reveal that, due to the presence of outliers, the classical method produces inflated estimates for the error variance and larger standard errors for all parameters, in comparison to the proposed method. Moreover, the plateau test conducted using the classical (robust) method leads to the rejection (confirmation) of a plateau of oxygen uptake for the same data. Overall, these findings highlight the superiority of the proposed method in handling outliers.


Introduction
In exercise physiology, oxygen uptake ( VO 2 ) is the measurement of the pulmo- nary volume of oxygen consumed per unit of time to produce adenosine triphosphate, the energy source for muscle activities.The maximum VO 2 ( VO 2max ) is defined as the highest VO 2 a given individual can achieve in an exercise despite the increment in the power or velocity.VO 2max represents the integrated capacity of the pulmonary, cardiovascular, and muscular systems to uptake, transport, and utilize oxygen, respectively (Poole et al. 2008).Knowing VO 2max is important as it indicates the level of cardiorespiratory fitness.Even more, a proper assessment of the VO 2max occurrence or not could improve further studies about the physi- ological limitation of the VO 2 and possibly the training program prescription.In laboratory settings, VO 2max is commonly measured at the lung level during incre- mental exercise to exhaustion on a treadmill (or cycle ergometer), where expired air is analyzed by gas analyzers (Astorino et al. 2005).
Specifically, in ramp test protocols that start from a resting phase, VO 2 is eval- uated by gradually increasing the exercise load (treadmill speed) at a constant rate in equidistant time intervals (t), and the observations are collected breath-bybreath.According to Rossiter (2010) and Keir et al. (2018), usually, the t × VO 2 dynamics produced in this protocol (within moderate domain) initially exhibit an exponential behavior (faster VO 2 kinetics) due to the transition from resting to exercise (Grassi 2001).In response to subsequent increments in exercise load, a second exponential increase in VO 2 occurs (slow component of VO 2 within the heavy domain) (Xu and Rhodes 1999), followed by a linear increase until the end of the test.However, sometimes a plateau of VO 2 ( VO 2plateau ) may occur, indicat- ing a limitation in the body's ability to further increase oxygen uptake.This plateau is characterized by a "constant" VO 2 despite an increase in exercise load.In other words, VO 2plateau can be defined as a flattening in the relationship between VO 2 -load (power or velocity).In literature, the occurrence of VO 2plateau is com- monly used to confirm the attainment of VO 2max (Astorino et al. 2005).Figure 1 presents the VO 2 dynamics from a single runner.The left panel (Fig. 1a) displays the breath-by-breath data, while the right panel (Fig. 1b) shows a 30-s moving average.
Figure 1a clearly shows the presence of at least two outlier observations.From this figure, it is not clear whether a VO 2plateau is present.The moving average pre- sented in Fig. 1b helps reduce noise and appears to indicate a subtle VO 2plateau at the end of the test.However, visually, this could be purely speculative.Therefore, data treatment has direct influence over VO 2plateau occurrence and type I and II errors.Even more, the threshold of VO 2 < 2.1 mL/kg/min generally used and accepted in literature to declare a plateau is subjective and does consider individual aspects of the expected relationship between VO 2 -load.

3
Environmental and Ecological Statistics (2023) 30:625-648 Identifying VO 2plateau is crucial in the field of Exercise Physiology, and many studies have proposed different methods for this purpose.Recently, Cândido et al. (2019) and Patricio et al. (2021) have utilized segmented regression as an approximation to the nonlinear behavior of VO 2 dynamics and proposed a procedure for identifying VO 2plateau based on this "working" model.The segmented model is a piecewise linear continuous model, and when the changepoints are unknown, the log-likelihood becomes piecewise differentiable, leading to violations of standard regularity conditions (Feder et al. 1975;Feder 1975, for example).Moreover, the changepoints are nonlinear parameters, and standard maximization procedures can be highly unstable.To overcome these limitations, Muggeo (2003) proposed an iterative procedure to estimate the parameters of the segmented model.Some studies have extended this method to other scenarios (Muggeo 2008;Muggeo et al. 2014, for example).
However, Fig. 1 displays some outlier observations that can have detrimental effects on these classical procedures.This issue motivates us to propose and investigate an estimation method for the segmented regression model that is robust against outlier observations.In this context, successful robust methods could enhance the procedures outlined in Cândido et al. (2019) and Patricio et al. (2021).
Robust methods have been extensively investigated and applied in various contexts, including regression (Rousseeuw et al. 2004;Serneels et al. 2005) and time series analysis (Sarnaglia et al. 2010(Sarnaglia et al. , 2021b;;Solci et al. 2020).In the specific case of segmented regression, robust techniques have also been explored.For instance, Zhang and Li (2017) and Shi et al. (2020) introduced rank-based estimation M-estimators, in particular, are widely known for their robustness against outliers and have been successfully applied in various scenarios (e.g., Fajardo et al. 2018;Sarnaglia et al. 2021b).In essence, M-estimators draw inspiration from, but are not limited to, maximum likelihood estimators (MLE) based on non-Gaussian distributions, often heavy-tailed.They are characterized by replacing the quadratic loss function associated with the least squares method and the Gaussian MLE with alternative loss functions that assign lower weights to gross errors.
In literature, the method introduced by Muggeo ( 2003) is widely used to fit segmented regression.It involves employing a Taylor expansion to construct auxiliary covariates and approximating the segmented regression model through a standard multiple linear regression model.Essentially, the method can be outlined as follows: (1) start with initial change-point estimates; (2) in iteration s ≥ 0 , compute the "working" covariates depending on the current change-point estimates; (3) fit the auxiliary multiple linear regression using least squares estimator; (4) use the coefficient estimates to update the change-point estimates; (5) stop the procedure if a convergence criterion is attained, otherwise make s = s + 1 and return to step (2).In this paper, we propose an alternative version of the iterative method introduced by Muggeo (2003) to enhance the robustness of segmented regression.Specifically, we replace the least squares estimator (Step (3)) in the previous procedure with an M-estimator, harnessing the advantageous properties of robust estimation.More details are given in Algorithms 1 and 2, along with relevant discussions later in the paper.
The efficiency and robustness of the proposed method are evaluated through an extensive Monte Carlo simulation study.The usefulness of the method is illustrated through an application to VO 2 data.The simulation results demonstrate that the robust methodology provides a viable alternative for parameter estimation in the model, even when dealing with small sample sizes.The estimations and computations were written in R version 4.3.1 (R Core Team 2023) by the authors and are available upon request.
Particularly in contaminated scenarios, the robust methodology outperforms traditional approaches in terms of bias and root mean squared error (RMSE).The empirical results suggest that Muggeo's estimator is not suitable for scenarios where the error structure includes heavy-tailed distributions.The application reveals that the presence outliers may impact the conclusion about the occurrence of a VO 2plateau .
The usefulness of the method proposed in this paper is not limited to VO 2 data, since segmented regression finds applications in a wide range of fields, particularly in the modeling of environment and ecological data.For instance, in recent studies, Sarnaglia et al. (2021a) employed segmented regression to estimate the critical threshold of air pollution to affect infant hospital admissions and Knipfer et al. (2020) utilized segmented regression to predict stomatal closure and turgor loss in woody plants.In both instances, the presence 1 3 Environmental and Ecological Statistics (2023) 30:625-648 of discrepant observations may deteriorate inferential process.Consequently, employing robust segmented regression presents itself as an appealing alternative.
The rest of the paper is structured as follows: in Sect.2, we recall the segmented regression model and the iterative method of Muggeo (2003), we also introduce the proposed robust method; in Sect.3, we present the results of the Monte Carlo simulation study; in Sect.4, we discuss the application to VO 2 data; in Sect.5, we give the final remarks.

Segmented regression
In multiple linear regression, one tries to account for a linear effect of a set of explanatory variables (or covariates) ⃗ X � = (X 1 , X 2 , … , X p ) on a response variable Y.The infer- ence is performed by observing the sample (Y i , ⃗ X � i ) , i = 1, … , n , and fitting the model where 0 denotes the intercept, � = ( 1 , … , p ) represents the coefficient vector and i is the random error term.One of the classic assumptions is that i ∼ N(0, 2 ) .Usually, estimation of model in Eq. 1 is performed by (Gaussian) maximum likelihood.
It is unquestionable the importance of multiple linear regression with applications broadly found in literature.However, in some cases, linearity is an unfeasible restriction, that is, the true effect of explanatory variables on response variable is nonlinear.In these scenarios, model in Eq. 1 becomes inappropriate.
For simplicity, without loss of generality, from now on, we shall consider only one covariate, Z, with nonlinear effect on the response variable.Now, for the sample , n , the model can be written as where is the coefficient describing the linear impact of Z on Y and h(Z, ) , � = ( 1 , … , r ) is a coefficient vector corresponding to the r-valued function describing the nonlinear effects of Z on Y, which is governed by a parameter vector � = ( 1 , … , q ) .Depending on the form of h, estimation of model in Eq. 2 may rely on unstable nonlinear optimization methods.
One particular nonlinear behavior is observed when the effect of one or more covariates are only piecewise linear, that is, the linear relation changes according to the covariate value.In this paper, we consider the piecewise linear relation continuous at the change points.This may be obtained by letting r = q and (1) (2) where (z) + = z1(z > 0) .This results in the segmented regression model with q changepoints (or q + 1 segments) given by where, now, the k represents the kth changepoint and, in this case, the slope of the 1st segment is given by , the slope of the 2nd segment is given by + 1 , the slope of the 3rd segment is given by + 1 + 2 and so on.Therefore, k represents the difference between slopes of segments k + 1 and k.See in Fig. 2 one illustration of this behavior with one segmented covariate with two changepoints and no linear covariate.More segmented covariates may be considered in the model by simply adding in Eq. 4 more continuous piecewise linear terms (Eq. 3) for other explanatory variables.
Maximum likelihood estimation of the segmented model in Eq. 4 is quite difficult since it consists of a nonlinear optimization problem, which requires potential unstable numerical procedures.The seminal work in Muggeo (2003) introduces an iterative method to circumvent this problem.The method consists in approximating each element of the continuous piecewise linear term h(Z, ) (Eq. 3).Specifically, by noticing that  z (z) + = 1(z > 0) , we conclude that the first order Taylor approxima- tion of the kth element of h around * k is given by where * k is a known value, preferred close to k .The approximated relation in Eq. 5 is used in Eq. 4, providing the approximated model where Since * k is fixed, the approximated model in Eq. 6 is a multiple linear regression model (Eq. 1) and may be estimated by standard maximum likelihood, for example.
The iterative estimation procedure starts with an initial guess to k , say φ(0) k , and, in the iteration s ≥ 0 , to proceed as follows: (1) create the working covariates (2) estimate the approximated model in Eq. 6 with U (s)  ik and V (s) ik replacing U * ik and V * ik , respectively; (3) use the estimated coefficients to update the k estimate by ( 4) take s = s + 1 ; (5) if no convergence criterion is achieved return to step (1), oth- erwise stop the iterative process and save the estimates.
Many convergence criteria may be chosen.For example, one may opt for declaring convergence when the maximum update in the changepoints satisfies and/or the minimum p-value for γ (s) k satisfies for some appropriate thresholds and * .A pseudocode to perform the iterative estimation method is presented in Algorithm 1. ( 6) Another problem with maximum likelihood is that the log-likelihood is only piecewise differentiable, which violates standard assumption of maximum likelihood estimation.This may lead to the hessian to provide unreliable standard errors estimates.One can solve this problem by using the iterative procedure described above in a bootstrap framework, for example.
As an important remark, for the VO 2plateau identification, the strategies proposed in Cândido et al. (2019), Patricio et al. (2021) considers two segmented models: one with two segments (describing absence of plateau); and one with three segments and third slope constrained as zero (describing the occurrence of a plateau).The estimation method presented above can not be directly applied to fit the constrained 3-segment model.For a segmented model with last slope constrained as zero, we have Therefore, the approximated model in Eq. 6 would be (8) Environmental and Ecological Statistics (2023) 30:625-648 where The iterative procedure is similar, but now, we must fit the approximated constrained model (Eq.8) and update the changepoints by where we define β(s) It is well-known that discrepant observations may have a high influence on inference.Therefore, robust estimators have been studied in literature to prevent this potential deleterious effect.In the next subsection, we briefly discuss the class of M-estimators for regression, or M-regression.

M-estimators
Before we introduce the robust method proposed in this paper, we will revisit some of aspects of M-estimators.The M-estimators are known for being flexible and for being able to be applied in complex problems (Stuart 2011).In particular, M-estimators can be considered as a robust alternative to Gaussian maximum likelihood estimators in many frameworks, such as the linear regression model, for example.In this paper, the linear regression model fitted by M-estimators might be referred to as M-regression.
In order to introduce the M-regression concept, let us first revisit the Least Squares Estimator (LSE) for linear regression model.Consider the model in Eq. 1, let � = ( 0 , � ) be the parameter vector and ⃗ T � i = (1, ⃗ X � i ) be the intercept/covariates vector, the LSE of is defined as By differentiating LSS( ) wrt , the solution of the minimization above is turned in to the problem of finding the solution to the systems of linear equations In the context of segmented regres- sion, for the approximated model in Eq. 6, we have that ⃗ As can be seen, the procedure in Eq. 9 consists in minimizing a quadratic loss e 2 , specifically It is well-known that the quadratic loss assigns too much weight to extreme values.Thus, in the presence of discrepant observations, LSS (or equivalently Gaussian likelihood) usually is highly influenced.
In a few words, the M-estimator consists in replacing the quadratic loss e 2 by other loss function (e) .In practice, we seek robustness by choosing (e) which assigns lower weights for great values than the quadratic loss.Once the loss (e) is chosen, the M-estimator for the regression is defined as Formally, it is standard to assume that the loss function (or -function) denotes a continuous function (e) such that: (e) ≥ (e * ) , if |e| > |e * | ; and (0) = 0 .Usu- ally, but not necessarily, it is also required symmetry of -functions, such that (−e) = (e) .Similarly to the least squares, when the -function is also differenti- able and (e) = e , we may replace the optimization in Eq. 10 by the problem of solving the system where the weights are The M-estimators might be defined from the M-estimating equations (Eq.11) by choosing some function (e) , referred to as -function, which need not necessar- ily be the derivative of any -function.The -function must satisfy: (|e|) > 0 and (−|e|) < 0 ; and (0) = 0 .Usually, but not necessarily, it is also required that (−e) = − (e) , that is an odd -function.If (e) = e for some -function, the last assumption is equivalent to symmetry of (e).
If the weights W 1 , … , W n were known, Eq. 11 would have the solution (11)

3
Environmental and Ecological Statistics (2023) 30:625-648 Unfortunately, even if we assume known, the weights W 1 , … , W n still depend on , which prevents the use of the exact solution in Eq. 13.In order to circumvent this problem, one usually relies on the Iteratively Reweighted Least Squares (IRLS).Basically, by assuming a known value for , the IRLS procedure starts with an initial guess θ(0) and, in iteration s ≥ 0 , we proceed as follows: (1) compute the weights W (s)  i by replacing by θ(s) in Eq. 12; (2) compute the estimate θ(s+1) by replacing W i by W (s)  i in Eq. 13; (3) take s = s + 1 ; (4) if no convergence criterion is achieved, return to step (1), otherwise stop the IRLS.
Of course, since the scale parameter is unknown, it must be estimated.Joint estimation of and the scale parameter by the procedure in Eq. 10 may be unstable and would require high computational cost.A common alternative is to use the residual Median Absolute Deviation (MAD), where the residuals are obtained from a preliminary estimate, for instance the least absolute deviation regression (Dodge 2008, pp. 299-302, for example).Let θ * be the preliminary estimate, define the preliminary residuals as .For Gaussian data, = 1 Φ −1 (3∕4) ≈ 1.4826 , where Φ −1 (⋅) represents the reciprocal of the quantile func- tion.Usually, we perform the IRLS using θ(0) as the preliminary estimate θ * to com- pute the MAD estimate σ , which subsequently replaces in computation of the the weights W 1 , … , W n in all iterations of IRLS.Another alternative is to update the MAD estimator of by incorporating the updated residuals in each iteration of IRLS.More details on M-regression might be found in Maronna et al. (2019), for example.

Robust segmented regression
We can now describe the robust method proposed in this paper.Here, we suggest using M-estimators to achieve robustness in fitting the segmented regression model.The proposed procedure is essentially the same as described in Subsect.2.1, with the linear regression estimation step being replaced with an M-regression step.
Specifically, we suggest to start with an initial guess to k , say φ(0) k , and, in the iteration s ≥ 0 , to proceed as follows: (1) create the working covariates in Eq. 7; (2) use M-estimators to fit the approximated model in Eq. 6 with U (s)  ik and V (s) ik replacing U * ik and V * ik , respectively; (3) use the estimated coefficients to update the k estimate by (4) take s = s + 1 ; (5) if no convergence criterion is achieved return to step (1), oth- erwise stop the iterative process and save the estimates.A pseudocode to perform the M-estimation of the segmented regression model is presented in Algorithm 2.
In the Step 6 of Algorithm 2, we suggest to use the IRLS method previously described in this section.It is important to emphasize that the IRLS is also an iterative procedure that has its own convergence criteria.
The choice of the -function or the -function can have an impact on the robustness and relative efficiency in Gaussian scenarios.Popular options include the Huber (Huber 1992), Hampel (Hampel et al. 1986), and Tukey's Biweight (Beaton and Tukey 1974) functions.The Huber -function is unbounded, which means it only attenuates the effect of outliers, allowing gross errors to still have a substantial impact on the estimates.On the other hand, the Hampel and Bisquare Environmental and Ecological Statistics (2023) 30:625-648 -functions are bounded, completely ignoring large errors.This makes the latter two M-estimators more robust against large discrepant observations.The boundness of the -function implies non-monotonicity of the associated -function, if it exists.Specifically, in this case, we have (e) → 0 as |e| → ∞ .These and functions lead to what are known as redescending M-estimators.While redescending M-estimators have better robustness properties, the M-estimating equations (Eq.11) may have multiple zeroes, indicating local minima in Eq. 10.Therefore, when using redescending M-estimators, it is recommended to start with a good initial guess for the estimates, typically the least absolute deviation is sufficient.
In this paper, the standard errors of the M-estimators will be accessed via Bootstrap.The procedure is structured as follows: (1) Fit the segmented regression model to the original data using one M-estimator; (2) obtain the residuals for the original fit; (3) draw a random sample with replacement from the residuals obtained in (2); (4) generate a bootstrap resample from the response variable using Eq. 4 with the original covariates, and the parameters replaced by the estimates obtained from the original data and the errors replaced by the bootstrap residuals drawn in (3); (5) obtain bootstrap estimates by fitting the segmented regression model to the bootstrap resample with the same M-estimator; (6) store the bootstrap M-estimates and return to (3) until achieve the required size for the bootstrap distribution for the M-estimators; (7) use the bootstrap distribution to compute estimates of the standard errors.

Empirical study
In this section, we present the results of Monte Carlo experiments conducted to evaluate the performance of the proposed estimation procedure in both contaminated and uncontaminated scenarios.We also compare the performance of the suggested method with Muggeo's method (2003) in these scenarios.We consider four different error structures: (1) ∼ N(0, 1) ; (2) ∼ t (4) (Student's t-distribution with four degrees of freedom); (3) ∼ 0.9 N(0, 1) + 0.1 t (4) ; (4) ∼ 0.9 N(0, 1) + 0.1 Cauchy(0, 1) .For each experiment, we conduct 1000 replications with sample sizes of n = 400, 600 , and 800.We consider q = 2 change points, and the regression coefficients are set as follows: ( 0 , , 1 , 2 , 1 , 2 ) � = (11, 25, −21, −4, 1.0, 12 × p) � , where p = 0.8 and 0.9.We have considered a three-segment model with the last slope fixed at zero, representing a model with a plateau.It is important to note that the specific parameter values do not affect the technical relevance of the proposed method.We have explored different values, and the conclusions remain consistent.These parameter values are commonly observed in Exercise Physiology scenarios [e.g., Patricio et al. (2021)].M-estimator methodology is considered a reasonable alternative to reduce the effect caused by the existence of extreme values in the data set.We consider the following loss functions: All codes are written and executed in the R language (R Core Team 2023) with some additional packages.
Tables 1, 2, 3, 4, 5, and 6 present the empirical results of the Monte Carlo experiments conducted for the four scenarios considered in this study.Tables 1,  2, and 3 display the results for p = 0.8 , while Tables 4, 5, and 6 display the results for p = 0.9 .Each table presents the bias and root mean square error (RMSE) of the parameter estimates for the different sample sizes investigated.
In the uncontaminated scenario (Case 1), Muggeo's estimator demonstrates excellent performance and is among the best estimators for parameter estimation in segmented regression models without contamination.However, even in this scenario, the robust methodology proposed in this study proves to be a viable alternative for parameter estimation, even with small sample sizes.
In the contaminated scenarios (Cases 2-4), Muggeo's estimator exhibits a substantial increase in the RMSE of the parameter estimates, indicating that it is not suitable for these types of scenarios.This behavior persists across all sample sizes examined.Therefore, based on the empirical results, it can be concluded that Muggeo's estimator is not appropriate when the error structure includes heavy-tailed distributions.
The results presented in Tables 4, 5, and 6 consistently support the same conclusions, indicating no significant changes in the performance of the estimator when considering different change points to achieve the plateau effect.In contaminated scenarios, the robust methodology stands out, demonstrating lower bias and RMSE values.
Environmental and Ecological Statistics (2023) 30:625-648 As anticipated, the empirical results show a slight decrease in the contamination effect on the parameter estimates of the model as the sample size increases.It is important to note that although larger sample sizes lead to apparent reductions in the contamination effect, it does not eliminate the detrimental impact entirely.However, the performance of the M-estimator improves significantly with larger sample sizes.It is worth mentioning that in the empirical experiments, the parameter estimates 1 and 2 were found to be less sensitive to contamination across all estimators considered.
One advantage of M-estimators lies in their flexibility to choose the loss function, allowing for better adaptation in contaminated scenarios.The results did not reveal significant differences in the performance of the M-estimator when employing different loss functions.In other words, the M-estimator demonstrated similar performance across various scenarios, regardless of the specific loss function used.However, the Hampel loss function exhibited a slight advantage by generally yielding lower RMSE values in contaminated scenarios.These empirical findings underscore

Application
In order to illustrate the usefulness of the proposed method, we fit a segmented regression to the VO 2 data previously displayed in Fig. 1.Following Cândido et al.  (2019) and Patricio et al. (2021), we consider the constrained 3-segment "working" model where we have imposed the constrain 2 = −( + 1 ) .The model above was fit- ted using Muggeo's method and the proposed robust methodology with the Huber, Hampel and Tukey losses.The standard errors were accessed through the bootstrap procedure described at the end of Subsect.2.2 with 1999 bootstrap resamples.The results are presented in Table 7.
As can be seen, in terms of point estimates, the atypical observations have more prominent effect on error variance 2 , which is quite inflated for Muggeo's method.The other parameter estimates did not present such a dramatic difference, however we note that the 2 estimate was increased for Muggeo's method, which may be view as less evidence in favor of a plateau.The small impact on other parameter estimates might be explained by the position of the outliers at the original sample.On the other hand, the outliers inflate substantially the standard error for all parameters when estimated by Muggeo's method.This sensitivity might be explained by the fact that the outliers at the original sample lead to large residuals, which could generate outliers at the bootstrap resamples in different positions, these few outlying observations at the bootstrap resamples are not "ignored" by Muggeo's method, increasing uncertainty of estimates.In Fig. 3, we present the fitted models using all methods.As can be seen, at least visually, the fitted models are very similar, which is expected due to the findings in Table 7.We have applied Shapiro-Wilk's and Jarque-Bera's normality tests to the residuals and both tests reject normality for residuals produced by all methods.This may be due to the presence of large residuals corresponding to the outliers and a little left skewness of the residual distribution.However, we are confident the non-normality should not impact SE's presented in Table 7, since they were accessed through non-parametric bootstrap.
In order to investigate if the atypical observations would impact the decision about the presence of a plateau of VO 2 , we applied the bootstrap plateau test pro- posed in Patricio et al. (2021) with 1999 bootstrap resamples, using Muggeo's and the proposed method (with Huber, Hampel and Tukey losses) to estimate the "working" model at the original and bootstrap data.The p-values of the tests are given in Table 8.
As can be seen in Table 8, the discrepant observations affect the conclusion of the test.At a 5% significance level, for Muggeo's method the result shows that there is no evidence to declare a VO 2plateau , whilst for the robust method with all considered loss functions, the results show that there is evidence of a VO 2plateau .This contradiction caused by the outliers shows the importance of considering robust methods to avoid its deleterious effects.
It is worth to point out that we have performed the same analysis for other VO 2 data with no evident outliers.As expected from the simulation results, the estimates, the standard errors and the p-values of plateau tests present similar values for both Muggeo's and robust methods.

Final remarks
The study is motivated by an application on Exercise Physiology, specifically in the identification of occurrence or not of VO 2plateau .Knowing whether VO 2plateau has occurred is critical to diagnosing the weakest physiological system (i.e., limiting system) in an individual.Observing a VO 2plateau indicates that the cardiocirculatory system (central aspects) restricts human performance and quality of life.Moreover, when VO 2plateau is absent the human performance and functional movements are limited by peripheral aspects (neuromuscular system).Therefore, exercise training could be prescribed according to the physiological limitations system to minimize the problem detected.Many studies have proposed different methods for the identification of occurrence or not of VO 2plateau .Recently, Cândido et al. (2019) and Patricio et al. (2021) have utilized segmented regression as an approximation to the nonlinear behavior of VO 2 dynamics and proposed a procedure for identifying VO 2plateau .
This paper proposes a robust estimation approach for segmented regression models.The methodology is based on M-estimators.The performance of the proposed estimator is accessed through a Monte Carlo simulation study and, for comparison purposes, the iterative method proposed by Muggeo (2003) was also considered.
The empirical results of the Monte Carlo experiments provide valuable insights into the performance of different estimators in both contaminated and uncontaminated scenarios of the segmented regression model.In uncontaminated  scenarios, Muggeo's estimator stands out as one of the best performing methods, while the robust methodology shows promising results even for small sample sizes.However, in contaminated scenarios, Muggeo's estimator exhibits a significant increase in the RMSE of parameter estimates, indicating its unsuitability for such scenarios with heavy-tailed error structures.The empirical findings also emphasize the importance of sample size in mitigating the contamination effect, although it does not eliminate it entirely.Larger sample sizes lead to improved performance of the M-estimator, but contamination remains a factor to consider.The flexibility of M-estimators in choosing loss functions is advantageous, as it allows for better adaptation in contaminated scenarios.Although there is no significant distinction in the performance of the M-estimator when using different loss functions, the Hampel loss function demonstrates slightly lower RMSE values in contaminated scenarios.Overall, the proposed methodology shows promise in finite sample scenarios, with consistent performance of the M-estimator across different change points for the plateau effect.These empirical results contribute to a better understanding of parameter estimation in the segmented regression model, particularly in the presence of contamination, and provide valuable insights for researchers and practitioners working in related fields.
For illustration purposes, we use Muggeo's and the robust methods to fit a constrained 3-segment "working" model to VO 2 data with suspicious atypical observa- tions.The findings show that, as a consequence of outliers, Muggeo's method provided estimates with much larger standard errors than its robust counterparts and, in addition, the error variance estimate was dramatically inflated for the former.
Finally, to investigate the impact on VO 2plateau identification, we applied the VO 2plateau bootstrap test (Patricio et al. 2021) using both methods to estimate the "working" segmented model.The result shows that outliers are able to change the conclusion of the test, in this case, at a 5% significance level, using Muggeo's method, the test pointed absence of evidence to support the VO 2plateau hypothesis (p-value = 0.0760 ), whilst the robust method shown significance of the VO 2plateau hypothesis (p-values = 0.0110, 0.0055, 0.0015 for Huber, Hampel and Tukey losses, respectively).This shows the importance of considering robust methods in order to avoid deleterious effects of the eventual presence of discrepant observations.
As a future research topic, some of the authors of this paper are devoted to exploit regression models with segmented interaction to explain the effect of quantity of vaccines (and possible environmental covariates) on daily deaths by COVID-19, along with robust estimation methods using a similar approach proposed in this paper.

Fig. 3
Fig.3Fitted models for VO 2 (mL/kg/min) data for Muggeo's method and the robust method with Huber, Hampel and Tukey losses

Table 1
Empirical results for the case n = 400 and p = 0.8

Table 2
Empirical results for the case n = 600 , p = 0.8

Table 3
Empirical results for the case n = 800 , p = 0.8

Table 4
Empirical results for the case n = 400 , p = 0.9

Table 5
Empirical results for the case n = 600 , p = 0.9

Table 6
Empirical results for the case n = 800 , p = 0.9

Table 7
Parameter estimates and standard errors (SE) for the 3-segment constrained model for Muggeo's and the proposed methods using Huber, Hampel and Tukey losses

Table 8
Bootstrap p-values of the plateau identification test for Muggeo's and the proposed methods using Huber, Hampel and Tukey losses