We empirically tested these different steps for artifact correction using simulated photometry data. We developed MATLAB scripts to simulate key components of photometry data and their influence on each other (all data and MATLAB code available at https://github.com/philjrdb/RegressionSim). Simulations were used because the true parameters (e.g., underlying neural dynamics) are known, unlike real photometry data where underlying neural dynamics are technically unknown.

The components of the simulation were, in order of their operation (**Figure 3A**):

- A neural dynamic component composed of 100 identical 3sec event transients (waveform taken from
[10]), which was tracked by experimental but not isosbestic signals.
- A photobleaching component, modelled as a double-exponential decay function
[4] that attenuated experimental and isosbestic signals.
- A movement-related component that stochastically attenuated both experimental and isosbestic signals by a shared percentage.
- A unique Gaussian noise component per experimental and isosbestic signal.

The order of these operations was designed to emulate artifact-related transformations of experimental and isosbestic signals across key junctures of the recording set-up (recording site → photodetector → signal processor). Neural and photobleaching components were fixed, whereas movement-related and noise components were stochastic and randomized between simulations. These simulations were used to assess the impact of 3 aforementioned choices made by researchers when obtaining an “artifact-corrected dF/F” from photometry signals: 1) low-pass filtering signals before isosbestic fitting, 2) applying OLS regression for isosbestic fitting, and 3) using the fitted isosbestic to perform baseline normalization across the session.

Specifically, we generated 10 simulations of experimental and isosbestic signals for a 20min recording session (10Hz sampling rates). These signals were then either low-pass filtered (3Hz) or not. Isosbestic signals (filtered or unfiltered) were then fit to corresponding experimental signals using OLS or IRLS (**Figure 3B**). IRLS regression was performed by the in-built MATLAB function *robustfit*, using Tukey’s bisquare as the weighting function. This function is favorable as it entirely ignores large outliers (putative neural dynamics), unlike other weighting functions that only attenuate the influence of large outliers, allowing IRLS to better concentrate the fit onto the artifactual baseline. The narrowness of Tukey’s bisquare is determined by a tuning constant. The default constant is 4.685, but two smaller constants (c = 3 vs 1.4) that make the weighting function narrower (i.e., more aggressive downweighting) were also tested. Larger constants were not tested as this shifts IRLS towards an OLS solution. Finally, fitted isosbestic signals were applied in dF/F ([experimental signal - fitted control]/fitted control) and dF (experimental signal - fitted control) calculations, reflecting the disparate choice to apply vs. not apply baseline normalization using the fitted isosbestic signal. In summary, there were 3 independent factors tested for each of the 10 simulations conducted: low-pass filter [3Hz vs none] x regression [OLS vs IRLS c=4.685 vs 3 vs 1.4] x baseline correction [dF vs dF/F].

The key dependent variable was how much extracted “artifact-corrected” signals (dF or dF/F) reflected the actual underlying neural dynamic signal. Given true and extracted signals are on different scales due to artifact attenuation factors as well as dF/F by definition being a fraction of dF, we normalized true and artifact-corrected signals by dividing whole-session signals by their sum squared deviation from 0 (i.e., they were z-scored relative to 0 [null-Z]; REF). The average absolute residual (i.e., difference) between normalized true and extracted signals were calculated for baseline (non-event) versus event periods (**Figure 3C**). Repeated measures ANOVA of residuals across simulations (*n* = 10) revealed each analysis factor had a significant effect on the accuracy of extracted signals for both baseline and event periods. These effects were essentially additive, so we discuss them in turn.

Applying a 3Hz low-pass filter to experimental and isosbestic signals resulted in a more accurate artifact-corrected signal (baseline periods: *F*(1,9) = 12137.4, p < .001; event periods: *F*(1,9) = 2665.0, p < .001) (**Figure 3C**). As theorized, IRLS regression produced a more accurate artifact-corrected signal than OLS regression, with this effect scaling with the tuning constant (baseline periods: *F*(3,27) = 37284.5, p < .001; event periods: *F*(3,27) = 8608.8, p < .001). That is, more aggressive downweighting reliably resulted in better accuracy across parameters (baseline periods: all *t*(9)* *> |14.9|, p < .001; event periods: all *t*(9)* *> |10.5|, p < .001[Bonferroni-adjusted]). Baseline normalization via a dF/F calculation was preferable to its omission via dF calculation (baseline periods: *F*(1,9) = 72.80, p < .001; event periods: *F*(1,9) = 308.4, p < .001).

Causes for these differences are clear when examining true vs extracted peri-event signals (**Figure 3D**). The true underlying event transient was a 3sec waveform that was identical in shape and amplitude across the 100 event trials, each flanked with a null baseline of 0. OLS vs IRLS regression and dF vs dF/F calculations had distinct impacts on the degree to which extracted peri-event signals reflected this.

dF vs dF/F calculations had a clear impact on event transient extraction over trials (**Figure 3D**). Despite the true transient not changing, there was an obvious trend across trials for dF-derived event signals. dF-based transients from initial trials were substantially larger than those from later trials, regardless of whether they were OLS- or IRLS-derived, due to signal attenuation from simulated photobleaching. By contrast, baseline normalization via the dF/F calculation mitigated this influence of photobleaching, largely restoring the stability of transients across the session.

OLS vs IRLS regression had an orthogonal effect on extracted peri-event signals, instead influencing the average waveform (**Figure 3D**). Mean peri-event signals across simulations (*n* = 10) were analyzed using waveform confidence intervals (95% t-test CI, 1/3sec temporal threshold [10]) against the null of 0, as commonly done in the literature [11-15]. Analysis of the true peri-event signal confirmed the 3sec waveform period as significantly deviating from 0, whereas pre- and post-event periods (null baseline) did not. When extracting OLS-derived “artifact-corrected” signals, mean waveforms (both dF and dF/F) were downshifted relative to the true signal (also evident in **Figure 3B**), resulting in a misidentification of when and how the peri-event waveform deviated from the null of 0. OLS-derived transients failed to identify the full 3sec extent of the positive waveform (Type 2 error) while pre-, late and post- event periods were misidentified as being below 0 (Type 1 error).

Taken at face value, these results would promote misconstrued inferences about peri-event neural dynamics. Alternatively, this reliable below 0 “inhibition” prior to the event of interest might be used as indication of an invalid baseline, which is often combatted by re-baselining each trial to a manually selected pre-event period. This extra arbitrary analysis step comes with the assumption that the designated pre-event period acts as a valid baseline, i.e., is free of neural dynamics that might systematically or unsystematically skew trial data (including a potentially real pre-event inhibition). This can be problematic for tasks in which animal behavior and potential anticipatory neural dynamics pervade the recording session (e.g., free operant tasks [12]), precluding a valid one-size-fits-all pre-event baseline. It is worth emphasizing that invalid pre-event baselining is equally liable to create the Type 1 and Type 2 errors observed here with OLS incorrectly baselining “artifact-corrected” signals.

IRLS-derived dF/F theoretically bypasses these issues by more directly fitting the isosbestic to the artifactual baseline of the experimental signal. Accordingly, the IRLS-derived transients (both dF and dF/F) accurately identified the 3secs transient from null pre/post periods without the need for re-baselining or trial-by-trial based corrections. This provides empirical support for the theoretical preferability of IRLS over OLS.

In summary, these simulations indicate low-pass filtering signals, applying IRLS regression to fit isosbestic to experimental signals, and using these signals within a baseline-normalizing dF/F calculation, produces better artifact-corrected neural dynamic signals than not low-pass filtering, using OLS regression, and not baseline-normalizing (i.e., only subtracting fitted isosbestic from experimental signals). Although these specific results are demonstrated with a particular set of simulation parameters, their logical basis applies across photometry scenarios, and we have found these results hold across simulation parameters. We have made a wide range of parameters accessible in the simulation scripts provided at https://github.com/philjrdb/RegressionSim; we welcome further tests and refinements.