We focus on eliciting expert information about the anticipated performance of the BIMC test and the Awaji criteria. These elicitations took place in 2019 and 2020, prior to the updated Gold Coast criteria (Shefner et al., 2020), and so reflect the knowledge and understanding of the experts at that time.
2.1 Participants
A total of seven experts in Neurology and MND participated in the elicitations for the model parameters. Three experts were directly involved in the development of the BIMC test, and the design of the trial (Experts 5-7). Four further experts were recruited through referrals from the first three experts, as well as a request sent to the British Society for Clinical Neurophysiology (BSCN) mailing list (Experts 1-4). These experts were recruited solely for their opinion on the expected performance of the diagnostic tests.
Background information about each expert is provided in Table 2.
Table 2. Expert demographics

| Expert ID | Job title | Background | Experience (Years) | Self-reported strengths and weaknesses |
| --- | --- | --- | --- | --- |
| 1 | Neurology Registrar | Molecular genetics and physiology | 5 | Unfamiliar with BIMC prior to elicitation |
| 2 | Senior Clinical Lecturer & Hon Cons Clinical Neurophysiologist | Biomarkers research | 5 | Familiar with previous work on BIMC |
| 3 | Consultant | Clinical studies | 10 | - |
| 4 | Consultant Clinical Neurophysiologist | - | - | Some experience with BIMC |
| 5 | Consultant Neurologist/Clinical Neurophysiologist | Basic pathophysiology; clinical electrodiagnostics | 8-10 | Published literature and personal clinical experience with the BIMC test |
| 6 | Consultant Neurophysiologist | Research experience in upper motor neuron dysfunction in MND | 10 | Good background knowledge of MND/coherence and electrophysiology, but distant from front-line neurology care |
| 7 | Consultant Neurologist | Neuroscience research and clinical trials | 25 | Strong clinical experience, less up-to-date on lab-based research |
2.2 Expert Elicitation
Expert elicitation is the process of quantifying expert judgements, knowledge, and experience into probability distributions (Bojke, 2021). It is often used when specifying a prior distribution in Bayesian statistics and can be used in both the design and analysis of a trial.
Expert judgements from multiple experts can be combined to produce a distribution representative of a wider range of views. Studies suggest that these aggregated distributions tend to outperform individual experts in terms of informativeness, which measures how concentrated (i.e., how low in uncertainty) a distribution is, and calibration, which measures how well the distribution captures the true values (Flandoli, 2011).
We aggregated expert judgements using two extensively validated methods. The Classical Method (CM) asks experts to make judgements on a series of seed questions, whose answers are known to the elicitors but not to the experts. These answers are used to score and weight the experts in a mathematical aggregation (Cooke, 1988): experts who specify probabilities more accurately on the seed questions receive greater weight when the judgements about the quantities of interest are combined. The Sheffield Elicitation Framework (SHELF) instead provides a structure for a group of experts to discuss the quantities of interest and form an aggregated distribution themselves (O'Hagan, 2019). Experts first respond to the elicitation questions individually, and the responses are shared anonymously with the group. The experts then determine, as a group, the final set of judgements to represent what a rational impartial observer would conclude having heard all of the individual judgements.
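As an illustration, the CM's weighted combination can be viewed as a linear opinion pool: a mixture of the experts' individual distributions, with mixture weights derived from their seed-question scores. The sketch below uses made-up Beta fits and weights, not the study's actual values:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical Beta(a, b) fits for three experts' prior distributions and
# illustrative CM performance weights -- none of these are the study's values.
expert_params = [(8.0, 2.0), (6.0, 3.0), (10.0, 2.5)]
weights = np.array([0.5, 0.2, 0.3])  # from seed-question scores; sum to 1

def sample_linear_pool(n_draws):
    """Draw from the weighted mixture (linear opinion pool) of expert priors."""
    idx = rng.choice(len(expert_params), size=n_draws, p=weights)
    a = np.array([expert_params[i][0] for i in idx])
    b = np.array([expert_params[i][1] for i in idx])
    return rng.beta(a, b)

pooled_draws = sample_linear_pool(10_000)
```

Sampling from the mixture, rather than averaging parameters, preserves any multimodality that arises when experts disagree.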
2.3 Elicitation Implementation
Two rounds of elicitations were held. The first was carried out in March 2019 and included the three experts involved in the BIMC study. This in-person elicitation meeting used the SHELF format, involving both an individual and a group elicitation stage. The individual elicitations were completed online before the group meeting. The second elicitation took place over the course of 2020 and involved an updated version of the individual elicitation used during the first round. The documents used as part of the elicitation are provided in Appendix A.
During both rounds of elicitation, two types of quantity were elicited. The first type was the five parameters relating to the BIMC trial. The second was responses to seed questions, elicited as part of the CM. In this paper, we focus on the first type, with the second provided in the supplementary material. For both groups, the minimum, 25% quantile, median, 75% quantile, and maximum were elicited for each quantity of interest.
The SHELF aggregation combined the views of the three experts involved in the trial design (labelled Experts 5, 6, and 7), while the CM aggregation combined views from all available experts.
Experts 1 and 4 only provided estimates for the parameters relating to the Awaji criteria’s performance. As such, their prior distributions could only be used for estimating values related to the Awaji criteria, and not the performance of BIMC or sample sizes.
2.4 Bayesian Sample Size Calculations
Assurance is the probability that a trial will result in a successful outcome and can be used analogously to statistical power in determining an appropriate sample size (O’Hagan, 2005). As a Bayesian method, assurance requires a prior distribution to represent the available knowledge prior to the trial. Mathematically, the assurance can be calculated as
$$\text{Assurance} = \int P(\text{Successful Outcome} \mid \theta)\, P(\theta)\, \text{d}\theta$$
where \(P(\text{Successful Outcome} \mid \theta)\) is the probability of a successful outcome (such as the null hypothesis being rejected) given a set of parameter values \(\theta\), and \(P(\theta)\) is the prior distribution for the parameters, representing the current state of knowledge.
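In practice this integral is typically evaluated by Monte Carlo: draw parameter values from the prior, compute the probability of a successful outcome at each draw, and average. A minimal sketch under simplifying assumptions: a one-sided z-test of a single proportion against p0 = 0.5 stands in for the trial's actual analysis, and the Beta prior is illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def power_one_sided(p, n, p0=0.5, alpha=0.05):
    """Approximate power of a one-sided z-test of H0: p = p0 vs p > p0."""
    z_crit = stats.norm.ppf(1 - alpha)
    # rejection threshold on the sample-proportion scale under H0
    thresh = p0 + z_crit * np.sqrt(p0 * (1 - p0) / n)
    se = np.sqrt(p * (1 - p) / n)
    return stats.norm.sf((thresh - p) / se)

def assurance(n, a, b, n_draws=20_000):
    """Monte Carlo assurance: power averaged over a Beta(a, b) prior on p."""
    p = rng.beta(a, b, size=n_draws)
    return power_one_sided(p, n).mean()
```

A prior concentrated above p0 yields high assurance, while a prior concentrated below p0 yields low assurance, however large the sample.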
In this case, the elicited values are used as a basis for defining the prior distribution. After eliciting the quantile values, a Beta distribution was fitted to each parameter for each expert.
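One way to fit a Beta distribution to elicited quantiles is least squares on the quantile values. The sketch below uses illustrative elicited judgements, and maps the elicited minimum and maximum to the 1% and 99% points; that mapping is an assumption of this sketch, not the study's stated procedure:

```python
import numpy as np
from scipy import optimize, stats

# Illustrative elicited judgements (not the study's): minimum, lower quartile,
# median, upper quartile, maximum. Treating the elicited minimum and maximum
# as the 1% and 99% points is an assumption made for this sketch.
probs = np.array([0.01, 0.25, 0.50, 0.75, 0.99])
elicited = np.array([0.40, 0.55, 0.62, 0.70, 0.85])

def quantile_loss(params):
    """Sum of squared differences between Beta quantiles and elicited values."""
    a, b = params
    if a <= 0 or b <= 0:
        return np.inf
    return np.sum((stats.beta.ppf(probs, a, b) - elicited) ** 2)

fit = optimize.minimize(quantile_loss, x0=[2.0, 2.0], method="Nelder-Mead")
a_hat, b_hat = fit.x
```

The fitted Beta then serves as that expert's prior for the parameter in the assurance calculation.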
Assurance calculations take into account the intended primary analysis of the trial. The planned analysis involved McNemar's test (McNemar, 1947). A Bayesian alternative to this test was constructed to allow for further comparison between power and assurance.
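McNemar's test compares paired binary outcomes using only the discordant pairs. A minimal example with illustrative counts (not trial data):

```python
from scipy import stats

# Discordant pair counts from a hypothetical paired diagnostic comparison:
# b = Awaji-positive / BIMC-negative, c = Awaji-negative / BIMC-positive.
b, c = 6, 18

# McNemar's chi-squared statistic depends only on the discordant pairs;
# concordant pairs carry no information about a difference between the tests.
chi2_stat = (b - c) ** 2 / (b + c)
p_value = stats.chi2.sf(chi2_stat, df=1)
```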
The log-ratio of two binomial random variables, \(X_1 \sim \text{Binomial}(n_1, p_1)\) and \(X_2 \sim \text{Binomial}(n_2, p_2)\), can be approximated by a normal distribution (Katz et al., 1978):
$$\text{log}\left(\frac{X_1}{X_2}\right) \sim \text{Normal}\left(\text{log}\left(\frac{p_1}{p_2}\right), \frac{1-p_1}{n_1 p_1}+\frac{1-p_2}{n_2 p_2}\right)$$
By setting \(X_1\) to be the number of positive Awaji test results at the first time point, \(X_2\) to be the number of positive BIMC test results at the first time point, \(p_1 = \eta\), and \(p_2 = \eta\theta_1 + (1-\eta)\varphi\theta_2 + (1-\eta)(1-\varphi)\theta_3\), the log-ratio can then be used to make inferences about the difference between the proportions of individuals diagnosed by the two tests.
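With these definitions, the normal approximation yields an approximate confidence interval for the log-ratio. The parameter values below are placeholders chosen for illustration, not elicited results, following the parameterisation in the text:

```python
import numpy as np

# Placeholder parameter values for illustration only (not elicited results);
# eta, phi, theta1-theta3 follow the parameterisation in the text.
eta, phi = 0.6, 0.5
theta1, theta2, theta3 = 0.9, 0.7, 0.4
n1 = n2 = 100

p1 = eta
p2 = eta * theta1 + (1 - eta) * phi * theta2 + (1 - eta) * (1 - phi) * theta3

# Normal approximation to the log-ratio (Katz et al., 1978)
mean_lr = np.log(p1 / p2)
var_lr = (1 - p1) / (n1 * p1) + (1 - p2) / (n2 * p2)
ci = (mean_lr - 1.96 * np.sqrt(var_lr), mean_lr + 1.96 * np.sqrt(var_lr))
```

An interval excluding zero indicates a detectable difference between the proportions diagnosed by the two tests at these parameter values.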
Unlike statistical power, the maximum value assurance can take depends on the prior distribution. Assurance can be standardised across prior distributions by dividing by the maximum possible value, which rescales assurance to lie between zero and one. This is referred to as scaled assurance (Alhussain & Oakley, 2020). The scaled assurance can be interpreted as the proportion of the maximum achievable assurance that is attained. We focus on calculating scaled assurance to allow for comparisons across different prior distributions.
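Scaled assurance can be sketched by dividing the assurance at the design's sample size by its limiting value at a very large sample size. The example below uses an illustrative one-sided test of a single proportion and a symmetric Beta(3, 3) prior; neither is the trial's actual analysis or elicited prior:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

def assurance(n, a, b, p0=0.5, alpha=0.05, n_draws=20_000):
    """Monte Carlo assurance for an illustrative one-sided z-test of H0: p = p0."""
    p = rng.beta(a, b, size=n_draws)
    thresh = p0 + stats.norm.ppf(1 - alpha) * np.sqrt(p0 * (1 - p0) / n)
    power = stats.norm.sf((thresh - p) / np.sqrt(p * (1 - p) / n))
    return power.mean()

# The maximum achievable assurance is the limit as n grows without bound,
# approximated here by evaluating at a very large n. For a symmetric prior
# around p0, this is roughly the prior probability that p exceeds p0.
max_assurance = assurance(10**6, 3, 3)
scaled_assurance = assurance(100, 3, 3) / max_assurance
```

Here the symmetric prior caps the assurance near 0.5, so a raw assurance of, say, 0.35 corresponds to a much higher scaled assurance.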