If models and expert judgements are not separable, because they are co-developed in the way I described above, then they do not form fully independent sources of information. This has two consequences:
First, it undermines the semi-formal use of expert judgement to evaluate model output. By this I mean qualitative model evaluations determining that one pattern of behaviour is “more realistic” than another, or that outputs lying outside a particular range are “unlikely”.
For example, if all of our previous models have shown values of a certain parameter between 1.5 and 4, then we will naturally examine more closely (and perhaps tweak or recalibrate) those that show it to be outside that range. This generates an effective bias towards the “accepted” values.
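The mechanism can be made concrete with a toy simulation. This is a minimal sketch, not a description of any real modelling workflow: the Gaussian "build" distribution, the cap of three re-tunings and every function name are assumptions chosen purely for illustration, and only the range 1.5 to 4 comes from the example above.

```python
import random
import statistics

ACCEPTED = (1.5, 4.0)  # the "accepted" range from the example above
MAX_RETUNES = 3        # hypothetical cap on how often a surprising model is reworked


def build_model(rng):
    """Toy stand-in for a model build: draw the parameter from a broad distribution.

    The Gaussian (mean 3.0, s.d. 1.5) is an illustrative assumption only.
    """
    return rng.gauss(3.0, 1.5)


def build_with_scrutiny(rng):
    """Build a model, but rework it whenever the result falls outside ACCEPTED."""
    value = build_model(rng)
    for _ in range(MAX_RETUNES):
        if ACCEPTED[0] <= value <= ACCEPTED[1]:
            break  # unsurprising result: accepted without further examination
        value = build_model(rng)  # surprising result: examined, tweaked, rebuilt
    return value


def fraction_inside(values):
    """Share of values that land inside the accepted range."""
    return sum(ACCEPTED[0] <= v <= ACCEPTED[1] for v in values) / len(values)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
unscrutinised = [build_model(rng) for _ in range(5000)]
scrutinised = [build_with_scrutiny(rng) for _ in range(5000)]

print("spread without scrutiny:", round(statistics.pstdev(unscrutinised), 2))
print("spread with scrutiny:   ", round(statistics.pstdev(scrutinised), 2))
print("inside range, without:  ", round(fraction_inside(unscrutinised), 2))
print("inside range, with:     ", round(fraction_inside(scrutinised), 2))
```

In this sketch no individual model is tuned towards a target value, yet the ensemble that results from asymmetric scrutiny is narrower and more concentrated in the accepted range than the underlying distribution of builds.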
Second, it undermines quantitative statistical methods for model analysis which effectively assume that ensembles constitute “random samples” of some informative model space. Because models are based on the same expert judgements, and for other reasons outlined elsewhere, they are not a random sample of anything. In fact, Mauritsen and Roeckner (2020) describe how the climate of one CMIP6 model, MPI-ESM-1.2, was tuned with the specific target of a climate sensitivity of 3.0, in the middle of the expected range and consistent with observation-derived estimates from the twentieth century.
In the words of Donald MacKenzie (2008), this kind of model is “an engine, not a camera”: the model itself, which could have been constructed in innumerable other ways but happens to be constructed in this particular way for a particular set of reasons, is a key driver of the framing and understanding of the situation it models.
We are in a feedback loop, in which the success of the model in reflecting our expert judgement back at us is amplified by the use of expert judgement to assess the quality of the model. Of course, this feedback loop is still connected to reality via observational constraints, so I do not argue that we are in a position of complete epistemic ignorance. My own overall feeling (expert judgement, if you will) is that plausible uncertainty ranges for parameters such as climate sensitivity are simply underestimated, especially at the end of the range which is furthest from observational experience.
I offer two potential ways out of this feedback loop, though neither is easy.
The first is to cut it in the middle, by disallowing all forms of expert judgement as in-flight model evaluation tools. I propose that this could be achieved by following a procedure such as pre-registration of all model experiments prior to the development of a model. Pre-registration would include advance commitment to:
the inputs, processes and outputs of the model;
the observational data to be used for tuning and calibration;
the observational data to be used for out-of-sample assessment;
an evaluation metric by which the above data are to be compared with the model output;
“blind” model development, either without running and testing routines, or by subcontracting it to a programmer who will be able to implement the proposal without making domain-expert judgements about the quality of the output;
any other advance stipulations about quality (“disregard all models with a negative climate sensitivity”; “disregard all models where the North Atlantic is frozen”), with clear justification for each;
publication of all results regardless of perceived quality.
No doubt further expert judgements would then follow. This procedure does not prevent us from tuning the model to any pre-specified desirable outcome (such as fit to twentieth-century climate), but it would prevent us from revisiting tuning parameters multiple times in order to get the best-looking result.
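One way to picture such an advance commitment is as an immutable record whose fingerprint is published before development begins. The following is a minimal sketch under stated assumptions: the class `PreRegistration`, its field names and the example values are all hypothetical, and hashing the record is simply one possible way of making later changes detectable, not an established community practice.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class PreRegistration:
    """Hypothetical record of the commitments listed above."""

    inputs: tuple
    processes: tuple
    outputs: tuple
    tuning_data: str
    out_of_sample_data: str
    evaluation_metric: str
    blind_development: str
    quality_stipulations: tuple  # (rule, justification) pairs
    publish_all_results: bool = True

    def fingerprint(self) -> str:
        """SHA-256 digest of the record, suitable for publishing in advance."""
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Illustrative values only; none of these describe a real experiment.
plan = PreRegistration(
    inputs=("greenhouse gas concentrations",),
    processes=("atmosphere", "ocean"),
    outputs=("global mean surface temperature",),
    tuning_data="observations reserved for tuning",
    out_of_sample_data="observations withheld for assessment",
    evaluation_metric="root-mean-square error",
    blind_development="implementation by a non-domain programmer",
    quality_stipulations=(
        ("disregard all models with a negative climate sensitivity",
         "justification recorded here"),
    ),
)
committed = plan.fingerprint()

# Any later revision, however small, produces a different fingerprint.
revised = replace(plan, evaluation_metric="mean absolute error")
print(committed == plan.fingerprint())     # unchanged plan still matches
print(committed == revised.fingerprint())  # revised plan no longer matches
```

The point of the design is that tuning to a pre-specified target remains possible, because the target is part of the record, while quietly swapping the metric or the assessment data after seeing results would be visible as a mismatch with the published fingerprint.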
A second way out of the feedback loop would be to create other possibilities: accept the contingency on expert judgement, but actively attempt to explore other possible regions of model space. Previous attempts to do this have been extremely minimal, generally limited to small perturbations of parameters within an existing model. I propose a much wider programme. For example, taking the current suite of state-of-the-art climate models based on atmospheric fluid dynamics as our default, could we imagine a “climate model” which is instead primarily based upon highly detailed representations of the biosphere, with the atmosphere parameterised down to a couple of key equations? This would have to be developed from existing ecological models, in the way that present state-of-the-art climate models have developed from numerical weather prediction models.
Another way to create other possibilities would be to fund the creation of a new climate centre, based in, say, Africa or Bangladesh, staffed with people who have specifically not had experience of conventional (European/American-style) climate modelling studies, and tasked with the preparation of policy-relevant climate information in any way they deem appropriate. This is something of a “moon shot” but, with time to trust the process and develop a clarity of purpose, I believe it could have a deeply energising effect, generate significant new thought and improve the integration of non-Western viewpoints into global decision processes.