Analogical Reasoning as an Inference Scheme

Abstract
Despite its importance in various fields, analogical reasoning has not yet received a unified formal representation. Our contribution proposes a general scheme of inference that is compatible with different types of logic (deductive, probabilistic, non-monotonic). Firstly, analogical assessment precisely defines the similarity of two objects according to their properties, in a relative rather than absolute way. Secondly, analogical inference transfers a new property from one object to a similar one, thanks to an over-hypothesis linking two sets of properties. The belief strength in the conclusion is then directly related to the belief strength in this meta-hypothesis.


Introduction
Reasoning by analogy is a usual mode of reasoning, explicit or implicit in epistemic practice. Many examples punctuate the history of science, for instance, the parallelism of structure between sound and light or between hydraulic and electric systems. Some typical applications are at work in everyday life, for instance, when a child learns a new language or when a lawyer compares different normative situations. Various analogies are used to exemplify some ideas, for instance, when comparing natural and artificial selection. Finally, analogies help to form perceptions, to build taxonomies, to infer relations, and to solve problems.
The French-language translation of this article can be found at https://doi.org/10.1017/S0012217322000294.
More precisely, three functions of analogies and analogical reasoning may be distinguished according to their role in the reasoning process. The didactic function aims at providing a simple, compact, and evocative image, either realist or poetic, of some complex phenomenon, in order to fulfill a communicative or pedagogic aim. The heuristic function suggests ex ante the possible existence of some new property possessed by an object when it is similar to another one in other respects. The argumentative function sustains ex post the belief in a new property attributed to some object on the basis of its similarity to another object. Focusing on the second and third functions, this article considers "reasoning by analogy" as an inferential process.
Of course, any analogical reasoning can be judged as more or less relevant, according to some philosophical insights. Mary Hesse (1966) asserts that some deep analogical reasoning forms the core of scientific research. Conversely, Jacques Bouveresse (1999) observes that several analogies in sociology result from a fanciful mode of reasoning. John D. Norton (forthcoming) thinks that analogies and analogical reasoning must be accepted or discarded through a one-by-one material judgement, not through any general scheme. Many other works have tried to link and even to reduce analogical reasoning to more classical forms of reasoning, while some others view it as a specific mode of reasoning (see Bartha, 2013, for a review). However, no satisfactory and unified formal representation of it has yet been provided.
The most common intuitive theory of analogical inference states that a good analogical inference relies simply on a "good analogy," which means it relies on the associated fact that the two objects share many common properties. But it is obvious that, in many situations, this simple assertion does not support, even intuitively, the claim that they will share any specific other property: some additional premises are required. And yet, several commentators have added structure to analogical assessment and inference by identifying qualities that make it more robust and relevant. Hesse proposed, for instance, a tabular representation of an analogical argument, which separates the source domain and target domain, each including a set of objects, properties, and relations. She defined "vertical relations" as the relations between properties within each domain, and "horizontal relations" as the relations between the domains. Then she formulated several qualitative requirements for an analogical argument to be acceptable.
Paul Bartha (2010, 2013) stressed that Hesse's theory and more generally what he called the "common sense guidelines" of analogical reasoning rely too heavily on horizontal relations and on vague concepts used for expressing what a good analogy is. So, he proposed a more complex "articulation model," focusing on the different types of vertical relations in the source and target domains, that is, on the types of links between shared properties and the projected property. However, his analysis relied on a list of qualitative criteria that failed to give a formal representation that could show how these criteria are to be combined in order to vindicate a precise conclusion. Furthermore, these criteria still depend on some other ambiguous concepts, such as "essential," "causal," "relevant," and "critical" factors. We share with most philosophers the view that analogical reasoning should not be redundant. But Bartha's analysis is not very clear on this essential point: the source object satisfying the projected property should be necessary for the conclusion of an analogical argument. This idea of non-redundancy is the starting point for Todd R. Davies and Stuart J. Russell (1987), who proposed a general formal analysis of analogical reasoning, based on "determination rules," linking the shared properties and expected properties of analogous objects. It results, however, in a purely deductive account of analogical reasoning viewed as an enthymeme that is criticized by Bartha because the missing premise of this enthymeme is in general not available in the background knowledge. Furthermore, their formal representation of the determination rules is subject to an ambiguity discussed below.
Our article follows the spirit of Bartha's and Davies and Russell's work on two points: the importance of vertical relations and the role of determination rules. However, it proposes a more formal and universal analysis that applies even if one does not have any knowledge or strong belief in these rules, which as such escapes a reduction to deductive logic. It offers a purely syntactical analysis of analogical reasoning that differs from works that study the pragmatic aspects of it or the rhetorical patterns of concrete examples, such as in the argumentation theories expressed by Henrique Jales Ribeiro (2014) and Fabrizio Macagno et al. (2017).
Following Willard Van Orman Quine's (1969) analysis of the difficulties entailed by a logical definition of absolute analogy or similarity, we propose a relative concept of analogical assessment that avoids these difficulties, and better captures how analogical reasoning works. We will show how different kinds of analogical assessments, including those where the properties of analogous objects are prima facie different, can be cast into the same universal definition of relative analogy. In this way, this analysis offers a conception of analogical reasoning that does not rely on a never-ending list of criteria to define what a "good" analogy is (Norton, forthcoming).
An "analogical inference" relies on the assessment of such "relative analogy" in order to justify the transfer of some additional property from one object to the other, by taking into account an explicit background over-hypothesis that associates them. These two steps are strongly linked because an analogical assessment may prepare an analogical inference. Such a procedure allows one to evaluate rational belief in its conclusion with regard to its assumptions.
Our thesis is that analogical reasoning should not be reduced to any specific logic, meaning a specific relation of entailment. It is rather an inference scheme that can generally be associated with different kinds of relations of entailment, according to the belief status of the background over-hypothesis.
In Part 2, a general framework is introduced and different forms of analogical assessment are considered, with absolute analogy discarded in favour of relative ones. Part 3 deals with analogical inference: a general inference scheme pattern is sketched, linking target and source properties through a background hypothesis; a deductive specification of this pattern is detailed using an over-hypothesis as the relevant background hypothesis; and a probabilistic account of this inference scheme is given. Part 4 compares analogical reasoning to single-case induction, i.e., induction relying on a single positive confirmation instance. Finally, in Part 5, conclusions are suggested about the specificity of our approach, with some insights for future analysis.

General Framework
We adopt the framework of first-order logic. We assume the existence of a universe X of objects denoted A, B, and so on, which are the constants of the language. Objects can be concrete things (people, cars, trees) as well as conceptual entities (numbers, propositions, values). Objects can be specific (John, my car, the Eiffel tower) or generic (a man, a car, a monument). Note that a specific object is just a given entity, while a generic object is already a class of specific ones.
We suppose moreover that a set P of properties, defined within the universe X of objects and denoted P, Q, and so on, is given. Properties are represented by one-place predicates P(A), which are true or false. They can concern either concrete aspects (to be red, to be heavy) or conceptual ones (to be greater than 3, to be nice). For a given object X, a property P is said to be relevant or not according to whether it applies to that object. For instance, a ball is red or not, but this property is irrelevant for a number.
We finally introduce some relations, denoted R, S, and so on, between objects of the same set X or between objects of two different sets X and Y. Relations are two-place predicates R(A,B) or even n-place predicates. As before, they may be of any concrete or conceptual kind. For instance, the husband is related to his wife, a car to its owner or to its registration number, a number to its square.
The first and simplest analogical statement is called "notional analogy." It expresses that "A is like B," written "A ∼ B," and just states that there is a specific kind of similarity between two specific objects (John is like Ophelia) or between two generic ones (an airplane is like a bird) or even between mixed ones (this building is like a ship). The two objects are compared with respect to their relevant properties. For a given property, the two objects are covalent if both satisfy it or do not satisfy it, and contravalent in the other case.
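The covalent/contravalent distinction can be illustrated with a minimal sketch. The property tables below (can_fly, has_feathers) and the helper names are hypothetical toy data introduced only for illustration; they are not part of the framework itself.

```python
from typing import Dict

# Hypothetical toy data: each property is a table from object name to truth value.
PROPERTIES: Dict[str, Dict[str, bool]] = {
    "can_fly": {"airplane": True, "bird": True, "ship": False},
    "has_feathers": {"airplane": False, "bird": True, "ship": False},
}


def valence(prop: Dict[str, bool], a: str, b: str) -> str:
    """Two objects are covalent for a property if both satisfy it or both
    fail to satisfy it; otherwise they are contravalent."""
    return "covalent" if prop[a] == prop[b] else "contravalent"


def valence_profile(a: str, b: str) -> Dict[str, str]:
    """Compare two objects property by property."""
    return {name: valence(table, a, b) for name, table in PROPERTIES.items()}


print(valence_profile("airplane", "bird"))
# {'can_fly': 'covalent', 'has_feathers': 'contravalent'}
```

The profile shows why "an airplane is like a bird" only makes sense once the properties under comparison are specified.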
A second and more elaborate form is called "relational analogy." It expresses that "A is to B what C is to D," written "A:B :: C:D," and actually points to a similar relation within two couples of objects. It can concern specific objects (Dante is to Italy what Shakespeare is to England), generic ones (a hoof is to a horse what a foot is to a man) or even mixed ones (beer is to Belgium what wine is to France). In these examples, A and C belong to a set X while B and D belong to another set Y. But the four objects may belong to the same set X when considering, for instance, a filiation relation: Alice is to Bob what Mary is to John (their daughters).
This form is in fact not logically different from the previous one since it can be re-written as a notional analogy between two couples: "(A,B) ∼ (C,D)" based on a similar relation within each couple. Relational analogy has often been assimilated with analogy in general, since it gave its name to the concept in Aristotle's work, αναλογία applying to an identity in proportion. That is why it is often called "proportional analogy" in the literature, though it does not rely exclusively on the numerical concept of proportion.
Within the present formal framework, it is easy to extend relational analogy to relations between two ordered sets of exactly n objects and to state that (X_1, …, X_n) ∼ (Y_1, …, Y_n).
In the same spirit but outside the present framework, it should be possible to extend these definitions to analogies between properties such as P_1(X) : P_2(X) :: P_3(X) : P_4(X). More generally again, "structural analogies" could express analogies within scientific models, which include a whole set of analytical relations between several properties of an object treated as variables. The simplest case in science is a quantitative expression of a relational analogy such as the length of a stick of iron relates to its temperature in the same way that the length of a stick of copper relates to its temperature, where the relation is in fact a linear one.
However, the compared properties between the two objects need not be the "same properties" but only "corresponding properties." This fact was pointed out by several authors (Bartha, 2010; Juthe, 2005; Hesse, 1966). The problem is then to define rigorously this intuitive but vague notion of "correspondence" between properties, and to show that this just induces a refinement to our general definition of analogical statements.
Let us start with a simple example. A relational analogy states that lungs are to aerial animals what gills are to aquatic animals. The properties of lungs and gills are indeed different since their associated chemical transformations are not the same. But lungs make it possible for aerial animals (A) to breathe in the air (P_1), while gills make it possible for aquatic animals (B) to breathe in the water (P_2). They share a common property: "making extraction of oxygen possible" from air or from water.
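Such a correspondence can be formalized very easily. Let $\mathcal{P}$ be a domain of properties over a domain of objects $\mathcal{X}$, let $\mathcal{P}_1$ be the restriction of $\mathcal{P}$ to $\mathcal{X}_1 \subset \mathcal{X}$, and let $\mathcal{P}_2$ be the restriction of $\mathcal{P}$ to $\mathcal{X}_2 \subset \mathcal{X}$. Then, if $P_1 \in \mathcal{P}_1$ and $P_2 \in \mathcal{P}_2$, it will be said that "$P_1$ in $\mathcal{P}_1$ corresponds to $P_2$ in $\mathcal{P}_2$" iff there exists a property $P$ in $\mathcal{P}$ such that
$$\forall X \in \mathcal{X}_1,\ P_1(X) \rightarrow P(X) \quad \text{and} \quad \forall X \in \mathcal{X}_2,\ P_2(X) \rightarrow P(X).$$
If, moreover, $P_1(A)$ and $P_2(B)$ for some $A \in \mathcal{X}_1$ and $B \in \mathcal{X}_2$, then $P(A)$ and $P(B)$, hence $A \sim B$.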
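As a hedged illustration of this definition, the sketch below searches a finite list of candidate properties for one that witnesses the correspondence between two properties. The animals, the candidate properties, and the function name correspondence_witnesses are assumptions made for the example; the definition itself does not prescribe how candidates are obtained.

```python
from typing import Callable, Dict, Iterable, List

Predicate = Callable[[str], bool]


def correspondence_witnesses(
    p1: Predicate,
    p2: Predicate,
    x1: Iterable[str],
    x2: Iterable[str],
    candidates: Dict[str, Predicate],
) -> List[str]:
    """Return the names of candidate properties P such that
    P1(X) -> P(X) for all X in X1 and P2(X) -> P(X) for all X in X2."""
    x1, x2 = list(x1), list(x2)
    witnesses = []
    for name, p in candidates.items():
        holds_on_x1 = all(p(x) for x in x1 if p1(x))
        holds_on_x2 = all(p(x) for x in x2 if p2(x))
        if holds_on_x1 and holds_on_x2:
            witnesses.append(name)
    return witnesses


# Hypothetical toy universe: aerial animals (X1) and aquatic animals (X2).
AERIAL = ["dog", "sparrow"]
AQUATIC = ["trout", "shark"]

breathes_in_air: Predicate = lambda x: x in {"dog", "sparrow"}      # P1
breathes_in_water: Predicate = lambda x: x in {"trout", "shark"}    # P2

CANDIDATES: Dict[str, Predicate] = {
    "extracts_oxygen": lambda x: x in {"dog", "sparrow", "trout", "shark"},
    "has_fur": lambda x: x == "dog",
}

witnesses = correspondence_witnesses(
    breathes_in_air, breathes_in_water, AERIAL, AQUATIC, CANDIDATES
)
print(witnesses)  # ['extracts_oxygen']
```

In this toy universe, "breathing in the air" and "breathing in the water" correspond through the shared property of extracting oxygen, while having fur fails as a witness.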
Analogies with corresponding properties are frequent in science, for instance, when the equations of different domains express the same mathematical relations between obviously different but "corresponding" variables. Consider, for example, electric and hydraulic networks. One can say that they share a common insight, that of "constrained flows." Electric intensity is like hydraulic debit (flow rate), as they both involve a quantity of fluid. The tension is like the pressure variation because they both imply a potential of movement. Moreover, the law of nodes (for intensities as well as debits) and the law of loops (for tensions as well as pressure variations) apply to both of them. Finally, Ohm's law, which links intensity linearly to tension, is analogous to the law that links debit linearly to pressure variation.
This concept may also help to understand the difference often quoted in the literature between "formal analogy," when associated properties (variables) of two models are linked by any kind of two-by-two horizontal relation, and "substantial analogy," when associated properties are moreover "corresponding." For instance, consider "Newton's Law" (that the force between two bodies is proportional to their masses and inversely proportional to the square of their distance). It bears a substantial analogy with "Coulomb's Law" (that the force between two electric charges is proportional to their charges and inversely proportional to the square of their distance), since both reflect attractive forces between masses or charges. By contrast, it bears a formal analogy with "traffic law" (that the traffic between two towns is proportional to their populations and inversely proportional to their distance), since the underlying phenomena have no common interpretation.

Absolute Analogy
Tracing back to George Hayward Joyce (1936), quoted by Norton, an analogical assessment has often been considered as an absolute judgement such as "A resembles B in being P." It may differ from mere similarity on the basis that it is a judgement intended to prepare an analogical inference: "A is Q, B resembles A in being P, therefore B is Q." Hence, for instance, it may involve the use of more criteria such as multiple similarities or few counter-similarities.
However, following Quine, it can be shown that it is not possible to propose any logical definition of absolute analogy, since it is not even possible to give a logical definition of the weaker notion of similarity. The simplest intuitive definition is:
(1) A ∼ B iff there exists a property P such that P(A) & P(B)
Analogy is defined by the existence of a property commonly shared by both objects. This definition is far too lax and even trivial: it is always possible to find a common property between any two objects.
A more restrictive attempt is the following:
(2) A ∼ B iff for any property P, P(A) ↔ P(B)
But this universal covalence of properties leads to an extreme situation that reduces analogy to identity.
Quine proposes an intermediary definition of similarity:
(3) A ∼ B iff A and B have "many" common properties
But, as he highlights, this notion is too vague because one cannot tell how many properties are required. In fact, the question is to determine what counts as a property. If any set of objects counts as a property, then any two objects will be members of an arbitrary number of sets and will share the same number of properties. If one restricts the type of sets to the properties collecting similar objects, we are led to a circular definition.
An attempt to escape the problem faced by those general definitions may restrict the admissible properties to a unique subset of externally defined properties that are relevant for any of two objects, say W. Hence, amended formulations of (1), (2), and (3) may be:
(1a) A ∼ B iff there exists a property P belonging to W such that P(A) & P(B)
(2a) A ∼ B iff for any property P belonging to W, P(A) ↔ P(B)
(3a) A ∼ B iff A and B have "many" common properties P belonging to W
Clearly, (1a) may prevent the triviality of (1), (2a) may prevent the reducibility of (2) to identity, and (3a) may prevent the arbitrary nature of (3). The question then is how to give a relevant definition for a universal W, since universality is required for a definition of absolute analogy.
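The difficulties raised by these definitions can be checked on a small example. The sketch below is only an illustration under stated assumptions: properties are modelled extensionally as sets of objects, the three-object universe and the restricted family W ("fruit", "round") are hypothetical, and the function names are introduced here.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List


def all_subsets(universe: Iterable[str]) -> List[FrozenSet[str]]:
    """Every subset of the universe, i.e., 'any set of objects counts as a property'."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def similar_exists(a: str, b: str, props: Iterable[FrozenSet[str]]) -> bool:
    """Definitions (1)/(1a): some property is shared by both objects."""
    return any(a in p and b in p for p in props)


def similar_forall(a: str, b: str, props: Iterable[FrozenSet[str]]) -> bool:
    """Definitions (2)/(2a): the objects are covalent for every property."""
    return all((a in p) == (b in p) for p in props)


def count_shared(a: str, b: str, props: Iterable[FrozenSet[str]]) -> int:
    """Definition (3): number of common properties."""
    return sum(1 for p in props if a in p and b in p)


UNIVERSE = ["apple", "pear", "tennis ball"]
PAIRS = list(combinations(UNIVERSE, 2))
EVERY_PROPERTY = all_subsets(UNIVERSE)

# (1) is trivial: every pair shares some property.
print(all(similar_exists(a, b, EVERY_PROPERTY) for a, b in PAIRS))      # True
# (2) collapses to identity: no pair of distinct objects qualifies.
print(any(similar_forall(a, b, EVERY_PROPERTY) for a, b in PAIRS))      # False
# (3) is uninformative: every pair shares the same number of properties.
print({count_shared(a, b, EVERY_PROPERTY) for a, b in PAIRS})           # {2}

# Hypothetical restricted family W: "fruit" and "round".
W = [frozenset({"apple", "pear"}), frozenset({"apple", "tennis ball"})]
print(similar_exists("pear", "tennis ball", W))                         # False
```

With the restricted family W, (1a) is no longer trivial, but the outcome depends entirely on which W has been chosen.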
An intuitive way to do this would be to identify W with the set of "natural kinds." Quine considered this approach when he pointed out the intuitive relationship between this concept and the notion of similarity: a natural kind is a collection of similar objects, and conversely similar objects seem to be those very objects that are instances of the same natural kind. However, this approach leads to many philosophical issues and is highly controversial, unless one accepts a very specific essentialist position (see Bird & Tobin, 2015, for a critical presentation of this position).
Firstly, the properties that define natural kinds are supposed to be the properties that are "really important" for classifying objects in "genuinely natural ways." But critics of this position deny that any of our classifications are natural. Classifications are mere human constructs built within current language and science for practical purposes and not essential and eternal entities in the world as Plato thought (see, for instance, Dupré, 1993).
Secondly, Quine's argumentation leads to the conclusion that defining natural kinds relies again on too vague notions or too obvious circularities with the notion of similarity. Hence, there is neither a philosophical nor a logical way to define "natural kinds," and to identify which set of properties could be a proper W.
But a third reason that it is impossible to find a formal definition of absolute analogy arises from a consideration of what happens when an analogy is refuted by a counter-analogy. The counter-analogy suggests a better partner than the one proposed, for the source as well as for the target, generally by exhibiting more relevant properties.
For instance, for a notional analogy:
- Bruges is the Venice of the north; it is a city built on canals.
- No, Bruges is not the Venice of the north; it is a city that never had any major economic influence.
- It is Antwerp that is the Venice of the north, since it was a European economic capital city like Venice.
These debates underline the vacuity of vindicating any absolute analogy: even if A and B were similar with respect to one point of view, they would usually differ from another point of view. No pair of non-identical objects are similar from the standpoint of all possible properties, even if we limit these properties to current categories: is an apple similar to a pear because it is a fruit, or to a tennis ball because it is round?

Relative Analogy
So, we are led to consider that any analogy must always be expressed with respect to one property or to a domain of properties. This is obviously an appropriate answer to the debates on analogical statements: there is no assertion stated from a universal standpoint but only from relative points of view on the similarity between two objects.
The simplest way to represent relative analogy for notional analogy is again:
(1b) A ∼_P B iff P(A) & P(B)
For instance, an apple is like a pear, relative to "fruitness," i.e., they are fruits.
For relational analogy, the condition states:
(A,B) ∼_R (C,D) iff R(A,B) & R(C,D)
For instance, Paul is to Ana what Bob is to Julia, relative to "sonness," i.e., he is her son.
But on second thought, this seems to be a very restrictive way to express things. Indeed, an analogy is significant because it spotlights that two objects share one particular property among a list of other possible properties all pertaining to a certain way of describing the objects. These two cars are analogous relative to their colour if they are both blue. These two animals are analogous relative to their species if they are both dogs. But even if it is true, it seems odd to say that these two cars are analogous relative to their "blueness" or that these two animals are analogous relative to their "dogness." What is indicated more strongly by a relative analogy is the fact that two objects are analogous relative to some "point of view" that can be expressed by a set of complementary properties (for example, colour or animal species). They share the same property in this set while they could have two different ones (one car could be blue and the other one red, one animal could be a dog and the other one a cat). What is stressed by the analogy is that it is not the case that they have two different properties inside the set that is considered. So, these cars are analogous relative to their colour. These animals are analogous relative to their species.
Let us define a domain Z as a set of possible disjoint properties that are associated with the same point of view. The relativization of any analogy to a given domain expresses the speaker's intention to choose a specific point of view and her intention to speak only with regard to this aspect of the world. A "point of view" is a mental attitude that consists in applying a filter on the properties of things or events. This notion is represented by the set Z, which is not any set of properties, but a set of disjoint properties that are then correlated through this disjunction.
Finally, relative notional analogy can be defined by:
(1c) A ∼_Z B iff there exists a property P belonging to Z such that P(A) & P(B)
For instance, an apple is like a pear, with regard to vegetal kinds, i.e., they are fruits.
Likewise, relative relational analogy is defined by:
(A,B) ∼_Z (C,D) iff there exists a relation R belonging to Z such that R(A,B) & R(C,D)
For instance, Paul is to Ana what Bob is to Julia, with regard to family relationship, i.e., he is her son.
Moreover, the combination of two elementary points of view, obtained by considering the product of two sets Z_1 and Z_2, again forms a composite point of view Z. Such situations look a little more complex. For example:
- An apple is like a pear, with regard to vegetal kinds and colours. They are yellow fruits.
- Paul is to Ana what Bob is to Julia, with regard to family relationship and social relationship. He is her son and he doesn't care about her.
These situations can be more simply expressed by using a conjunction of predicates: to be yellow and to be a fruit for the first, and to be a son and not to care for his mother, for the second. The unique point of view, Z, is the product of disjoint conjunctions formed by using two types of properties or relations: colours and botanic species in the first, family relationships and attitudes toward someone else in the second. But these composite points of view Z may nevertheless appear as heterogeneous combinations of elementary ones. In many concrete examples, Z is formed from correlated elementary points of view. The latter are linked by the fact that they are part of a similar "structure" or contribute to a similar "function" shared by the compared objects. This is the case for electric and hydraulic networks that are analyzed from the composite point of view of flows in a network. Such a consideration is a semantic one and will be studied in further work.
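The idea that two objects are analogous relative to a point of view when they share the same property within the set Z, and that two points of view can be combined by a product, can be sketched as follows. This is a minimal illustration, assuming that a domain of disjoint properties is encoded as a table giving, for each object, the single property of the domain it satisfies; the fruit data and the names analogous and product_domain are hypothetical.

```python
from typing import Dict, Hashable

# A domain Z of disjoint properties, encoded as: object -> the one property it satisfies.
Domain = Dict[str, Hashable]


def analogous(a: str, b: str, domain: Domain) -> bool:
    """Relative notional analogy: some property of the domain holds of both objects.
    Objects for which the domain is irrelevant (absent from the table) are never analogous."""
    return a in domain and b in domain and domain[a] == domain[b]


def product_domain(z1: Domain, z2: Domain) -> Domain:
    """Composite point of view: each object is described by a pair of properties."""
    return {x: (z1[x], z2[x]) for x in z1.keys() & z2.keys()}


# Hypothetical toy data.
KIND: Domain = {"apple": "fruit", "pear": "fruit", "cherry": "fruit", "carrot": "root"}
COLOUR: Domain = {"apple": "yellow", "pear": "yellow", "cherry": "red", "carrot": "orange"}
KIND_AND_COLOUR = product_domain(KIND, COLOUR)

print(analogous("apple", "cherry", KIND))             # True: both are fruits
print(analogous("apple", "cherry", KIND_AND_COLOUR))  # False: different colours
print(analogous("apple", "pear", KIND_AND_COLOUR))    # True: both are yellow fruits
```

Encoding the domain as a table enforces disjointness by construction, which is a design choice of this sketch rather than a requirement stated in the text.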
One can check that relative notional analogy expressed by (1c) or by (1c') satisfies the following minimal principles required for a relevant definition of analogy:
- It is neither reduced to identity nor to triviality.
- It is not circular since Z is chosen by the agent with respect to the point of view she wants to stress and does not need analogy to be defined.
Moreover, it is easily checked that notional relative analogy is an equivalence relation satisfying the following principles:
- Reflexivity: A ∼_Z A (an airplane is like an airplane).
- Symmetry: if A ∼_Z B, then B ∼_Z A (if an airplane is like a bird, then a bird is like an airplane).
- Transitivity: if A ∼_Z B and B ∼_Z C, then A ∼_Z C (if an airplane is like a bird and a bird is like a bee, then an airplane is like a bee).
For a relational analogy, these principles become:
- Reflexivity: (A,B) ∼_Z (A,B).
- Symmetry: if (A,B) ∼_Z (C,D), then (C,D) ∼_Z (A,B).
- Transitivity: if (A,B) ∼_Z (C,D) and (C,D) ∼_Z (E,F), then (A,B) ∼_Z (E,F).
These principles are not pre-requisites for our definition of relative analogy but result logically from this definition, which has its own justification. They can all be considered to be at least compatible with a relevant logical analysis of analogical statements. This must be analyzed independently of any pragmatic considerations of assertability and intentionality.
Indeed, reflexivity is obvious: an object is at least analogous to itself, even if asserting this may be strange and useless in conversation. As concerns symmetry, it is generally accepted that analogy satisfies this property ex post but not ex ante. Analogy is associated with an illocutionary intention that distinguishes a part that inherits a property from a part for which the property is well known. When one says "your eyes are blue like the sky," the well-known property of the intense blue of the sky is attributed to someone by way of giving her a compliment. However, this remark concerns the intentional aspect, as opposed to the purely formal, syntactic point of view adopted here. As for transitivity, it comes from the fact that Z is defined as a list of disjoint properties. It is worth noting that transitivity does not hold under the usual definition of absolute analogy, since the properties shared by A and B are not necessarily the same as those shared by B and C. Since relative analogy is based on the similarity of properties within a given domain, this principle becomes quite intuitive.
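The contrast between relative and absolute analogy with respect to these three principles can be checked by brute force on a small universe. The sketch below is illustrative only: the objects, the "shape" domain, and the family W of properties ("fruit", "round") are hypothetical, and the relative analogy is again encoded as equality of the single property each object satisfies in the domain.

```python
from itertools import product
from typing import Callable, Dict, FrozenSet, List, Sequence, Tuple

Relation = Callable[[str, str], bool]


def check_equivalence(rel: Relation, objects: Sequence[str]) -> Tuple[bool, bool, bool]:
    """Return (reflexive, symmetric, transitive) for a relation on a finite set."""
    reflexive = all(rel(a, a) for a in objects)
    symmetric = all(rel(b, a) for a, b in product(objects, repeat=2) if rel(a, b))
    transitive = all(
        rel(a, c)
        for a, b, c in product(objects, repeat=3)
        if rel(a, b) and rel(b, c)
    )
    return reflexive, symmetric, transitive


OBJECTS = ["apple", "pear", "tennis ball"]

# Relative analogy: a domain of disjoint shapes, relevant to every object here.
SHAPE: Dict[str, str] = {"apple": "round", "pear": "pear-shaped", "tennis ball": "round"}
relative: Relation = lambda a, b: SHAPE[a] == SHAPE[b]

# Absolute analogy in the sense of "some common property", over overlapping properties.
W: List[FrozenSet[str]] = [
    frozenset({"apple", "pear"}),         # fruit
    frozenset({"apple", "tennis ball"}),  # round
]
absolute: Relation = lambda a, b: any(a in p and b in p for p in W)

print(check_equivalence(relative, OBJECTS))  # (True, True, True)
print(check_equivalence(absolute, OBJECTS))  # (True, True, False)
```

Transitivity fails for the absolute version because the pear resembles the apple as a fruit and the apple resembles the tennis ball as a round object, while the pear and the tennis ball share nothing in W.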
In the sweep of philosophical literature, contextualistic positions have been developed in an attempt to explain that knowledge, belief, or truth is related to a "context." It is a relativist conception that refers to the idea that the truth or belief conditions of sentences "vary in certain ways according to the context in which they are uttered" (DeRose, 1999). But even if it looks to have the same flavour, the notion of domain used in relative analogy is not contextualistic: it does not express the fact that the absolute analogical assertion is only true in a certain context but that there is no such logical thing as an absolute analogical assertion. Speaking about analogies in a coherent way is always to speak about relative analogies, and those are more or less acceptable independent of any context.

The General Inference Scheme
Typically, analogical inference consists in taking a similarity between two objects as a premise for inferring new similarities as conclusions. In this way, an analogical assessment is taken as a kind of similarity judgement that justifies its extension to other properties: a point that differentiates "mere similarities" from "analogies" or "relevant similarities." As expressed by Bartha: "An analogical argument is an explicit representation of analogical reasoning that cites accepted similarities between two systems in support of the conclusion that some further similarities exist" (Bartha, 2010, p. 1).
Analogical inference uses analogical assessment in order to "transfer" the properties of an object to another object. The analogical statement may be explicit or not, and represented by facts that imply it, as illustrated, for instance, by the "violinist" argument (Thomson, 1971, passim). Let Q be a new one-place predicate, or S a new two-place predicate. In the two basic forms (notional analogy and relational analogy), an analogical inference states respectively:
(4) A ∼ B, Q(B) ➥ Q(A)
(5) A:B :: C:D, S(C,D) ➥ S(A,B)
Analogical inference inherits the intentional asymmetric nature of analogical statement: it is based on the property of the "source" transferred to the "target." The symbol ➥ denotes a relation of entailment between premises and conclusion, which has to be further characterized.
For instance, from the fact that a pear is like an apple relative to fruitness and that an apple can be eaten, one could infer that a pear can be eaten. Likewise, from the fact that a pear is to a pear tree what an apple is to an apple tree relative to production and that an apple tree grows by sowing apple seeds, one could infer that a pear tree grows by sowing pear seeds.
For structural analogy, a new variable in the source model suggests a new variable in the target model, these two variables being in correspondence. Moreover, a relation including this new variable in the source model can be transposed into a relation with the corresponding new variable in the target model. For instance, for an electric network, the power is the product of tension and intensity. For a hydraulic network, the corresponding variable is again power, which is the product of potential of movement (pressure variation) and debit.
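This transposition of a relation between corresponding variables can be sketched with a single structure shared by both networks. The class FlowNetwork, its field names, and the numerical values are hypothetical and chosen only for illustration; the linear resistance law is assumed for both networks, following the analogy with Ohm's law.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowNetwork:
    """One branch of a network, described by two corresponding variables."""
    potential_name: str   # tension, or pressure variation
    flow_name: str        # intensity, or debit
    potential: float
    flow: float

    def resistance(self) -> float:
        """Linear law linking potential to flow (Ohm's law and its hydraulic analogue)."""
        return self.potential / self.flow

    def power(self) -> float:
        """The new variable: product of the potential variable and the flow variable."""
        return self.potential * self.flow


# Hypothetical values, in SI units.
electric = FlowNetwork("tension (V)", "intensity (A)", potential=12.0, flow=2.0)
hydraulic = FlowNetwork("pressure variation (Pa)", "debit (m^3/s)", potential=200_000.0, flow=0.001)

for network in (electric, hydraulic):
    print(network.potential_name, "x", network.flow_name, "=", network.power(), "W")
```

The same method computes power in both cases because the relation is stated on the corresponding variables rather than on the domain-specific ones.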
Coming back to notional analogy, for the very same reasons that led us to introduce a domain Z listing the properties considered for defining the relative analogy that is the premise, we introduce a domain Z' listing the properties that will be considered for the conclusion. The analogical reasoning (4) can then be developed into:
(6) P(A) & P(B) with P belonging to Z, Q(B) with Q belonging to Z' ➥ Q(A)
Why should we accept this inference scheme? Our answer does not rely on the construction of a new consequence relation but on an external supplementary hypothesis HE, used by the person who argues in favour of the conclusion. This hypothesis HE will be integrated in the reasoning as follows (for notional analogies):
(7) HE, P(A) & P(B) with P belonging to Z, Q(B) with Q belonging to Z' ➲ Q(A)
The symbol ➲ is used instead of ➥ in order to acknowledge the fact that including HE in the premises is intended to give a better epistemic status to the inference scheme. The extension to relational analogies or to structural analogies is obvious, replacing P and Q with two-place or n-place predicates. Formulas (6) and (7) are the general inference scheme patterns that can be transformed into a formal inference scheme of a precise logic as soon as the arrow ➲ is replaced by a precise relation of entailment.
After identification of the content of HE, we will study the relationships between HE and two kinds of relations of entailment that can validate it, deduction or probabilistic entailment. These two cases are to be considered only as typical examples of the several ways by which this general inference scheme could be activated, our general thesis being independent of which one is actually chosen by the agent.
Other logical frameworks such as non-monotonic logic (Kraus, Lehmann, & Magidor, 1990) or belief revision (Alchourrón, Gärdenfors, & Makinson, 1985) could be considered in further research. Our point is that analogical reasoning should not be reduced to this or that existing or new logic, associated with a specific relation of entailment, but understood as a general inference scheme pattern, which can be validated by different kinds of relations of entailment.

Structure of the Over-Hypothesis in a Deductive Framework
In order for HE to be relevant for explaining how analogical inference runs, we impose that it obeys the two following additional principles:
- non-redundancy: HE must not "trivialize" the reasoning in a way that would make unnecessary the consultation of one of the premises.
- testability: the empirical protocol for believing in HE must be coherent, and able to be described.
We consider now increasingly sophisticated expressions of HE. Let's examine a first candidate:
(8) HE1: ∀X (P(X) → Q(X))
Using HE = HE1 within (7) makes the analogical inference scheme valid in deductive logic. But knowing that P(A), the conclusion Q(A) is acquired without need to refer to P(B). Hence, non-redundancy is violated. Analogical inference becomes a mere "focusing" operation from a prior generic belief to a specific case.
To avoid redundancy, Davies and Russell propose an interesting candidate for HE, called the "determination clause." It is initially written as follows:
(9) HE2: [∀X (P(X) → Q(X))] or [∀X (P(X) → ¬Q(X))]
Using HE = HE2 within (7) makes the analogical inference scheme again valid in deductive logic, but it now involves all the premises: it is necessary to use Q(B) to infer Q(A). For instance, if HE2 states that any gold object is insensitive to acid or any gold object is sensitive to acid, the fact that my watch is gold and insensitive to acid implies that yours, which is also gold, is insensitive to acid.
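The contrast between HE1 and HE2 with respect to non-redundancy can be checked mechanically on small finite models. The following is a minimal sketch, not part of the original argument: the function `entails`, the three-object domain, and the encoding of A and B as objects 0 and 1 are assumptions made for illustration, and an exhaustive check over such a domain is a sanity check rather than a proof.

```python
from itertools import product


def entails(premises, conclusion, n_objects=3):
    """True if every interpretation of P and Q over a domain of
    n_objects that satisfies all premises also satisfies the conclusion."""
    objs = range(n_objects)
    for p_bits in product([False, True], repeat=n_objects):
        for q_bits in product([False, True], repeat=n_objects):
            P = dict(zip(objs, p_bits))
            Q = dict(zip(objs, q_bits))
            if all(prem(P, Q) for prem in premises) and not conclusion(P, Q):
                return False  # counter-model found
    return True


A, B = 0, 1  # target and source; object 2 is a bystander


def he1(P, Q):
    # HE1: every P is Q
    return all(Q[x] for x in P if P[x])


def he2(P, Q):
    # HE2: every P is Q, or every P is not Q
    return he1(P, Q) or all(not Q[x] for x in P if P[x])


p_a = lambda P, Q: P[A]
p_b = lambda P, Q: P[B]
q_b = lambda P, Q: Q[B]
q_a = lambda P, Q: Q[A]

he1_alone = entails([he1, p_a], q_a)            # source B is never consulted
he2_full = entails([he2, p_a, p_b, q_b], q_a)   # all premises are used
he2_without_source = entails([he2, p_a], q_a)   # fails without P(B), Q(B)
```

On this encoding, `he1_alone` and `he2_full` come out true while `he2_without_source` comes out false, which is the sense in which HE1 makes the source object redundant and HE2 does not.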
How is it possible to acquire the knowledge of a hypothesis like HE2? If all X such that P(X) have been observed, the fact that Q(X) or ¬Q(X) holds for all these X is already known, and there is no need for analogical reasoning to know that Q(A). Of course, if ¬Q(X) is the case, HE2 is irrelevant for this analogical reasoning, whose premise Q(B) is then false. But if it is already known that all X are such that Q(X), or that all X are such that ¬Q(X), the belief in HE2 is reduced to the belief in one of the two possibilities, either HE1 or its contrary, and we are back to redundancy. If only some X such that P(X) have been observed, and if all are such that Q(X) or all are such that ¬Q(X), the belief in HE2 may stem from an inductive process leading possibly to a probabilistic belief but, again, the belief will concern only one of the two possibilities mentioned in HE2. Every empirical observation that does not refute HE2 will lead to belief either in its first part or in its second part. There is then no way to learn a hypothesis such as HE2 as it is. Hence, testability is violated.
It follows that HE2 is not the right over-hypothesis that one needs in order to complete the analogical reasoning. This is again a question of the right level of properties to express things. The domains Z and Z', which were duly structured as "points of view," have a role to play in the over-hypothesis. A more relevant over-hypothesis, whose major difference with HE2 is not mentioned by Davies and Russell, is then the following:
(10) HE3: for any P in Z, for any Q in Z', [∀X (P(X) → Q(X))] or [∀X (P(X) → ¬Q(X))]
Using HE = HE3 within (7) makes the analogical inference scheme valid in deductive logic once more. Note that the apparent use of second-order logic does not raise specific problems here as long as, in this framework, sets of predicates within Z and Z' are supposed to be finite. Indeed, in this case, quantification on predicates can be considered as a scheme of axioms that can be developed into a finite list of axioms of first-order logic, each explicitly mentioning one of the predicates in each domain of quantification.
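The development of HE3 into a finite list of first-order axioms can be illustrated as follows. This is only a sketch: the function `expand_he3` and the predicate names `Gold`, `Silver`, `AcidResistant`, and `AcidSensitive` are hypothetical and chosen for the example, not taken from Davies and Russell.

```python
from itertools import product


def expand_he3(Z, Z_prime):
    """Instantiate the HE3 schema once for every pair (P, Q) with P in Z
    and Q in Z', returning one first-order axiom (as text) per pair."""
    return [
        f"[∀X ({p}(X) → {q}(X))] or [∀X ({p}(X) → ¬{q}(X))]"
        for p, q in product(Z, Z_prime)
    ]


# Hypothetical finite domains of predicates.
Z = ["Gold", "Silver"]
Z_prime = ["AcidResistant", "AcidSensitive"]

axioms = expand_he3(Z, Z_prime)
for axiom in axioms:
    print(axiom)
```

With two predicates in each domain the schema yields four axioms, and in general |Z| × |Z'| of them, so the list stays finite whenever both domains are.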
Using HE1, HE2, or HE3 as over-hypotheses HE makes analogical reasoning look like an enthymeme, an inference that lacks a premise (according to the modern meaning of a concept created by Aristotle with the broader meaning of "deduction from likelihoods and indices," see Boyer, 1995). Alan Musgrave (1989) was one of the first to suggest the transformation of inductive inferences into deductive enthymemes, and in the present context, analogical inferences are of the same kind. But we will see in Section 3.3 that this is not the most general situation and that analogical inferences are more generally probabilistic or non-monotonic.
Considering its meaning, HE3 is a meta-level "over-hypothesis," which does not rely on the direct relationship between P and Q, but on the relationship between the set of properties Z and the set of properties Z'. Due to the quantification over Z and Z', HE3 is actually a set of hypotheses, one for each pair of predicates, one in Z and one in Z'. Since Z and Z' are lists of disjoint properties, each object X can satisfy at most one property in Z and at most one in Z'. HE3 means that all the objects satisfying one given property in Z must satisfy one given other property in Z'. In a way, HE3 links each property in Z with (at most) one property in Z'. So, this can be written in a more understandable form as:
(11) HE3: for any property P in Z, there exists a property Q in Z' such that [∀X (P(X) → Q(X))]
This is exactly the situation that Nelson Goodman (1947) suggested in his chapter "Prospects for a Theory of Projection." Suppose that we are interested in the colours k of the marbles drawn from a bag h, which belongs to a stack of bags. What Goodman calls "over-hypothesis" H of some hypothesis G such as "all the marbles in the bag B are red" is a hypothesis H such as "every bagful of the stack is uniform in colour." Goodman considers the situation where many bags in the stack have been observed (but not the bag B itself) and where this observation leads to confirm H. Having H in mind, observing a red marble from the bag B will support G. In this case, Z is the set of predicates "to belong to the bag number h" and Z' is the set of predicates "to be of colour k." The over-hypothesis just states that each bag number h is associated with only one colour k.
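Goodman's bag example can be sketched in a few lines. The code below is an illustration under stated assumptions: the functions `uniform_in_colour` and `project`, the bag labels, and the observed draws are all hypothetical, and the check only reports whether the over-hypothesis H is unrefuted by the sample, not how strongly it is confirmed.

```python
def uniform_in_colour(draws):
    """Check the over-hypothesis H on observed (bag, colour) pairs:
    no bag may show two different colours."""
    colour_of = {}
    for bag, colour in draws:
        # setdefault records the first colour seen for a bag and returns it
        if colour_of.setdefault(bag, colour) != colour:
            return False
    return True


def project(bag, draws):
    """Colour transferred to the remaining marbles of `bag`, given an
    observed draw from it; None when the bag has not been sampled."""
    for seen_bag, colour in draws:
        if seen_bag == bag:
            return colour
    return None


# Hypothetical observations from other bags of the stack (bag B excluded).
stack = [("bag1", "red"), ("bag1", "red"), ("bag2", "blue"),
         ("bag3", "green"), ("bag3", "green")]

supports_h = uniform_in_colour(stack)                  # no counter-instance to H
prediction = project("B", [("B", "red")])              # one red marble drawn from B
refuted = not uniform_in_colour(stack + [("bag2", "red")])  # a mixed bag refutes H
```

The point of the design is that `stack` never mentions bag B: the over-hypothesis is learned from other bags, and a single draw from B is then enough to select the colour.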
For instance, consider a car that is the same model as mine, which is a Chevrolet Silverado. I want to infer that the other car costs nearly the same price as mine. HE2 would state: "every Chevrolet Silverado costs between $28,000 and $32,000 or no Chevrolet Silverado costs between $28,000 and $32,000"; my Chevrolet Silverado costs $30,000; hence, this other Chevrolet Silverado should cost between $28,000 and $32,000. The proposed HE3 over-hypothesis states instead that the cost of any car of a given brand is situated in a range of prices; my own Chevrolet Silverado costs between $28,000 and $32,000; hence, this other Chevrolet Silverado should cost between $28,000 and $32,000. Ranges of price are defined exogenously by the brand of cars.
The hypotheses HE2 and HE3 differ by the empirical protocol necessary to validate them. In order to know HE2, I need to observe all (or a very large number of) Chevrolet Silverados and to observe that all cost between $28,000 and $32,000 (otherwise this would be incompatible with the premise that mine costs around $30,000). But, in this case, it is not HE2 that I would believe in, but instead that "every Chevrolet Silverado costs between $28,000 and $32,000." Moreover, if I don't know that my Chevrolet Silverado costs $30,000, I have no idea of the price range that I have to test. In order to know HE3, I just need to observe that all cars of the same type cost approximately the same price. This observation, which is indeed plausible, would lead to the general hypothesis that, for any car of a given brand, the price is in the same range. We immediately see that, contrary to HE2, the process of acquiring the belief in the hypothesis HE3 is more realistic and does not lead to a mutilation of the hypothesis into only one part of the alternative.
Indeed, such an over-hypothesis HE3 frequently states a regularity between classes of objects already defined: nationality determines the mother tongue; age and skills determine the class followed by pupils; species, gender, and age determine an animal's weight.
Each one of these examples is an example of HE3: a partition of a set Z of possible properties is related by a mapping to a partition of another set Z' of possible properties.
This type of belief is usually acquired in a very natural way, as illustrated by the previous Goodman example: one observes that each time objects satisfy the same property (whatever it is) amongst a first list of properties (a first domain), they also satisfy one and the same property from a second list of properties (a second domain). This process is typically inductive in that we infer a general law from a limited (but possibly large) number of cases. But the over-hypothesis could also result from an abductive process, since the conclusions may be the generic best explanation of the antecedents.
One may worry that this way of conceiving an analogical inference leads to an infinite regression, since it relies on an over-hypothesis that has itself to be justified. But that is not the case since foundationalism is not the goal here. The way to acquire HE3 is not itself a part of analogical reasoning: what is required is only the fact that it is possible to attribute to HE3 a degree of belief by a clear empirical protocol, without redundancy for the analogical reasoning that it supports. But this degree of belief is exogenous to the present analysis.

The Over-Hypothesis in a Probabilistic Framework
One might avoid a complex meta-level hypothesis such as HE3 simply by weakening the too strong HE1 and giving it a probabilistic status such as:
(12) HE4: Pr(Q(X) / P(X)) = α
where Pr is an epistemic probability of an agent about HE1 and not an objective probability embedded inside the hypothesis HE1 considered as random.
Using HE = HE4 within our general inference scheme (7) leads to the conclusion that Pr(Q(A)) = α, according to a simple probabilistic inference rule. It validates the inference scheme (7) if ➲ is an entailment relation associated with a degree of probabilistic belief α, assumed to be part of the background belief of the agent. This inference is no longer deductive since it is now defeasible: a further fact concerning A or B may change the probability of Q(A) (as stressed for instance by Hempel, 1965). Another way of putting it is to realize that there may be several competing HE4, based upon other properties P'(X), such that the belief in the conclusion changes when P'(X) occurs. But this formula still leads to redundancy: it is not necessary to have any information about B in order to get a probability degree over Q(A). So, it does not do the job better than HE1.
The only way to escape redundancy is to use a meta-level hypothesis such as HE3 (HE2 having been shown to be irrelevant). But in many, if not all, empirical situations of science or ordinary reasoning, we do not have such a hypothesis in our background belief but only uncertain versions of it. We can formulate this reasoning by assuming that the agent uses the following kind of hypothesis:
(13) HE5: Pr(HE3) = α
Using HE = HE5 within (7) leads to the conclusion that Pr(Q(A)) = α. Again, this inference scheme is not deductive, and is not an enthymeme, since it is defeasible by any new information concerning A or B. This feature is the basis that allows for the possibility of analogical debates. In any case, Pr(HE3) depends on all factors included in HE. Overall, four typical situations will be analyzed.
-Situation 1: Pr(HE3) is so high that HE3 is "accepted" (meaning, for instance, that its probability is close to 1 as formalized by Adams' semantics) (Pearl, 1988). Then the analogical reasoning will get a real strength and the conclusion will be accepted with the same strong degree of belief: my car is a Twingo like yours. Its price is well below $10,000. Hence, yours must also cost less than $10,000 because the model of a car determines its price. The fact that the model of a car determines its price is almost certain so I can be almost sure of the conclusion.
-Situation 2: Pr(HE3) can be simply higher than Pr(Q(A)). In this case, the degree of belief in Q(A) increases to the level of the degree of belief in HE3. This corresponds to a relative confirmation (Zwirn & Zwirn, 1996) in which the belief in the conclusion is strengthened while being not great enough to lead one to accept it (absolute confirmation, which corresponds to the previous acceptance situation). The strength of the analogical reasoning in Situation 2 is lower than in Situation 1: Bjorn and Anna are Swedish students at Australian universities. Bjorn speaks English fluently. So, I can believe that Anna also speaks English very well (because their nationality and their levels of study determine fairly accurately their general linguistic competence). A priori, I cannot know if Anna, who is Swedish, speaks English fluently and my degree of belief in that is low. However, I have a fairly high degree of confidence in the fact that nationality and levels of study determine linguistic competence. So, when I learn that Bjorn, who is a Swedish student like Anna, speaks English fluently, I increase my belief in the claim that Anna does too.
-Situation 3: Pr(HE3) can be totally unknown: no degree of belief is attached to HE3. In this case, the analogical reasoning will have no proof value but may have a purely heuristic interest in noticing a new possibility worthy of being explored: Mars and Earth orbit the sun not too far from it; they have similar gravity and surface temperatures. There is life on Earth. Hence, perhaps there is life on Mars (because orbiting not too far from the sun, and having gravity and a surface temperature similar to those of the Earth, are circumstances in which life appears). Clearly, the conditions that are necessary and sufficient for life to appear are not yet known. So, the over-hypothesis here is very risky and even probably false. Nonetheless, it seems worth exploring whether Mars can harbour life rather than exploring whether Pluto can, since the latter is totally different from Earth.
-Situation 4: Pr(HE3) is known to be very low or close to 0. In this case, the conclusion of the analogical reasoning is not taken seriously and is even considered to be silly. This corresponds to cases where the analogical reasoning is considered as a bad reasoning mode: the tiger has a tail, as does my cat. My cat is kind, hence the tiger is kind (because the tail induces the behaviour).
There may be many intermediate situations but the general principle is the same: the degree of belief in Q(A) is determined by the degree of belief in HE3. Ideally, rational agents will consider all available evidence when forming their belief in HE3, which will lead, directly or by discussion with other rational agents, to attributing to the over-hypothesis a probability that already anticipates the possible counter-analogies.
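One way to unpack how the belief in HE3 fixes the belief in Q(A) is through the law of total probability. The derivation below is a sketch and not part of the original argument: it writes E for the premises P ∊ Z, Q ∊ Z', P(A), P(B), Q(B), and it assumes that learning E leaves the agent's belief in HE3 at α.

```latex
\begin{align*}
\Pr(Q(A) \mid E)
  &= \Pr(\mathrm{HE3} \mid E)\,\Pr(Q(A) \mid \mathrm{HE3}, E)
   + \Pr(\neg \mathrm{HE3} \mid E)\,\Pr(Q(A) \mid \neg \mathrm{HE3}, E) \\
  &= \alpha \cdot 1 + (1 - \alpha)\,\Pr(Q(A) \mid \neg \mathrm{HE3}, E) \\
  &\geq \alpha .
\end{align*}
```

The first factor equals 1 because, given HE3, the inference from E to Q(A) is deductive. Under these assumptions α acts as a floor for the belief in Q(A), and the value Pr(Q(A)) = α is reached when the agent gives Q(A) no credence in the case where HE3 fails.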
Of course, HE3 is not always made explicit by the agent in her analogical reasoning. But even if not, a hypothesis of type HE3 must always be in the background belief of the agent when following such a reasoning scheme with some consistency. It can be "revealed" by another agent who observes the other hypotheses and the result of the analogical inference. For instance, an adversary reveals this hypothesis just in order to show the weakness of the reasoning: "to infer that, you need this background over-hypothesis, yet it is clearly absurd or it has a very low probability." Moreover, because it always relies on relative analogies, reasoning by analogy is open to revision. In particular, the over-hypothesis can be modified in its structure as well as in its degree of belief. For instance, a counter-analogy, true for A and B in a domain Z*, different from Z, may relativize the conclusion of a first analogical inference, even if the belief in the over-hypothesis is high, insofar as this counter-analogy could be associated with another over-hypothesis, which leads to an exception to the first one and lessens the belief in it. This is the case with the over-hypothesis, "nationality determines the mother tongue," which is refuted by immigrants who have changed their nationality. The reason here is that some causes ("hidden factors") necessary to explain the mother tongue were ignored.
Finally, one may also use analogical inference as a reductio ad absurdum reasoning. Let's consider a situation where the premises are: P ∊ Z, Q ∊ Z', P(A), P(B), Q(B), and where one has a degree of belief β that Q(A) is false. One ought to have the same degree of belief that the properties P within the domain Z do not determine the properties Q within the domain Z'. Indeed, in this case, if HE3 were true, then Q(A) could not be false (because the inference is then deductive). If we have some degree of belief that Q(A) is false, we therefore must have the same degree of belief that HE3 is false.
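The reductio can be written out by contraposition. This is a sketch under the same kind of assumption as elsewhere in this section: E abbreviates the premises P ∊ Z, Q ∊ Z', P(A), P(B), Q(B), and β is the agent's degree of belief that Q(A) is false given E.

```latex
\begin{align*}
E \wedge \mathrm{HE3} \models Q(A)
  \;&\Longrightarrow\; E \wedge \neg Q(A) \models \neg \mathrm{HE3} \\
  \;&\Longrightarrow\; \Pr(\neg \mathrm{HE3} \mid E) \;\geq\; \Pr(\neg Q(A) \mid E) = \beta .
\end{align*}
```

On this reading the disbelief in HE3 is at least β, with equality when the agent gives no credence to the case where HE3 fails while Q(A) nonetheless holds.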
To conclude about the over-hypothesis, determination rules state in a formal and unified way very vague intuitions on analogical reasoning such as, for example, the "relevance" of the analogy on which the reasoning is based. They also point to the fact that it is mandatory to use the properties of B in order to come to conclusions about A (non-redundancy).
However, although Davies and Russell point to this idea, they fail to analyze precisely why there is an important difference between an over-hypothesis like HE2 and an over-hypothesis like HE3. Indeed, as we have shown, there is no clear empirical protocol that would lead to belief in a hypothesis like HE2. Their definition of a determination rule (Davies & Russell, 1987, p. 265) is clearly equivalent to HE2 and is not satisfying.
Moreover, Davies and Russell stick only to the deductive version of the determination rules, making analogical reasoning a purely deductive process. But they miss the fact that in most if not all empirical contexts, HE3 is uncertain and that one must use a probabilistic version of the over-hypothesis such as HE5 (or any other representation of uncertainty relevant in the context). For this reason, analogical reasoning is defeasible. More generally, their analysis presents analogical reasoning as one single deductive inference scheme, while ours presents it as a general form of inference scheme, which may be validated within several logical frameworks.
Bartha criticizes Davies and Russell's determination rules with the following argument: "Scientific analogies are commonly applied to problems where we do not possess useful determination rules. In many (perhaps most) cases, researchers are not aware of all relevant factors" (Bartha, 2010, p. 47). But his argument leads to a confusion about the role played by these rules: even when an agent does not know them, or is uncertain about them, they always play a normative role in evaluating the strength of the analogical argument, by attributing degrees of belief to them or by showing that one cannot attribute any degree of belief to them as a counter-argument.

Comparison with Single-Case Induction
We first formalize the main reasoning modes in our general framework. Let H be a universal hypothesis such that ∀X, [P(X) → Q(X)]. The following schemes hold for A, B, C ∈ X:
In "single-case induction" (the expression is used by Bartha, 2010, passim), the hypothesis is confirmed by a single positive confirmation instance. By combination of single-case induction and deduction, one obtains: for A, B ∈ X, [P(A), P(B), Q(B)] ➥ Q(A), which is precisely the raw form of reasoning by analogy. Hence, single-case induction may go from particular to general but may also be applied to one single instance, with exactly the same syntactical form as analogical reasoning. Abduction is a different scheme, in which the inference goes in the reverse direction from H and Q(A).
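The reasoning modes discussed in this paragraph can be laid out side by side. The block below is an assumed reconstruction from the surrounding prose, not a quotation of the original display: the arrow \leadsto (available with the amssymb package) stands in for ➥, and the abduction line follows the description of an inference running "in the reverse direction from H and Q(A)."

```latex
\begin{align*}
\text{Deduction:} \quad & H,\; P(A) \;\vdash\; Q(A) \\
\text{Single-case induction:} \quad & P(B),\; Q(B) \;\leadsto\; H \\
\text{Abduction:} \quad & H,\; Q(A) \;\leadsto\; P(A) \\
\text{Analogy (induction, then deduction):} \quad & P(A),\; P(B),\; Q(B) \;\leadsto\; Q(A)
\end{align*}
```

Chaining the second line into the first eliminates H and gives the fourth, which is the raw form of reasoning by analogy stated in the text.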
Analogical reasoning and single-case induction transfer a property from one object to another in exactly the same way, i.e., relying on the fact that both objects already share another property. A property Q observed for an object of "type P" is transferred to another object of "type P," as in the standard example: this raven in front of me is black, hence the next raven I'll see should be black (an instance of "all ravens are black"). That does not mean that reasoning by analogy is reducible to a kind of induction but rather that each individual step of any enumerative induction consists in a reasoning by analogy. Single-case induction is nothing else than a kind of analogical reasoning, but there are different ways to express analogical reasoning in the usual language due to different pragmatic contexts.
In single-case induction, objects of the analogical statement are usually directly designated by the property they share according to this statement, preceded by a demonstrative pronoun of time, place, ownership, etc.:
-If this P is Q, then this other P is also Q
For instance, induction implies that since my canary is yellow, yours should be yellow too.
In other kinds of reasoning by analogy (the kinds that are the most popular in the literature concerning analogy because they seem different from classical examples of single-case induction), objects are primarily designated by their names, or by any designator independent of the properties they are supposed to share according to the analogical statement, which are attributed to them in a sentence:
-If A is P, B is P, B is Q, then A is Q
For instance, if a canary has wings and is able to fly, then an airplane that has wings should also be able to fly.
In single-case induction, the property P shared by two objects is used as the "principal designator" (to be a canary). In reasoning by analogy, the shared property is used only as a secondary qualifier; it is explicitly mentioned as the shared property of two objects, which are designated through another qualifier (to have wings for a canary). Common properties used in single-case induction are pre-existing: they correspond often to a class inside a current taxonomy. These taxonomies and these classes have been built inside the language precisely because they are well suited for maximizing causal effects with other properties, hence to build over-hypotheses of type HE. Common properties used in other analogical reasoning are usually selected at the moment when the reasoning is made. They are not used as current principal designators of any class in a usual taxonomy, and can be unusual. This helps to understand why single-case induction is often considered to be more reliable than other analogical reasoning. The basic idea is that current categories are defined in a way that exactly takes into account the relation of determination between many properties that define each category. For example, the property of being a canary has a lot of other consequences. This is why this category is useful. Hypotheses of type HE3 or HE5 linked to this kind of category are enough to help draw inferences from the fact of knowing that something is a canary.
In a nutshell, single-case induction is nothing else than reasoning by analogy, but in a normalized context where it relies on over-hypotheses built on the properties that are used as principal designators for the objects that are considered. These over-hypotheses are well entrenched because they rely on categories currently used inside the language. However, over-hypotheses used in other cases of analogical reasoning rely on properties that are less general and hence less entrenched. Both reasoning modes are syntactically the same but they are used in different pragmatic contexts.
Of course, that does not mean that single-case induction cannot be unusual too: since this stone is small, this other stone is also small. The fact that the designator is a current category does not imply that all over-hypotheses that could rely on it are relevant. The fact that something is a stone bears no relationship to its size.
Finally, the role of principal designator can be contextual. Consider, for example, as principal designator "being a New Yorker," to whom we can attribute some secondary properties such as "wearing purple shorts." In general, the first property is used as a basis for making typical inductions (this New Yorker is a runner, and so this other New Yorker is too) and interest in the second property will be found only in reasoning by analogy in which one will designate the concerned individuals by their names (Paul is wearing purple shorts). But if, in a basketball competition, one sees individuals wearing either purple shorts or white shorts, this property may become contextually a principal designator: as it is learned that this purple shorts wearer belongs to team A, it will be inferred that this other purple shorts wearer also belongs to team A. These contextual situations do not contradict the preceding remarks: the particular context adds data to the situation, making usually subsidiary properties become relevant as principal designators; these properties may then be used in this context as meta-hypotheses of type HE3 or HE5 (for instance, the team to which an athlete in a competition belongs determines the colour of the shorts that he is wearing).

Conclusion
In the foregoing, we have considered a traditional scheme of reasoning by analogy that proceeds in two steps. Analogical assessment suggests that some objects are similar as concerns a fixed set of properties. Analogical inference suggests that a new property possessed by these similar objects can be transferred to another object already similar to the considered ones. Our conceptual scheme differs from the existing research in several respects.
Firstly, we suggest that an analogical statement is not true or false, good or bad in an absolute way, but is relative to some point of view expressed by a domain of properties. If some debate is raised about it and makes it defeasible, it concerns rival analogical inferences and not analogical assertions by themselves.
Secondly, we put forward that analogical inference is defined with respect to a background over-hypothesis supporting it, which is a hypothesis that links the properties at some upper level. This avoids trivializing the reasoning and explains in a coherent way the possible, though multiple, origins of background beliefs.
Hence, our analysis reconciles the opposite ideas that analogical reasoning is a useful method, especially in science, and a fanciful method, especially in current reasoning. It may be both, in all circumstances, while using always the same inference scheme: the plausibility of the conclusion depends only on the plausibility of the background over-hypothesis used in this scheme. So we refute the idea that performing good or bad analogical reasoning depends on the fact of relying or not on good analogical statements.
Finally, our analysis departs from those who consider analogical reasoning to be a specific reasoning mode that is different from deduction and induction. On the contrary, it relies on the idea that it is a general inference scheme that can be based on different types of logic, deductive, probabilistic, or non-monotonic.
Further research can be done in some theoretical directions. As concerns analogical assessment, the formal definition of relative analogy may be extended to "structural analogy," which would generalize it to whole sets of analytical relations between several properties treated as variables within a "model." It would then allow our definition to deal with more complex scientific examples of analogy. As concerns analogical inference, one could analyze more closely the properties of the over-hypothesis that make it more probable and examine some non-monotonic frameworks that express the analogical inference scheme. It would also be interesting to study the relationships of the present analysis with artificial intelligence research on analogy and potentially close concepts, such as case-based reasoning.
Further investigations can also be done in empirical domains. On the one hand, one would do well to consider historical examples of how analogy is used in the process of science, in combination with other reasoning modes, especially in the "context of discovery" as opposed to the "context of proof." On the other hand, one could examine how scientific and popular analogies are refuted and revised over time, under pressure of new information and debate, until becoming widely accepted, or definitively discarded.