Polyhedral geometry and combinatorics of an autocatalytic ecosystem

Developing a mathematical understanding of autocatalysis in reaction networks has both theoretical and practical implications. We review definitions of autocatalytic networks and prove some properties for minimal autocatalytic subnetworks (MASs). We show that it is possible to classify MASs in equivalence classes, and develop mathematical results about their behavior. We also provide linear-programming algorithms to exhaustively enumerate them and a scheme to visualize their polyhedral geometry and combinatorics. We then define cluster chemical reaction networks, a framework for coarse-graining real chemical reactions with positive integer conservation laws. We find that the size of the list of minimal autocatalytic subnetworks in a maximally connected cluster chemical reaction network with one conservation law grows exponentially in the number of species. We end our discussion with open questions concerning an ecosystem of autocatalytic subnetworks and multidisciplinary opportunities for future investigation.

Chemical reaction network (CRN) theory offers a versatile mathematical framework in which to model complex systems, ranging from biochemistry and game theory [1], to the origins of life [2].The usefulness of CRNs in modelling these phenomena stems from its ability to exhibit a wide range of nonlinear dynamics and it is widely Figure 1: Ecosystems obtained by composing autocatalytic cycles.Each autocatalytic cycle consumes food and produce waste species.Two cycles can be composed in several ways, for example by sharing food (line one), the member species of one could be the food of other (line two), or the waste of one can deplete the food of another (line three).For a detailed account of such interactions and their ecological counterparts, see [7].
recognized that autocatalysis can be seen as a basis for many of them [3,4].Broadly, autocatalysis is framed as the ability of a given chemical species to make more copies of itself or otherwise promote its own production.Thus, from the perspective of kinetics, one would expect autocatalysts to be able to show super-linear growth, which is indeed possible [5,6].For some examples of nonlinear kinetics obtained by composing two autocatalytic cycles and their relevance to ecological dynamics, see Fig. 1.
A comprehensive understanding of autocatalysis in CRNs has great practical values, but its full mathematical treatment remains to be developed.In [6], Anderson et al. conjectured that it is impossible for a CRN to show a temporary speed-up of the reaction before settling down to reach an equilibrium if, at least, it is not formally autocatalytic.Recently, in [8], Vassena et al. show that this is not the case and there can be systems that show superlinear growth that do not exhibit autocatalysis.Also, it is suggested in [9] that understanding the interactions between autocatalytic and drainable subnetworks might provide a basis for obtaining a proof for the long-standing persistence conjecture [10].
Andersen et al. also defined a notion of exclusive autocatalysis for one species, which we generalize for multiple species.In [11], Blokhuis et al. defined a notion of auto-catalysis, more restrictive than exclusive autocatalysis, and showed that, within reversible and non-ambiguous CRNS, it has only five types of minimal forms.They also found that autocatalysis is more abundant than previously thought, and simple reaction networks with a few reactions can contain very many cores in them.
In this work, we focus our attention on the topological notion of autocatalysis, one that can be ascertained purely from the reactions in a CRN.A CRN is autocatalytic in a set of core (autocatalytic) species if there exists a flow on the network such that all core species are necessarily being consumed while being produced at higher rate.We term the subnetwork that consists of all the reactions that have a strictly positive contribution to such a flow as an autocatalytic subnetwork.By imposing further constraints on the subnetworks, we resolve three nested categories of autocatalysis: formal, exclusive, and stoichiometric autocatalysis.We show that stoichiometric autocatalysis is similar to the autocatalysis of Blokhuis et al. (see Remark 11), and rederive some of their results for the broader class of exclusive autocatalysis.In particular, we prove that minimal exclusively autocatalytic subnetworks have the property that along with all core species being produced at a higher rate than they are consumed, at least one core species is produced and consumed in each reaction of the subnetwork.We then investigate the polyhedral geometry of these minimal autocatalytic subnetworks (MASs), provide algorithms for exhaustively enumerating them, and propose visualization schemes for understanding their combinatorics.
We employ the techniques developed in this work on a framework, which we term the Cluster CRN (CCRN).A brief description of the motivation and formalism of the CCRN framework is as follows.Molecules are composed of atoms and atomic conservation is a fundamental law of chemistry.While molecular structure plays a critical role in determining what atomic interchanges are possible in particular environments (and their rate constants), focusing on just atomic composition provides a simple abstraction that, at some level, must resemble real chemistry.In the most basic version with a single conserved quantity (such as a monomer or an atom), the list of all potential reactions is dictated only by arithmetic.By adding other dimensions it is theoretically possible to represent a lot of real chemistry.Adding dimensions can allow for polymers composed of multiple monomer types or for molecular formulae composed of more than one element, where the dimension is the number of monomer types or elements, respectively.Other dimensions can also be used to represent structural variation within a molecular formula or spatial variation, although doing so requires adding constraints besides arithmetic on allowed reaction rules.We thus posit that the CCRN framework is a very flexible coarse-graining of chemistry.Formally, a CCRN is a CRN with positive integer valued conservation laws.The investigations undertaken in this work show that even the simplest version CCRN (in one di-mension) can reveal important principles regarding the abundance of MASs and their interactions.
The layout of the paper is as follows.In Sec.II, we briefly review chemical reaction networks (CRNs) and present a hierarchy of autocatalysis in their graphtheoretic and linear-algebraic manifestations.In Sec.III, we define MASs, prove their properties, organize them in equivalence classes, and explore some geometric quantities that are canonically defined for them.In Sec.IV, we consider CRNs with multiple MASs, which we term as an autocatalytic ecosystem.Given such a CRN, we provide a linear-programming algorithm to exhaustively enumerate the MASs and introduce a visualization scheme in which to represent their polyhedral geometry their combinatorics.In Sec.V, we present the cluster chemical reaction network (CCRN) framework and argue that it provides a natural coarse-graining of realistic chemical and biochemical reaction networks.We then explore the combinatorics of MASs within the CCRN framework and provide a worked example to demonstrate the geometry of autocatalysis in the stoichiometric subspace of species concentration.Using elementary number theoretic considerations, we also prove a combinatorial result about the list of MASs.Multiple examples from the CCRN framework are also included as examples at several points in this work.We conclude, in Sec.VI, with an overview of our contributions and possible avenues for future investigation.

II. CHEMICAL REACTION NETWORKS AND AUTOCATALYTIC SUBNETWORKS
The mathematical theory of chemical reaction networks (CRNs), pioneered by Horn, Jackson, and Feinberg [12] has ubiquitous applications in understanding nature [13,14].In particular, for diverse biological and chemical applications, the formalism allows for a notion of self-replication or autocatalysis where certain species cause an increase in the population of these same species.Consequently, there is value in rigorously understanding the organization of CRNs containing a collection of autocatalytic subnetworks, which we term as an autocatalytic ecosystem.
The concept of autocatalysis, dating back to the late 1800s [15,16], has several distinct formulations in CRN theory.In this section, we explain the correspondence between graph-theoretic and linear-algebraic formulations of CRNs, and employ it to present a nested hierarchy of three types of autocatalysis: formal, exclusive, and stoichiometric.In particular, we provide conditions on a subset of reactions (subnetwork) of a CRN for them to be termed autocatalytic for some subset of autocatalytic species.Note that, for a subnetwork to be autocatalytic, we require that all the reactions participate with a strictly positive flux.So, while a CRN can be said to be autocatalytic if it contains an autocatalytic subnetwork, in our formulation, a subnetwork might loose its autocat-alytic property if more reactions are added.This choice of definitions is made to simplify analysis (see Remark 11) and leads the way to an investigation of minimal autocatalytic subnetworks in the next section.

A. Preliminaries
Notation II.1.0 denotes a vector of zeros.
2. We call v non-negative and write 3. We call v semi-positive and write v > 0 if v i ≥ 0 and v ̸ = 0.
Notation II.3.For vectors v and w in R n , v ≫ w, v ≥ w, and v > w mean v − w ≫ 0, v − w ≥ 0, and v − w > 0, respectively.
2. Their span is denoted by Notation II.4.For a matrix M, we denote: 1. the set rows of M as rows(M).
2. the set of columns of M as cols(M).
3. the i th row as (M) i .
4. the j th column as (M) j .
5. the entry in the i th row and j th column as (M) j i .Notation II.5 (Restriction).Let M denote a m × n matrix with m rows and n columns.Let M ⊆ {1, . . ., m} and N ⊆ {1, . . ., n} be a subset of rows and columns, respectively.Then the restriction of M to the rows of M and columns of N is denoted as M M and M N , respectively.The simultaneous restriction of M to both sets M and N is denoted as Notation II.6 (Support).The support of a vector v, denoted supp(v), is the set of its non-zero coordinates.

B. Chemical reaction networks
Definition II.2.A chemical reaction network (CRN) G is defined as the triple of a species set, complex set, and reaction set (see [13]) where: 1. the species set S, of size |S|, consists of all the species that appear in the network, 2. the complex set C consists of complexes of the network.A complex is a formal non-negative integer linear combination of elements of S. The coefficients of the species in a complex form a vector in and denote the stoichiometry of the complex.
A complex y ∈ C will be taken to mean either of the following: and be denoted by a column vector.
3. the reaction set R ⊆ C × C consists of reactions of the network.A reaction is an ordered pair of complexes, and has the form or, equivalently Here r − , r + ∈ C and are referred to as the input and output complex, respectively.
Remark 1.Since the information of the set of complexes C is implicit in the set of reactions R, Deshpande et al. in [9] economically define a CRN simply by the pair G = (S, R).Here the complex set is given by taking the union of all the input and output complexes for all the reactions, Remark 2. Isomorphically, a CRN G = (S, C, R) can be represented as a directed multi-hypergraph [17,18] whose hyper-vertices are multisets of vertices according to the dictionary: CRN Hypergraph S Species Vertices C Complexes Hyper-Vertices R Reactions Hyper-Edges C. Input-output matrix pair and the stoichiometric matrix Recall that every reaction is of the form Input complex → Output complex.
Notation II.7.We denote a reaction r ∈ R as where r − , r + are column vectors in For any CRN, we can define S × R matrices S − and S + by collecting all the input and output complexes, respectively, where We refer to S − and S + as the input and output matrix, respectively.Observe that S − and S + have only nonnegative entries.
Lemma II.1.Each CRN G has a unique representation as where S denotes the species set and (S − , S + ) represent the input-output matrix pair.
Proof.Using Remark 1, a CRN G is given by the pair of species and reactions S, R. By definition, a reaction r = r − → r + is in the set R if and only if (iff) the r th columns of S + and S − are r + and r − , respectively.
Remark 3. Definition II.2 and the following remarks provide a graph-theoretic (GT) representation of CRNs, and Lemma II.1 provides a linear-algebraic (LA) representation of CRNs.
Definition II.3 (Stoichiometric matrix).The stoichiometric matrix S is a S×R matrix defined as the difference of the output and input matrices, Alternatively, for r ∈ R and using Eq. 2, each column of the stoichiometric matrix is given by the difference of the output and input vectors of a reaction, Definition II.4 (Stoichiometric subspace).The span of the columns or the image of the stoichiometric matrix is termed as the stoichiometric subspace.

D. Hypergraph flows and conservation laws
Definition II.5 (Flow).For a CRN G = (S, R), a column vector v ∈ R |R| ≥0 will be said to define a (hypergraph) flow on the CRN.
A flow on a graph can also be used to create a linear combination of the reactions, also called a composite reaction in [6].While, technically, an abuse of notation, let use use r to index the reaction set R and also refer to the r th reaction.Then, a flow denote the population of each species in the system, and v ∈ Z R ≥0 denote a flow on G. 1 The amount of species consumed and produced due to this flow are given by S − • v and S + • v, respectively.Thus, the change of species population ∆n resulting from the flow v on the graph G is given by Note that, from Definition II.4, the change in species population under an arbitrary flow on G lies in the stoichiometric subspace.
The right null space of S correspond to null flows which leave the population of the species unchanged (∆n = 0).An accounting of the right null space of S provides a topological classification for CRNs, which we review in Appendix A.
Definition II.6 (Null flows).A column vector w ∈ R |R| is said to be a null flow on the CRN G if The left null space of S, i.e. vectors x ∈ R |S| such that x • S = 0, correspond to conservation laws Equivalently, the inner product of the conservation law with the population x • n yields a conserved quantity that is an invariant for the network under an arbitrary flow.
Definition II.7 (Conservation law).For a CRN G = (S, R), a column vector x of size |S| will be said to be a conservation law if 1 We will use a notation of counts for definiteness and simplicity.A parallel construction goes through with concentrations and flow vectors ∈ R |S| ≥0 and can be found in [19].
The vector x will be called a positive conservation law or mass-like conservation law if all its entries are nonnegative.Moreover, if the entries are nonnegative integers, it will be called a positive integer valued conservation law.
Example II.1.Consider the CRN G = (S, R) given as For a graphical representation of this example, see Fig. 2. Then the species set S = {A, B} and the reaction set As can be read from R, the complex set C = {A + B, 2A, 2B}, and their corresponding vectors are The input and output matrices, S − and S + , respectively, for G are with left null space (conservation laws) spanned by x = [1, 1] and right null space (null flows) spanned by the basis {[1, 1, 0] T , [1, 0, 1] T }.Notice that the stoichiometric subspace is one-dimensional and the Euclidean innerproduct of the spanning vectors with the conservation law is zero.

E. Formal, exclusive and stoichiometric autocatalysis
Consider a CRN G = (S, R) with its stoichiometric matrix S.

2.
Every species that appears in the subset R ′ in G appears in S ′ .Equivalently, Remark 4. The stoichiometric matrix of the subnetwork is obtained from that of the network by keeping only the columns and rows corresponding to the reactions and species participating in the subnetwork, respectively.
Definition II.9 (Motif).A CRN G ′ = (S ′ , R ′ ) is a motif (or a subhypergraph) of G if (see [11]): 1.The reaction set of G ′ is a subset of the reaction set of G, R ′ ⊆ R.

2.
The species set of G ′ is a subset of the species set of G, S ′ ⊆ S.
Notation II.8.The stoichiometric matrix of a motif is obtained from that of the network by keeping only the rows and columns corresponding the species and reactions in the motif, respectively.Henceforth, we denote an arbitrary submatrix of the stoichiometric matrix corresponding to the motif (S ′ , R ′ ) with an overbar,

Formal autocatalysis
Definition II.10.R) is defined to be formally autocatalytic in the subset M of S ′ if (see [6,21]): 1.There exists a positive real flow (≫ 0) on G ′ such that the resulting composite reaction is of the form where 0 ≪ m ≪ n.Here m and n are the stoichiometries of the set M in the input and output of the composite reaction, respectively, and Remark 5.This definition can be generalized to semipositive flows, but we will not need that generalization for this work.(Also, see Remark 11.) ) is formally autocatalytic in the subset M iff: 1.There exists a flow v ≫ 0 on G ′ such that 2. All species in M are consumed in at least some reaction in R ′ , or Proof.Let the flow v ≫ 0 on the subnetwork G ′ result in a composite reaction of the form (F) + mM → nM + (W).
Condition 1 then yields that n − m ≫ 0, and condition 2 implies that m ≫ 0.
Thus, the conditions stated in the lemma imply that the subnetwork is formally autocatalytic.Conversely, suppose there is a flow v that leads to the composite reaction of the form and Remark 6.A matrix S = S| M for which there exists a v ≫ 0 such that is also referred to as a productive matrix in [11, Supplementary Information] and is known as a semi-positive matrix [22,Theorem 3.1.13].Notice that, productivity implies rows((S + ) M ) > 0.
Remark 7. The productivity condition can be restated as the condition that the interior of the cone induced by the columns of the stoichiometric matrix intersects the positive orthant in the autocatalytic species, interior(cone(cols((S)))) ∩ R M ≥0 ̸ = ∅.Remark 8.The pair (M, R ′ ) defines an autocatalytic motif.
Example II.2.Consider the reaction network G f = ({A, B}, {A → B, A + B → B}).Under the flow v = [1, 1] T , the resulting composite reaction is Correspondingly, the input and stoichiometric matrix for the CRN are respectively.It is easy to check that both requirements of Lemma II.2 hold true in the species set {B}.Thus the reaction network is formally autocatalytic for the set M = {B}.

Exclusive autocatalysis
Definition II.11 (GT).A formally autocatalytic subnetwork G ′ = (S ′ , R ′ ) of CRN G = (S, R) is defined to be exclusively autocatalytic in the subset M (see [6,9,23,24]) if: • Every reaction in R ′ consumes at least one species from the set M. This ensures that the flow is inadmissible, or there is no flow, if the population of any species in the set M is zero.
) is exclusively autocatalytic in the subset M if and only if: • The restriction of the input complexes to the set M is greater than zero, or Proof.It is clear that if the input complex r − > 0 for each r − ∈ cols((S − ) M ), then each reaction consumes at least one species from the set M, and vice versa.
Remark 9. If, for a reaction r − → r + in the exclusively autocatalytic subnetwork, the output complex has no species in the set M, then such a reaction can be removed from the subnetwork to yield a smaller exclusively autocatalytic subnetwork.Thus, 'smaller' (not necessarily minimal) autocatalytic subnetworks can be obtained by further imposing the condition: Remark 10.For a reaction network, a species set M that satisfies cols((S + ) M > 0 and cols((S − ) M > 0 is termed as a siphon by Deshpande et al. in [9].Moreover, if the network is also productive, the set M is termed as a self-replicating siphon.
Example II.3.One can verify that the CRN G in Example II.1 is: 1. Exclusively autocatalytic in the set M = {A}.
2.Not exclusively autocatalytic in the sets M = {B} or {A, B}.
Similarly, one can verify that the CRN G f in Example II.2 is not exclusively autocatalytic since the flow proceeds even if the population of B is 0.
2. For every reaction r − → r + ∈ R ′ : (a) each of r − and r + contain at least one species from the set M.
Lemma II.4.An exclusively autocatalytic subnetwork ) is stoichiometrically autocatalytic in the subset M only if: 1.Each column in S = (S + − S − ) M has at least one positive and one negative entry.
2. Each row in S has at least one positive and one negative entry.
Proof.The conditions in Definition II.12 implies that every column in the stoichiometric matrix S + − S − restricted to the autocatalytic set contains a positive and a negative entry.From condition 2 in Lemma II.2 and Remark 6, we know that every row in the input and output matrix pair restricted to the autocatalytic set is semi-positive, respectively.This, along with the condition that every reaction has distinct input and output species, yields that every row in the stoichiometric matrix restricted to the autocatalytic set must contain a positive and a negative entry.
Remark 11.The preceding lemma alludes to the equivalence of our notion of stoichiometric autocatalysis and the autocatalysis of Blokhuis et al. in [11].They term the first condition as autonomy of the submatrix S. The second condition implies that each autocatalytic species is consumed (and produced).However, there is a caveat.The productivity condition for Blokhuis et al. is defined over an arbitrary flow vector which can take negative values.In their treatment, they only consider reversible reaction networks where for a reaction with a negative flow, an opposite reaction with a positive flow exists.Since, we do not make such assumptions on our reaction network, we restrict our flows to be strictly in the positive orthant.
The condition cols((S − ) M ) > 0, ensures that the production of any species consumes at least one other species of the motif, disallowing unconditional growth.Notice that this condition is satisfied by exclusive and stoichiometrically autocatalytic subnetworks.However, the condition that the supports of the input and output complexes are disjoint makes the definition of stoichiometric autocatalysis more restrictive than exclusive autocatalysis.A useful feature of stoichiometric autocatalysis is that it can be inferred from the stoichiometric matrix without referring to the underlying CRN, due to which their criterion is termed stoichiometric autocatalysis [25].
Following the terminology in [9], if there exists a nonnegative flow such that S • v ≪ 0, we would term the motif (M, R ′ ) a drainable motif, where the change in concentration of all the species in the subset M is strictly negative.While we focus our attention on autocatalytic motifs, our results are easily extended to their drainable counterparts.In [9], the authors prove that a network is persistent if it has no drainable motifs and also remark on the importance of autocatalytic and drainable motifs for understanding the dynamics of a network in ecological terms.

III. MINIMAL AUTOCATALYTIC SUBNETWORKS
In the previous section, we gave the criteria for ascertaining when a subnetwork is exclusively autocatalytic.However, one autocatalytic subnetwork can have several subnetworks which are also autocatalytic.In Sec.III A, we define and prove some properties of minimal autocatalytic subnetworks (MASs).In Sec.III B, we demonstrate that these subnetworks organize within equivalence classes.Finally, in Sec.III C we investigate how the subnetworks within an equivalence class can be organized by analyzing what we define as their flow-productive, species-productive and partition-productive cones.

A. Properties
To define a MAS, we first define an autocatalytic core [11].Recall from Definition II.9 that a motif is an arbitrary subhypergraph which is obtained by selecting a subset of the reactions and species in the original CRN.
Definition III.1 (Autocatalytic core).An autocatalytic core is an exclusively autocatalytic motif that does not contain a smaller exclusively autocatalytic motif.
Remark 12.Note that our definition is a generalization of an autocatalytic core as defined by Blokhuis et al. as ours is defined for exclusively, as opposed to stoichiometrically, autocatalytic motifs.
Notation III.1.We denote the motif of the autocatalytic core A = (A C , R A ), its input-output matrix pair as (S + , S − ), and term A C as the core set.
Theorem III.1.A motif (A C , S + , S − ) is an autocatalytic core only if all the following conditions are satisfied: 2. Every species in the core set is consumed at least once, rows(S − ) > 0.
3. Every species in the core set is produced at least once, rows(S + ) > 0.
4. Every reaction in the motif consumes some species in the core set, cols(S − ) > 0.
5. Every reaction in the motif produces some species in the core set, cols(S + ) > 0.
Proof.First, notice that every species in the autocatalytic core must be in the autocatalytic set, otherwise it could be removed to yield a smaller autocatalytic motif.Using Lemma II.3, properties 1, 2, and 4 follow.As remarked in 6, property 3 follows from property 1.Finally, as remarked in 9, if a reaction does not produce species in the autocatalytic set, it could be removed without affecting productivity.
Furthermore, we can easily extend the proof of Proposition 2 in the Supplementary Information for [11] by Blokhuis et al. to show that (exclusively) autocatalytic cores are always square and invertible.
Lemma III.2.Every species in the core set is the only reactant of at least one reaction in the autocatalytic core.
Proof.Let there be a species in the core set that is never the only reactant of any reaction and always occurs as a co-reactant.Since this species is in an autocatalytic core, it must be produced by some reactions.This species occurs either as: 1. an only product of reactions.

a co-product of reactions.
If the species under consideration occurs as an only product of any reaction, then this reaction can be removed to yield a smaller motif which is autocatalytic in the remaining species.If the species only occurs as a co-product of a reaction alongside other species, then the species under consideration can be removed to yield a smaller autocatalytic motif.In either case, minimality of the autocatalytic core is violated, leading to a contradiction.
Recall from Remark 6 that the stoichiometric matrix of an exclusively autocatalytic motif is semi-positive.A semi-positive matrix is minimal semi-positive [22, Section 3.5] if no column can be removed resulting in a semipositive matrix.
Lemma III.3.The stoichiometric matrix of an autocatalytic core is a minimal semi-positive matrix.
Proof.Removing a column of the stoichiometric matrix corresponds to removing a reaction from the motif.
Suppose for the sake of contradiction that removing a column of the stoichiometric matrix of an autocatalytic core results in a motif with semi-positive stoichiometric matrix.
Then removing the species not consumed by any of the remaining reactions results in a smaller motif that also has a semi-positive stoichiometric matrix, and in which every species is consumed by some reaction.Moreover, any reaction consuming a species in the original motif still consumes a species in the smaller motif, so the resulting motif is exclusively autocatalytic, contradicting minimality of the core.
Theorem III.4.The stoichiometric matrix of an autocatalytic core is a square and invertible matrix.In particular, the number of reactions in autocatalytic core equals the number of species in the core set.
Proof.By Lemma III.3, the stoichiometric matrix of an autocatalytic core is a minimal semi-positive matrix, and so has linearly independent columns whose number is bounded by the number of rows [22,Theorem 3.5.6].Lemma III.2 shows the number of species, i.e. rows, is bounded by the number of reactions, i.e. columns.Thus the stoichiometric matrix is square with linearly independent columns, and hence also invertible.
Remark 13.The autocatalytic cores of [11] are autocatalytic cores in our sense that are stoichiometrically autocatalytic and satisfy the additional property that no smaller motif becomes autocatalytic after reversing any of its reactions.Blokhuis et al. provide a graphical characterization of the stoichiometrically autocatalytic cores and show that they can only be of five types.In Fig. 3 we have listed all the minimal autocatalytic subnetworks for the complete 1-constituent CCRN (up to a maximum cluster size of 4) and have color coded them by their types.
In a CRN, an autocatalytic core will generally occur embedded in a set of reactions.From Definition II.8, recall that a subnetwork is obtained from a CRN by selecting a subset of reactions and retaining all the species that participate in the subset.We define minimal autocatalytic subnetworks as the following.

Definition III.2 (MAS).
A minimal autocatalytic subnetwork (MAS) is defined to be the subnetwork with the least number of reactions containing a particular autocatalytic core.
Remark 14.For a particular autocatalytic core, a MAS loses its autocatalytic property over all the core species if any reaction is removed.Moreover, if any reaction is added, the subnetwork is not minimal.Thus, a MAS contains exactly the number of reactions as its particular core.The core types are also color coded using the typology of Blokhuis et al. in [11].In a CCRN, each species is denoted by a vector of positive integers with an overbar, and each reaction satisfies certain conservation laws.For a definition of a CCRN, see Sec.V.The list of reactions can be found in Appendix E Table III.
Remark 15.A MAS can also contain two distinct but overlapping autocatalytic cores.A simple example of such a network is: Observe that, while G is a MAS, it contains two autocatalytic cores consisting of autocatalytic species sets {A, C} and {B, C}.
Remark 16.Observe that, using Theorem III.1, an inflow reaction of the type or an outflow reaction of the type can never be in the reaction set of a MAS.We return to this point in Sec.IV A.
Example III.1.Consider the CRN G given by the reaction set The input-output matrix pair and the stoichiometric matrix for this CRN are given by: It can be verified that this CRN is exclusively autocatalytic but not stoichiometrically autocatalytic.Note, moreover, consistent with Theorem III.4, the stoichiometric matrix is square and invertible.
Example III.2.Recall Example II.1 with the stoichiometric matrix It can be verified that this CRN is exclusively autocatalytic but not stoichiometrically autocatalytic.To convert this to a stoichiometrically autocatalytic network, we can modify the network by replacing each reaction whose reactant and product complex shares a species with two new reactions with a fictitious distinct intermediate species.This then yields a new network G * given by where C, D are the newly added fictitious species.The stoichiometric matrix for G * is given by Notice that the restriction of the stoichiometric matrix to the species set S ′ = {A, C} and the reactions set , indeed satisfies the properties for an autocatalytic core, and thus G * is indeed autocatalytic with core species {A, C}.A similar construction shows that the network is also autocatalytic in the set {B, D}.(In the typology of Blokhuis et al. both of these cores are of type I.)

B. Food-waste-member-core partition
Let G = (S, R) be a CRN and A = (S A , R A ) be a MAS autocatalytic in the core species A C .From Theorem III.4,we know that the number of core species is the same as the size of R A .In general, however, there will also be disjoint subsets of S A of species that only occur as co-reactants, co-products, and both co-reactants and coproducts in R A .As explained below, given a MAS and an associated autocatalytic core, we can partition the species set into food-waste-member-core (FWMC) sets.

Definition III.3 (FWMC)
. Let the stoichiometric matrix of MAS A = (S A , R A ) with a particular core be denoted by S. Denoting the i th row of a matrix M as (M) i , we define: 1. the species set corresponding to all rows of S with nonpositive coefficients as the food set and denote it by A F , 2. the species set corresponding to all rows of S with nonnegative coefficients as the waste set and denote it by A W , 3. the species set corresponding to all rows of S with both positive and negative coefficients as the member set and denote it by A M .
4. the species set corresponding to all rows of S as the core member set and denote it by Remark 17.A MAS with distinct, but overlapping, autocatalytic cores (for example, see Remark 15), can have a non-unique FWMC partition.
Notation III.2.We will refer to the submatrices of the stoichiometric matrix generated by restricting to the food, waste, member, and core species, as S F , S W , S M , and S C , respectively.
Notation III.3.We denote the exclusion of a subset K from the species set by A /K .
For example, all species in the species set of an autocatalytic subnetwork A except the core member species will be denoted by A /C .Note that the core member set, or simply core set, is a subset of the member set, or A C ⊆ A M .We refer to the species in the member set that are not in the core set as non-core member species and denote their set as A M/C .For an example of the partitioning process on a reaction network where each set is non-empty, see Fig. 4.
We refer to the above partition of the species set for a particular autocatalytic core in a MAS as the foodwaste-member-core (FWMC) partition of the MAS.Notice that equality of the FWMC partitions of multiple autocatalytic cores can serve as the basis for an equivalence relation.Remark 18.In the rest of the work, unless specified otherwise, we will assume that a MAS has a unique autocatalytic core.Under this assumption, each MAS can be uniquely assigned its FWMC partition.
Our quadpartite partition of the species set can be coarse-grained to a bipartite resource-member partition where the resource set is the union of food and waste sets (and member set is the same as above) (see Fig. 4).Our partitioning of the species set by examining the entries of the stoichiometric matrix is cognate with the partitioning done by Avanzini et al. in [26], where they partition the set of species into a resource-member partition.Also, it is a refinement of, the food-waste-member partitioning by Peng et al. in [7], where it is shown that different interactions between cycles can be defined based on which types of species are in common.For example, competition applies when two MASs share a food species or a waste species, mutualism, when the food of one is the waste of another, and predation/parasitism, when a member of one is the food of another (see Fig. 1).
Example III.3.In the network G * from Example III.2, the MASs are: Notice that both these subnetworks have empty waste and non-core member species sets.

C. Flow, species and partition productive cones
Recall from Notation III.2 and Theorem III.4 that S C is invertible.The chemical interpretation of the inverse is that the k th column is a flow vector that increases the k th species by exactly one unit, making it an elementary mode of production.Proof.The stoichiometric matrix is a square minimal semi-positive matrix by Lemma III.3 and Theorem III.4,so its inverse has non-negative entries by [22, Corollary 3.5.8],and the columns are non-zero non-negative so semi-positive.

Flow-productive cone:
Definition III.5.The flow-productive cone, denoted by F, is defined to be the cone generated by the elementary modes of production of an autocatalytic core in a MAS, ( Remark 19.Using Theorem III.5, the inverse of S C never contains negative entries.Thus F(A) lies in the nonnegative orthant.By definition, an element in the interior of the flowproductive cone, the flow productive region, of A corresponds to a flow vector for which A is productive, i.e. the core member species of A are strictly produced.

Species-productive cone:
Definition III.6.The species-productive cone, denoted by P, is defined to be the image of the flow-productive cone of an autocatalytic core in a MAS under its stoichiometric matrix S, Therefore, each vector in the interior of the speciesproductive cone for a MAS, which we refer to as the species-productive region, describes a change in concentration of the participating species for which the subnetwork is productive in the core member species.Remark 20.The species-consumptive cone for the drainable subnetwork A op , which is obtained by reversing the edges of the autocatalytic subnetwork A, is the opposite of the species-productive cone: Partition-productive cone: Consider a MAS with species set S. Let us denote the FWMC partition of S as T = {T F , T W , T M/C , T C }. Let e i be a column vector of size S with 1 at the i th position and 0 elsewhere.We define the set of basis vectors B T for the partition T to be where The dimensions of the embedding space of the flow and species productive cones are the number of reactions and species in the minimal autocatalytic subnetwork, respectively.Notice that the food species B is strictly consumed (nonpositive) and the core species A, C are strictly produced (nonnegative) in the species-productive region.For this example, the partition-productive cone is identical to the species-productive cone.
Thus the basis B T for a partition T spans the negative half-space of each species in the food set, the positive half-space of each species in the waste and core set, and both negative and positive half-spaces of each species in the non-core member set.Let the reaction network have conservation laws {x i }, such that x i • S = 0.
Definition III.7.The partition-productive cone for the partition T , denoted by Q(T ), is defined to be the cone generated by the basis vectors restricted to the stoichiometric subspace As noted in the previous subsection, belonging to an FWMC partition is an equivalence relation.By definition, the species-productive cones of all equivalent autocatalytic subnetworks belonging to the partition T will lie in the partition-productive cone Q(T ).
Example III.4.In this example, we will identify the flow-productive, species-productive and partitionproductive cones for the MAS A 1 from Example III.3 (shown in Fig. 5).For The flow-productive cone is and the species-productive cone is Notice that the graph G * (from Example III.2) has a positive conservation law x = [1, 1, 2].The partition-productive cone for the partition For this example, the species-productive and partitionproductive cones are identical, which is an instance of a general result proved in Theorem IV.4.

IV. ORGANIZING AN AUTOCATALYTIC ECOSYSTEM
In the previous section, we have defined and proved properties of minimal autocatalytic subnetworks (MASs), or minimal CRNs that contain one autocatalytic core.Even mildly complicated CRNs can exhibit an abundance of autocatalytic cores [25,27].
Definition IV.1.We define an autocatalytic ecosystem to be a CRN containing one or more minimal autocatalytic subnetworks (MASs).
In this section, we investigate some mathematical properties of autocatalytic ecosystems and give computational algorithms to detect and organize them.In Sec.IV A, we prove some mathematical results about MASs.In Sec.IV B and Appendix D, we provide algorithms to exhaustively enumerate the MASs and identify their species-productive cones.Finally, we discuss the polyhedral geometry of an autocatalytic ecosystem and introduce a visualization scheme in Sec.IV C.

A. Mathematical results
To understand the geometry of the different cones defined in the previous subsection, we prove some results about their behavior.First, in Proposition IV.1 we give the conditions under which two MASs with different partitions will have non-intersecting species-productive regions.Next, in Theorem IV.4, we explore conditions under which the species-productive cone is identical with the partition-productive cone for a MAS.Under such conditions, the partition itself contains the information of the species-productive regions of all the MASs in that equivalence class.Finally, in Proposition IV.5, we clarify the topological properties a CRN must posses in order for the species-productive cones of two MASs to intersect.In Sec.IV C we will use these results to construct a visualization scheme for the list of MASs in any CRN.
Proposition IV.1.Two autocatalytic subnetworks with different food sets have disjoint species-productive regions if their non-core member species sets are empty.
Proof.We will show that if two autocatalytic subnetworks A and B have distinct food sets and their non-core member species sets are empty, then the interiors of their species-productive cones (species-productive regions) do not intersect, Recall that that the species-productive cone of the subnetworks lies within their partitionproductive cone Q(FWMC(A)) and Q(FWMC(B)).Let y be the vector of length S such that Then, by definition of the partition-productive cones, y • v > 0 and y • w ≤ 0, where v and w are any vectors in int(Q(FWMC(A))) and int(Q(FWMC(B))), respectively.Thus, y defines a hyperplane separating the two species-productive regions, and the two regions do not intersect.
Remark 21.If the non-core member species set is not empty, then the result is no longer true.For instance, see Example IV.2.
Example IV.1.Using Examples III.3 and III.4, the species-productive cones for A 1 and A 2 in the species set {A, B, C, D} are If we let y = [0, −1, 0, 0], then y • int(P(A 1 )) > 0 and y • int(P(A 2 )) < 0. Thus y defines a separating hyperplane, and the two species-productive regions are disjoint.
Lemma IV.2.The restriction of the species-productive cones of all MASs to their core (member) species is the non-negative orthant in the core species.
Proof.Recall that the species-productive cone of a MAS is defined as By definition of the matrix inverse, the restriction of the product to the core species is the identity matrix, Thus, the species-productive cone restricted to the core species is the non-negative orthant.
As remarked in Remark 16, a MAS cannot contain an inflow or outflow reaction.Moreover, for material CRNs, where the species are made of constituents that are conserved in a reaction, a MAS must also possess at least one conservation law.The following results are applicable for such MASs.(In Sec.V, we develop the formalism for CRNs with non-negative integer conservation laws and find their application in the examples.)Lemma IV.3.If a CRN has any positive conservation laws, the union of the food set and the non-core member set of any MAS must be non-empty.
Proof.Let A be a MAS, with associated stoichiometric matrix S, of a CRN with a positive conservation law given by the vector x.Assume for the sake of contradiction that A does not have any element in the food set A F or the non-core member set A M/C .For any flow v in the productive region of A, we know that S • v ≫ 0. Also, since x is a positive conservation law, x • S • v != 0. Since the sum of positive values cannot add up to zero, we have a contradiction.Thus, there must be at least one element in the food set or the non-core member set of a MAS with positive conservation laws.(See example IV.2.) Theorem IV.4.For a CRN with positive conservation laws, the species-productive cone of any MAS with exactly one more species than the core set is identical to its partition-productive cone.
Proof.Let A be a MAS of a CRN with positive conservation laws {x i } with exactly one more species than the core set, denoted by C = {C 1 , . . ., C C }. From Lemma IV. 3, we know that the extra species must be either in the food set or non-core member set of A. In either case, the species must be net consumed by the subnetwork to respect the positive conservation laws, and we denote it by f .Let us label the species set as {f, C 1 , . . ., C C }. Recall that the partition-productive cone is defined as Note that the additional basis vector for the non-core member species with positive entries is omitted since we know that it must be consumed in the species-productive region to respect the positive conservation laws.Moreover, using Lemma IV.2, we know that the restriction of the species-productive cone to the core species is the non-negative orthant in the core species.In particular, for the species-productive cone where f is a row vector [f 1 , . . ., f C ] and I C is the identity matrix of size C. Since the CRN also has conservation laws, these coefficients must satisfy for every conservation law indexed by i.But these are simply the basis vectors of the partition-productive cone B when made orthogonal to the conservation laws.Thus the partition-productive cone is identical to the speciesproductive cone.
Remark 22. Reversible CRNs are worth considering insofar as chemistry often assumes that all reactions are theoretically reversible (even if the rate constants of forward and reverse reactions differ greatly in magnitude).
Proposition IV.5.If the species-productive cones of two MASs have an intersection, the reversible CRN of their union has a null flow.
Proof.Let the productive cones of two MASs A and B, have a non-empty intersection, P(A) ∩ P(B) ̸ = ∅.Let A ∪ B be the subnetwork obtained by taking the union of the two subnetworks, and let S A∪B be the associated stoichiometric matrix.Then, there must be non-identical flows v A and v B with support in A and B, respectively, such that This implies that S A∪B • (v A − v B ) = 0, and thus the kernel of the stoichiometric matrix is non-empty.In general, the difference of their flows (v A − v B ) can have negative entries, however, it yields the interpretation of a null flow on a network if the union of the two subnetworks A ∪ B is made reversible.

B. Algorithms
In what follows we describe a mathematical optimization based approach to enumerate the whole list of exclusively autocatalytic cores in a CRN.The model that we propose uses the following variables: These variables are adequately combined by means of linear constraints to ensure the verification of the properties of Theorem III.1.Specifically: • The species in the Core set (y i = 1) are the productive species of the selected reactions: Note that in case y i = 1, the linear inequality is activated ensuring that the i-th species is produced in net.Otherwise, the constraint is redundant.
• All species in the Core set are both consumed and produced by the reactions in the core: r∈R: The two sets of inequalities above ensure that in case y i = 1, there must exist reactions in the core (those with z r = 1) for which species i with (S + ) r i > 0 (for (9)) and (S − ) r i > 0 (for ( 10)).• All reactions in the core both consume and produce core species: i∈S: i∈S: In case reaction r is part of the core (z r = 1), there exists at least one produced core species (inequality (11)) and at least one consumed species (inequality ( 12)).
• The non-selected reactions have zero-flow: In case reaction r is not included in the core (z r = 0), the flow component for this reaction is forced to be zero (v r = 0).Otherwise, the flow is unrestricted (v r ≤ 1).
Then, the master mixed integer programming (MILP) problem that allows us to generate the entire list of exclusively autocatalytic cores consists of minimizing the number of reactions in a network with all the above conditions: 8) − (13), The general procedure consists of sequentially solving the above optimization problem and adding to it new linear constraints to allow the generation of different cores.A flowchart of our procedure to construct the minimal exclusively autocatalytic cores is shown in Fig. 6.Define a list of sets of incompatible reactions, Incomp, initialized to the empty set, that stores, at each iteration, the sets of reactions that are part of the already computed MASs, and avoids generating MASs containing those sets of reactions.Then, in the first step, our algorithm computes an autocatalytic subnetwork by selecting a set of core species and reactions taking part in a subnetwork with minimum number of reactions.The obtained subnetwork is stored, and its FWMC partition is computed.This set of reactions, say R ′ , is added to Incomp, and the procedure is repeated, by adding the following constraint to the MILP to assure that the subnetworks obtained in future iterations do not contain any of the sets of reactions in Incomp: The algorithm stops when there does not exist sets of reactions not incompatible with those in Incomp.Minimality is assured by the minimization of the number of reactions in the subnetwork in each run and by the conditions imposed by the set Incomp.Note that the complete enumeration of all the possible combinations of reactions/species that may be included in a MAS, and the detection of a MAS is computationally prohibitive in practice.However, our procedure avoids this enumeration by solving an MILP at each iteration.Moreover, if one is interested on finding just a certain number, k, of MAS, it can be done by terminating the algorithm after k iterations and giving as output the k obtained MASs.The identification of the FWMC partition for a given autocatalytic subnetwork results from checking both the signs of the restricted stoichiometric matrix, S and the production vector Sv * for the flow v * also obtained after each iteration.Specifically, for each species i in the subnetwork: if all the components in S i are non positive (resp.non negative), i is classified as a food (resp.waste) species; in case Si has negative and positive entries, i is classified as a member species; and if S i has negative and positive entries and (Sv * ) i is strictly positive, then i is a core species.
Once the set of MASs is obtained, they are classified by means of their FWMC partitions.(Additional checks must be made to determine if there is more than one autocatalytic core in a MAS.In this case, the FWMC partition of the subnetwork is non-unique, and the MAS must be assigned to all the FWMC classes to which its autocatalytic cores belong.)Within each FWMC class, we check the pairwise intersection of the species-productive cones.For each subnetwork A 1 , A 2 in the same class, we first check whether the cones P(A 1 ) and P(A 2 ) intersect by finding a nonzero vector shared by both cones.In case the cones intersect, we check whether the cones are identical, one is contained in the other, or they partially intersect.This check is performed by computing the distances between the (normalized) generators of the cones.If all these distances are zero, the cones are identical; if one of the sets of distances is zero but the other is not, one of the cones is contained in the other; but if in both sets there are positive distances, the cones partially intersect.Using the same procedure, we also check the pairwise intersection of the partition-productive cones of each equivalence class.Further details on the described approaches are provided in Appendix D 1.

C. Geometry and visualization
In the last subsection, we proposed an algorithm that, given a stoichiometric matrix of a CRN, outputs a list of the MASs it contains.In this section, we will explain how to take the list of MASs and visualize their combinatorics and geometry.Recall that for each MAS, we define a flow-productive cone in the flow space of the graph where each core member species is strictly produced, and a species-productive cone on the space of changes in concentration (population).Geometrically, the space of changes in chemical concentration (i.e., the velocity space of chemical concentrations) is the stoichiometric subspace of the CRN.Thus, the list of MASs can be seen as yielding a partial polyhedral decomposition of the stoichiometric space induced from the flows on the hypergraph (CRN).We remark that, for a more complete decomposition one must also consider the consumptive cones of the minimal drainable subnetworks.However, it is not necessary that the union of the autocatalytic and drainable subnetworks span the complete stoichiometric space (for example, see Sec.V B 2 Fig. 14).
In subsec.III B, we showed that each subnetwork can be assigned its FWMC equivalence class(es).As explained in Proposition IV.1, while there are cases where the species-productive cones of different equivalence classes can be shown to not intersect, in general the productive regions of different equivalence classes can intersect.To visualize the list of equivalence classes to which the MASs belong, we run the algorithm for finding an intersection between each pair of partition-productive cones and obtain a two-dimensional square matrix, C pp , of dimension equal to the number of equivalence classes.Let us denote the list of equivalence classes by L. The entry of C pp in the i th row and j th columns is given by, Notice that any asymmetry in entries across the diagonal indicates that only one of the partition-productive cones completely contain the other.This matrix can then be visualized as a heat map, for example see Fig. 7.
Each equivalence class can contain several MASs.The species-productive cones of the MASs in a class are not always identical.While they will share the same projection on the core species, the productive regions in the non-core species can be very different.For example, let us pick the equivalence class {{4}, {1}, {{2, 3, 5}}} from the list of classes shown in Fig. 7. From the figure, we know that it contains 5 MASs.We plot the projection of the species-productive region in the non-core species in the top panel of Fig. 8.In higher dimensions when more non-core members are involved, this representation can get rather cumbersome.Thus, we employ a similar visualization as C pp to depict the intersection of MASs within an equivalence class.Whether or not there is an intersection between the species-productive cones can be ascertained using our algorithm outlined in the previous subsection.For an example of the resulting visualization for the same equivalence class considered above, see the bottom panel of Fig. 8.In the same manner, a visualization for the information of pairwise intersection of the productive cones of all MASs for a CRN can also be obtained, for example see Fig. 9.
Example IV.3.Consider the complete L = 5 1-constituent CCRN of order two (introduced in Section V).The list of all the FWMC classes, the number of MASs they contain, and their intersection information is summarized in Fig. 7. Consider the class {{4}, {1}, {{2, 3, 5}}}.It contains the MASs: The projection of the species-productive cones for the above MASs and their intersection information is summarized in Fig. 8.

V. CLUSTER CHEMICAL REACTION NETWORKS (CCRN) FRAMEWORK
All fully-balanced chemical reactions conserve the number of atoms of each type.Often, but not always, reaction models of polymerization and hydrolysis also conserve the number of each monomer.The existence of such conservation laws allows us to coarse-grain the underlying detailed CRNs, projecting away the bond structure of each molecule or polymer and only counting the number of atoms or monomers, respectively.Following terminology from Chapter 7 of [28], we will refer to this coarse-grained description of each species as a cluster (see Fig. 10).This also allows us to coarse-grain a realistic CRN into a new CRN which has clusters as species and reactions between multisets of clusters induced from the reactions in the original CRN.We will call the resulting CRN, a cluster chemical reaction network (CCRN) obtained from the original CRN.
There are three advantages of considering CCRN descriptions of CRNs.First, consider a CRN whose species set consists of all linear polymers formed of D monomers up to length N .Then, the CCRN induced by the CRN would reduce the number of species from exponential (D N ) to polynomial (N D ) in the length of the polymer, reducing number of species and changing the connectivitity in the reaction graph from a tree to a lattice for many reaction models.Secondly, since each reaction in the original CRN induces a reaction in the CCRN with the same topology, the mapping would preserve the autocatalytic property.In other words, if a CRN is autocatalytic, its induced CCRN will also be autocatalytic.However, since it is a many-to-one map, multiple autocatalytic subnetworks of the CRN might map to the same autocatalytic subnetwork in its CCRN or the CCRN might contain MASs that do not correspond to any motif in the CRN (see Example V.1).Thirdly, one can later re-introduce structure into the CCRN by adding more species with the same cluster counts (conserved quantities).In this way, the coarse-graining can be systematically refined to recover the original CRN by adding new dimensions, and allows a gradual complexification of the model.The CCRN obtained by counting the number of As and Bs in each string and representing the resulting clusters Note that CCRN(G) * has identical information to the original CRN and should, therefore, yield identical inferences to the original CRN.
We formally define a CCRN in Sec.V A, and systematically explore the properties of CCRNs with one conserved quantity (type of atom or monomer) in Sec.V B. We also provide the statistics of autocatalytic subnetworks found in such reaction networks and present a workedout-example in Sec.V B 2. We briefly discuss the computational challenges in scaling the algorithm for fully connected models in Sec.V B 4 and introduce rule-generated CCRNs in Sec.V C.

A. Formalism
A cluster chemical reaction network (CCRN) is a CRN with the additional structure that, upon excluding the inflow and outflow reactions (see Remark 16), the network has at least one nonnegative integer conservation law (see Sec. II D).Each conservation law x yields a conserved quantity x • n for a population in state n.Each conservation law corresponds to a type of constituent that is conserved, and the magnitude of the conserved quantity denotes the amount of that constituent in the population state.We denote the number of distinct constituents by D and label them as A 1 , . . ., A D .The species of a CCRN, termed clusters, are multisets of constituents and denoted by an overline (notation chosen to be consistent with [29]) over a vector of nonnegative integers representing the number of each constituent in the cluster.Notation V.1.n := n 1 , . . ., n D will be used to denote a cluster comprising n 1 , . . ., n D of constituents A 1 , . . ., A D , respectively.
Denoting the population state with exactly one n type particle as 1 n , if the conservation laws are labelled x d where d takes values from 1 to D, then A cluster thus corresponds to the vector of its conserved quantities.In chemistry, the compositional formula of a chemical is an example of its cluster representation.For example, the cluster of water or H 2 O, in a basis of constituents {H, O} will be represented as 2, 1.Here the first and second conservation laws counts the number of H and O atoms, respectively.In case of multiple clusters with identical vectors of conserved quantities, we distinguish them with an asterisk * .For example, in Table II,  1 and 1 * refer to a monomer and an activated monomer, respectively.Definition V.1.We define the length of a cluster to be the sum of its conserved quantities, The length of a cluster is a scalar quantity.
In a CCRN, complexes are multisets of clusters and are denoted by As an abuse of notation, we denote the stoichiometry of complex c α also by c α , where now it is the column vector with entries c α n .Definition V.2.The size per constituent of a complex is the vector sum of conserved quantities of each type, and is denoted Definition V.3.The width of a complex is the total number of clusters in the multiset, and denoted by In a CCRN, every reaction must be such that the sizes per constituent of the source and target complexes are identical.Thus, a reaction c α → c β is allowed only if This restriction is required in order for the constituents to be conserved quantities, as For a CCRN H D : • The length of the CCRN is defined to be the maximum length of its clusters, and denoted • The order of the CCRN is defined to be the maximum width of its complexes, and denoted by Definition V.5.We define a complete CCRN of length L and order w to consist of all species up to length L and all possible reactions that are allowed by the conservation laws with complexes of width up to w.
Remark 24.Since any reaction consisting of more than two reactants or products can be written as chains of second-order reactions, for our analysis we will restrict to CCRNs of order two.Notice that this restriction ensures that any exclusively autocatalytic subnetwork will also be stoichiometrically autocatalytic because the reactants and products are necessarily distinct.Let H L 1 = (S, C, R) denote a complete 1-constituent CCRN of length L and order two.1-constituent CCRN of length L means that the species set consists of clusters up to length L, denoted S = {1, 2, . . ., L}.By order two, we mean that all complexes are at most of width two, and the set of complexes is denoted as C = {a} ∪ {a + b} for a, b ∈ S and b ≤ a ≤ L. By a complete CCRN, we mean that the set of reactions contains all the possible reactions that are allowed by the conservation law, i.e.
The first few linkage classes of the complete 1-constituent CCRN of order two are shown in Fig. 11 and the complete CCRN for L = 4 is highlighted in yellow.The information about the reaction networks and the MASs that they contain for the complete 1-constituent CCRNs of L ranging from 3 to 7 is collected in Table I and Fig. 12.
Consider the complete 1-constituent CCRN with L = 4 of order two, the reaction network for which is shown in Fig. 11.Notice that this subnetwork has 5 linkage classes, 11 complexes, and 14 reactions.An application of the algorithm for enumerating all the MASs of the CCRN discussed in Sec.IV B yields a list of 20 subnetworks, as shown in Fig. 3.In the list, there are 9 MASs with two reactions and 11 MASs with three reactions.Since the CCRN has 4 species, and 1 conservation law, the stoichiometric subspace is of dimension 3.Moreover, due to the conservation law, from Lemma IV.3 each MAS must possess at least one non-core species.This means that there cannot be any 4 reaction MAS as it would require at least 5 distinct species, which would be a contradiction.
Upon obtaining a list of MASs for a given CRN, we would like to understand how their species-productive and partition-productive cones intersect.Observe that each MAS in the L = 4 CCRN considered above has a unique autocatalytic core.For the CCRN, using Theorem IV.4, any MASs with the same FWMC partitions have identical species-productive cones and it is identical to the partition-productive cone.Thus, we do not show any plot for the species-productive cone intersection data.Moreover, any two partitions consisting of the same core of 3 species must have the same partition-productive cone, as the stoichiometric subspace is of dimension 3.For example, the partition-productive cones of {{1}, {}, {{2, 3, 4}}} and {{}, {}, {1, {2, 3, 4}}} are identical.The intersections of partition-productive cones for different FWMC partitions for the complete 1constituent CCRN with L = 4 are shown in the Fig. 13.
Unlike the L = 4 case, for the complete 1-constituent CCRN with L = 5 of order two, there can be equivalent MASs with identical FWMC-partitions whose speciesproductive cones do not intersect.Moreover, it is not a priori clear whether the partition-productive cones of different FWMC partitions will intersect.This is why we employed our visualization scheme in Fig. 7.As described in Sec.IV C, for a particular equivalence class we also show the intersection information of the speciesproductive cones in Fig. 8.

L=3 1-constituent CCRN
Consider the complete 1-constituent CCRN with L = 4 of order two, given by This network has two MASs, namely Since the CCRN has one conservation law, the stoichiometric subspace is of dimension 2 and perpendicular to [1, 2, 3] T .The two-dimensional stoichiometric subspace is shown in Fig. 14.The 6 rays emanating from the origin, labelled by their directions in the 3−dimensional space, form the edges of productive cones where one species is consumed and another is produced.The speciesproductive cones of the two autocatalytic cycles are also labelled.Note that the species-productive regions is the complete cone within the bounding rays, and the finite boundary is drawn simply to enhance visualization.For example, the species-productive cone of the MAS A 1 is bounded by the rays [−2, 1, 0] and [−3, 0, 1].Notice that the FWMC partition of A 1 is {{1}, {}, {{2, 3}}}, consistent with the species-productive region that strictly consumes species 1 and strictly produces species 2 and 3.
We also want to make a few remarks about details that are not visible in Fig. 14.Firstly, for every MAS, there is a minimal drainable subnetwork obtained by reversing the reaction edges.The species-productive cones of these drainable networks will be the negatives of their autocatalytic counterparts.Secondly, notice that there is no MAS with the partition {{2}, {}, {{1, 3}}}.Had there been one, as would have been the case had we allowed reactions of order 3, then its species-productive cone would be bound by the rays [0, −3/2, 1] and [1, −1/2, 0].
Notice from Table I that the complete CCRN with L = 3 has a deficiency of one.This means that under mass-action kinetics, it has the potential to exhibit multistability.To investigate this, we used Feinberg's deficiency one algorithm [30] in Appendix C and obtained rate constants for which the CCRN has multiple steady states.An example of multiple steady states and the rate constants using which they are obtained are reported in Fig. 15.

Number theoretic result for 1-constituent CCRN of order two
Since CCRNs are defined by number conservation laws, finding autocatalytic subnetworks in any CCRN corresponds to finding number-valued solutions of systems of equations.In this subsection we will show that if there exists a two-reaction MAS in a partition of a 1constituent CCRN, then it must be unique.Lemma V.1.Let A be a 2-reactions MAS in a 1constituent CCRN of order two.Then, there does not  Proof.Let A C = {i, j} be the core species in A (assume without loss of generality that i > j).The possible 2reaction MASs with that set of core species are given by reactions of the form Note that each non-core species can appear with the value 0, indicating the absence of a reactant.However, due to Lemma IV.3, the conservation law will require that at least one food species is non zero.Observe, also, that in both types of potential MASs there is a reaction which is uniquely determined by ī and j (being then l3 and m1 also unique for this choice of the core set).Concretely l3 = 2i − j and m1 = 2j − i.Note that these reactions are different since it is assumed that i ̸ = j.Thus, the above reactions reduce to: Now we will do a case-by-case analysis to show that each FWMC-partition will contain at most one solution of the systems of equations.
Note that the only possible partition of the species induced by the reactions in M 1 is: F W M C(M 1 ) = { j + l− ī, 2i − j}, { l}, {{ ī, j}} whereas the partition for M 2 is: F W M C(M 2 ) = {2j − i, ī + m − j, }, { m}, {{ ī, j}} From both types of partition, the only possibility for which both coincide is that l = m and ī = j contradicting the assumption that i > j.
Given two different species i, j with i > j ≥ 1, the above result also provides a way to construct all the 2reactions MASs with those species being the core species.The explicit reactions taking part of the MASs are in the form: for all m ∈ [0, L + j − i] ∩ Z.We wish to remark that for H 2 and other CRNs with a large number of reactions, generating the whole set of autocatalytic subnetworks is computationally challenging.As the number of reaction increases, the dimension of the space where the MASs are to be found also increases, and even finding a single MAS in the network implies solving a difficult optimization problem that might be computationally costly.Even if finding the MASs is polynomial-time solvable (which does not seem to be the case), the complexity of the enumeration algorithm may turn into exponential (for e.g., see [31]), since the number of subnetworks increases considerably with the number of species and reactions.Thus for practical reasons, it may be more feasible to consider sparser rule-generated networks, as explained in the next subsection and shown in Fig. II.

C. Rule generated CCRNs
A naive computational modelling of artificial chemistry [32] often runs into issues of memory, storage and processing due to a combinatorial explosion of chemical species and reactions involved in the CRN.For instance, if one uses string chemistry [33] (or polymer sequence chemistry) to model polymers of length up to N with D distinct monomers (D constituents), a straightforward calculation shows that one is required to track N i=1 D i > D N species.Since for any physically relevant model N ≫ D, so it is clear that the computation will soon become intractable as one increases the length of the string (polymer) N .Even a simple graph-grammar for generating artificial chemistry, for e.g. with the application of [34], can generate a reaction network that is computationally intractable.Moreover, even in the computationally tractable regime, an enumeration of autocatalytic subnetworks or motifs will generically yield an extremely long list.In such a scenario, it might be unclear how to simplify the model without stripping away the essential details or adding artifacts.
We argue that the Cluster framework can help alleviate some of the computational issues.Any rule generated CRN, in string chemistry or built using graph-grammar methods must induce rules on the CCRN.Since the induced CCRN can have an exponentially reduced species set, it offers a way to constrain the model to a more computationally tractable regime.Since autocatalytic Table II: Several examples of rules for generating a sparse CCRN relevant to biochemistry.Here, in accordance with the definition of a CCRN, the variables j, k must be chosen such that the resulting clusters have a non-negative integer quantity of each constituent.
networks are preserved under this coarsening (with some caveats mentioned in the introduction to this section), one can use the CCRN to obtain a more manageable list of MASs.One can then gradually complexify the CCRN by adding more species with the same conserved quantities until the desired behavior of the CRN is captured by the CCRN.For example, consider polymer chemistry with two monomers A and B. For an arbitrary polymer ω, the addition of an A or B yields ω • A or ω • B, respectively.In the cluster framework, if we map A, B, and ω to 1, 0, 0, 1, and j, k, respectively, the polymerization reactions in the CCRN become In realistic chemistry, however, a monomer addition to a polymer is only done by an activated monomer and not an unactivated monomer.Thus, we may want to distinguish activated and unactivated monomers in our model, for which we can add extra species 1, 0 * and 0, 1 * .The resulting polymerization reactions and other examples of rule generated CCRNs are shown in Table II.

VI. DISCUSSION AND FUTURE RESEARCH
In this work, we began by presenting a linear-algebraic representation of CRNs in terms of an input-output matrix pair.We introduced different notions of autocatalysis existing in literature as nested classes of conditions on the CRN.In particular, the results and analysis in this work is for the class of exclusive autocatalysis [6], which is more encompassing of true cases than the notion of autocatalysis used by Blokhuis et al. in [11].We defined minimal autocatalytic subnetworks (MASs), proved properties of their stoichiometric matrices, and explained their induced polyhedral geometry.Next, we proposed mathematical optimization based algorithms to exhaustively compute all the MASs in a CRN and create their visualizations.Finally, we introduced the notion of cluster CRNs (CCRNs) as a useful coarse-graining induced by general CRNs, and employed our techniques on CCRNs obtained from one constituent.Notably, we show that the list of MASs for maximally connected 1-constituent CCRNs with length up to L increases exponentially with L.
The mathematical theory developed here facilitated the development of effective algorithms for identifying MASs in general CRNs.Nevertheless, our optimizationbased approach requires solving a series of binary mathematical programming problems with an increasing number of linear constraints.This approach is computationally costly for large CRNs.However, there are reasons to hope that further research on these optimization problems, in particular, its polyhedral properties, will allow us to strengthen the formulations and solve them more efficiently.Exploiting the algebraic properties behind conservation laws for sequences of integers holds particular promise in this regard.We have also introduced a simple coarse graining of CRNs as CCRNs.By adjusting the number of dimensions included in a CCRN it is possible to trade-off computational tractability against fidelity to the complete CRN and to accurately capture all autocatalytic motifs.This flexibility is a great strength of the CCRN model.Additionally, although we here only considered clusters with nonnegative constituents, it also should be possible to consider clusters defined on the complete integer lattice.Such a reaction network could be used to model nuclear reactions where the positive conserved quantities correspond to positive charges or subatomic particles and the negative conserved quantities would refer to negative charges or antiparticles.
In our explorations we only considered fully connected CCRNs, but some reactions that are consistent with mass conservation are , nonetheless, impossible in real chemistry.In future work it would be interesting, similar to [35,36], to consider randomly generated CCRNs in an Erdos-Renyi framework and compute a bound on the number of autocatalytic cycles that persist as a function of the probability of a reaction's being allowed in the network.An additional, and perhaps even more promising approach to representing real chemistry would be to ex-pand our understanding of the kinetics of fully-connected CCRNs when rate constants differ over many orders of magnitude.Such an effort might be aided by understanding the effects of adding thermodynamic parameters such as free-energy assignments to different clusters.
Overall, the mathematical framework and computational tools developed here have potential applications for any kind of CRN that has multiple MASs: an autocatalytic ecosystem.This is of particular significance for better connecting biology and chemistry at multiple scales of analysis.As discussed by [37], a cell's ability to grow and reproduce rests on the fact that its metabolism is an autocatalytic ecosystem.Even ignoring autocatalytic motifs that include genetic mechanisms (i.e., those whose members are proteins or nucleic acids), metabolic networks contain many autocatalytic motifs [25].The tools developed here will facilitate detecting MASs and their ecological interactions within metabolic networks, which has the potential to help elucidate diverse biochemical processes.
Studies of natural ecological communities, from coral reefs to rainforests, could also be informed by studies of autocatalytic CRNs.An organism (a cell or multicellular system) can be coarse-grained as a single autocatalytic motif composed of numerous lower-level cooperating MASs.Here, the chemical complex that corresponds to the organism is a member of an autocatalytic motif, which sits in a species-productive cone due to the input of food derived from the physical environment (e.g., light and carbon dioxide in the case of photosynthetic organisms) and/or other from other organisms.However, although it ought to be possible to apply the tools developed here to enrich community ecosystem, in practice the lack of stoichiometric data for ecology (e.g., how many rabbits react with one wolf to make two wolves?)make this challenging at the current time.
Even if we currently lack the data sets needed to mathematically analyze metabolisms or other biological autocatalytic ecosystems, the work here has important implications for studying origins of life.Biological autocatalysis is not discretely different from that found in nonbiological CRNs.The continuity between the two implies that the origin of life can be viewed as a process of expansion and complexification of autocatalytic ecosystems via the progressive activation (i.e., transitioning to positive flux) of more and more MASs over time.In future work, we hope to introduce kinetics to CCRNs and expand the insights gained here to identify conditions that permit and promote the open-ended exploration of the attractor space associated with CCRNs in an autocatalytic ecosystem.exhibit multistability or multiple steady states.In certain cases, such as for deficiency one networks [30], there exist algorithms that can help classify whether or not multiple steady states can exist for a given CRN and identify the rate constants such that under mass-action kinetics it exhibits multistability.Once a model exhibiting multistability is finalized, one can also use algorithms from [19,41] to estimate the most-likely path the system will take, along with its probability, to escape from one steady state to another.
Appendix B: Kernel of the stoichiometric matrix for the complete 1-constituent CCRN Let H L 1 = (S, C, R) denote a complete 1-constituent CCRN of length L and order two.1-constituent CCRN of length L means that the species set consists of clusters up to length L, denoted as S = {1, 2, . . ., L} (for a detailed definition, see Sec.V B 1.) For a diagrammatic representation of the reaction network, see Fig. 16.Here, following Appendix A, we will calculate the kernel of the stoichiometric matrix for a general length L.
First, we will count the number of complexes in a complete 1-constituent CCRN of length L. Let us denote the set of complexes of size N as C N L .Recall that the size per constituent of a complex is defined as the sum of the conserved quantities in each constituent.For 2 ≤ N ≤ L, Thus, the number of complexes as a function of N and L are The total number of complexes for a CCRN of length L is thus This can be found by either summing the individual contributions or by observing from Fig. 16 that the complexes occupy the lower triangle including the diagonal for a lattice with L + 1 rows and columns excluding 4 points.Let us denote the set of reactions of complexes of size N by R N L .Since the reaction network of the complete CCRN is obtained by connecting all allowed reactions, for complexes of size N , the number of reactions is given by where the factor of 2 is multiplied since all reactions are reversible.Since, each N contributes exactly one linkage class (or connected component), the total number of linkage classes ℓ for the complete CCRN is 2L − 3.
Recall from Appendix A, the kernel of the stoichiometric matrix is the sum of the independent loops ı and the deficiency δ, where using Equations A4 and A5 we have The intersection Im(M) ∩ Ker(Y ), the dimension of which is the deficiency δ, counts the null flows of the CCRN which are not simple loops in the network.While an explicit basis can be obtained using subnetworks of the form we will not be pursuing it since we do not need it for any result in this work.
Appendix C: Deficiency-one algorithm for L=3 1-constituent CCRN The complete L = 3 1-constituent CCRN is the simplest complete CCRN, and as we will see, it possesses non-trivial dynamical properties.In particular, from Table I, it can be seen that the deficiency δ for the CCRN is one.This hints at the possibility that the CRN, when taken with mass-action kinetics, exhibits multistability.In this section, using Feinberg's deficiency-one algorithm from [13,30], we will confirm that the network can indeed exhibit multiple steady states, and find a relation between the rate constants and concentrations where that is the case.The algorithm works by converting the CRN into a system of linear inequalities, and if they have a solution, they can be used to obtain a relation between the rate constants and multiple steady states.
The L = 3 1-constituent CCRN, denoted by G = (S, C, R), is given by the species set S = {1, 2, 3}, complex set C = {1 + 1, 2, 1 + 2, 3, 1 + 3, 2 + 2} and reaction set Notice that the above reaction network is a regular [30] deficiency-one reaction network, and is thus a valid candidate for the deficiency-one algorithm.To proceed with our calculation, we will follow closely the example in Chapter 17 of [13] and Sec. 5 in [30].
The confluence vector g ∈ R C for our network can be calculated to be (up to sign) For our construction, we will use the following uppermiddle-lower partition The partition induces the following system of inequalities on the chemical potential, The inequalities are satisfied by which clearly have a solution, for e.g.µ 1 = 1, µ 2 = 3, µ 3 = 4. Thus, the CCRN can exhibit multiple steady states.
To find the rate constants at which the system exhibits multistability, we use subsection 5.3 in [30].Following the reference, let us denote the mass-action kinetics rate vector by κ y→y ′ := k y→y ′ x y , where x is the concentration vector and x y := i∈S x yi i .Choosing the monotonically increasing function to be the exponential ϕ(x) = e x , and choosing η so that µ 1 + µ 3 < η < µ 2 we get From [30], it is also known that if x := (x 1 , x 2 , x 3 ), then so is x * := (x 1 e µ 1 , x 2 e µ 2 , x 3 e µ 3 ), thus yielding multiple steady states for our system.
Appendix D: Details of the algorithm for detecting stoichiometrically autocatalytic motifs In Section IV B, we give the main ideas behind our algorithm to construct exclusively autocatalytic subnetworks using the input and output stoichiometric matrices.In what follows we detail the algorithm that we propose to enumerate the entire list of minimal stoichiometrically autocatalytic subnetworks of a CRN, and that we applied to the CCRNs in our computational experiments.
Notation D.1.In this section, we abbreviate minimal stoichiometrically autocatalytic subnetwork as MAS.
The goal of the algorithm that we propose is to determine sets of core species and reactions conforming to the requirements of a MAS by solving, sequentially, a series of nested mathematical optimization problems.Apart from the variables y, z and v already described in Section IV B, we use the following set of variables in our Mixed-Integer Linear Programming (MILP) problem: 1 if species i is a non-core present species in the MAS, 0 otherwise , and ∆ i is a constant to be defined later for all i ∈ S.
The Master optimization problem to construct MASs in this case is the following: y i + q i ≥ z r , ∀r ∈ R, i ∈ S with S r i > 0 ̸ = 0, (D5) There, the objective function accounts for minimizing the number of reactions in the MAS.This objective assures, that each of the generated autocatalytic subnetworks is support-inclusion minimal.Constraints (D2) allow the correct representation of variable z r (if v r > 0, then z r is forced to take value one, if v r = 0, by the minimization criterion, z r will also take value 0).Constraint (D3) avoids solutions with a single reaction.Constraints (D4) assure that species are either core species or present noncore species but not both (it is also possible that they do not participate in the MAS).In case y i + q i = 1, species i is involved in the MASs either as food, member, core or waste species.Otherwise, species i is not a part of the MAS.Constraints (D5) assure that in case a reaction r is selected to be part of the MAS (z j = 1), the reactants and product species of reaction r must be present (either as core species or present non-core species).Since q i and y i are binary variables, the sum of them being greater than one (but because of the previous constraint the equality holds) implies that either i is a core species (y i = 1), or non-core present species (q i = 1).Otherwise, the constraint is redundant.Constraints (D6) and (D7) ensure that in case a reaction is in the MAS, there must be a core species as a reactant and another as a product in the restricted stoichiometric matrix.Observe that in case z r = 1, the sum of the binary variables y i with i being reactant/product of r must be at least one, implying that at least one of the reactant/product species must be activated as core species.In case z r = 0, the constraint is redundant.Similarly, Constraints (D8) enforces that in case a species is selected to be a present non-core species (q i = 1), then there must be a reaction where it is present.(D11) is the productivity condition (see Remark 6).There, in case y i = 1 (i is a core species in the motif), the right-hand side in the constraint becomes 1, implying that r∈R S r i v r ≥ ε > 0 (here ε is a given positive but small enough constant to assure that the production is positive).Constants ∆ i , for i ∈ S, are assumed to be big enough assuring that in case i is a non-core (present or not) species (y i = 0), the production of species i with flow vector v must always be greater than ε − ∆ i (making the constraint redundant).Specifically, for those non-core species whose production is negative (as food species), the absolute value production is upper bounded by ∆ i .We take ∆ i = − r∈R: S r i <0 S r i , for all i ∈ S. The above model belongs to the class of discrete optimization problems, which are known to be computationally challenging [42], but at the same time tremendously useful in determining solutions of complex decision problems in various fields [43].Although the problem of detecting a MAS is known to be a NP-complete [44], there are several available softwares with implemented routines to obtain solutions of integer programming problems (such as Gurobi, CPLEX, ...) in reasonable CPU time for medium size instances.
In order to speed up the solution approach, the above master problem can be strengthened by adding valid inequalities that reduce the search of solutions.For instance, as shown in Theorem III.4,since the number of reactions in an MAS coincides with the number of core species that are part of it, one can add this (redundant) constraint to the model: Furthermore, in a reversible CRN, denoting by Rev = {(r, r ′ ) ∈ R × R : reaction r is the reverse of reaction r ′ }, one must also impose the following constraints that avoid using a reaction and its reverse in the same subnetwork: That is, if r and r ′ are reverse reaction of each other, the z-variables of those reactions cannot be both one.The solution of the problem above is a MAS with minimum number of reactions.Once the problem is solved, with optimal values in the variables (ȳ, q, z, v), the set of core species is obtained as: Core = {i ∈ S : ȳi = 1}, and the set of reactions involved in the MAS is: R ′ = {r ∈ R : zr = 1}.
Among the set of present non-core species, Q = {i ∈ N : qi = 1}, one can classify them based on the sign of the entries of the restricted stoichiometric matrix: The algorithm to generate the whole set of MASs is described in Algorithm 1. There, at each iteration, ℓ, the above Master problem is solved.Once a solution is obtained, say (y ℓ , q ℓ , z ℓ , v ℓ ), the tuple of reactions that are part of the MAS is considered to restrict the next MASs.In particular, it is disallowed for the next MAS to use the same reactions tuples, which assures the minimality of the generated MASs.This restriction is incorporated in the model by means of Constraint (D14).The process is repeated until no more autocatalytic subnetwork can be found respecting the pool of incompatibilities that have been added during the procedure.With this algorithm the MASs are generated while being sorted by number of reactions involved in the subnetwork since each time the problem is solved the number of reactions is minimized.
Data: S, R, S, ℓ = 0 while Feasible do Solve Master: Mas ℓ := (y ℓ , q ℓ , z ℓ , v ℓ ).if Master is feasible then Save the solution and add to Master the constraint: Analogously, one can also generate MASs where just a given set of species takes part in the core species set, S 0 ⊆ S, by adding: As already mentioned, computing the entire list of MASs in a CRN is computationally challenging.Each iteration in the proposed procedure requires solving a mixed integer linear optimization problem with the constraints in the initial Master problem, but also those in the shape of (D14) that are added each time a MASs is found.Thus, the difficulty of solving this optimization problem increases with the number of MASs for a given CRN.In Figure (17) we show the performance of the cumulative CPU time required to generate the MASs for 1-constituent CCRNS with L = 4, 5, and 6.As can be observed from the plots, the consumed CPU time increases linearly in the number of MASs, with a slope strictly greater than one.Table III: List of reactions for a CCRN with 1 constinuent and L = 4.

Checking intersection of cones
Once the whole set of MASs has been generated, M = {Mas 1 , . . ., Mas K }, we have also developed an optimization-based methodology to check the pairwise intersection of the different polyhedral flow-productive cones and partition-productive cones induced from the set M. Consider two of these cones in R d + , say C 1 (v 1 , . . ., v s ) and C 2 (w 1 , . . ., w r ), described by means of its generating vectors (see Section III C).We first check if they the interior of the cones intersect or not by finding a non negative vector that belongs to the intersection of the closed cones maximizing the sum of the entries of that vector with respect to each of the generators of the

Figure 2 :
Figure 2: In the left panel, the Euclidean embedded graph (E-graph, see [20]) of the CRN in Example II.1 is shown.The E-graph is obtained by representing the complexes as lattice points in a Euclidean lattice and representing the reactions as edges between them.The flows on the reactions are labelled by κ and the change in concentration they induce, ∆n = Sκ, is schematically represented in the right panel.

Figure 3 :
Figure 3: All minimal autocatalytic subnetworks (MASs) for the complete 1-constituent CCRN with L = 4 are shown.The core types are also color coded using the typology of Blokhuis et al. in[11].In a CCRN, each species is denoted by a vector of positive integers with an overbar, and each reaction satisfies certain conservation laws.For a definition of a CCRN, see Sec.V.The list of reactions can be found in Appendix E TableIII.

Definition III. 4 .
Let a and b be two autocatalytic cores embedded in MASs A and B. The autocatalytic cores a and b are equivalent if and only if the FWMC partitions of A and B under their associated autocatalytic cores is identical.a ≡ b ⇐⇒ FWMC(A) = FWMC(B).

Figure 4 :
Figure 4: Food-waste-member-core and resource-member partition of a minimal autocatalytic subnetwork in the 1-constituent Cluster CRN (CCRN) with L = 6.Using the typology of Blokhuis et al. this subnetwork has a type two core.

Theorem III. 5 . 1 C
The columns of S −lie in the semipositive orthant.

Figure 5 :
Figure 5: The flow and species productive cones for the Example III.4 are shown in the left and right panel, respectively.The dimensions of the embedding space of the flow and species productive cones are the number of reactions and species in the minimal autocatalytic subnetwork, respectively.Notice that the food species B is strictly consumed (nonpositive) and the core species A, C are strictly produced (nonnegative) in the species-productive region.For this example, the partition-productive cone is identical to the species-productive cone.

Figure 6 :
Figure 6: Flowchart of our procedure to construct minimal autocatalytic subnetworks.

Figure 7 :
Figure 7: The intersection data of partition-productive cones for all partitions of the complete L = 5 1-constituent CCRN (described in Sec.V B 1).

Figure 8 :
Figure 8: All MASs belonging to the {{4}, {1}, {{2, 3, 5}}} class in the complete L = 5 1-constituent CCRN are considered.The projections of their species-productive cones on the food and waste species are shown in the left panel (where the change in the concentration of species n is denoted as ∆n).The right panel displays the intersection information of MASs within a class analogous to Fig. 7.Note that there are no MASs in the class that are disjoint, and thus the 'disjoint' label is omitted.

Figure 9 :
Figure 9: The pairwise intersection statistic for all the species-productive cones of the L = 5 1-constituent complete CCRN of order two.The reaction network and the list of MASs can be found in Appendix E.

Figure 10 :
Figure 10: Consider a CRN of polymers whose species set is arranged as a binary tree in two letters, shown to the left.Then the cluster resulting from each binary string is obtained by counting the number of occurrences of each alphabet.The clusterization process projects from a tree structure over polymers to a lattice structure over clusters, shown to the right.
n n(c β − c α ) n = 0. Definition V.4.Let H D = (S, R) denote a CCRN with D constituents.Then for any finite CCRN,

Figure 11 :
Figure 11: The first few reactions in the complete 1-constituent CCRN of order two.The reactions containing clusters only up to length four (L = 4) are shown in yellow boxes.

Figure 12 :
Figure 12: Number of minimal autocatalytic subnetworks (MASs) varying with L for the complete 1-constituent CCRN of order two.For each L, the distribution of MAS with the number of reactions is also shown.Notice that total MASs for each L, plotted as black dots, increases exponentially with L.

Figure 13 :
Figure 13: The list of the FWMC classes with the information of their intersection for the complete 1-constituent CCRN with L = 4 is shown.

Figure 14 :
Figure 14: The 2-dimensional stoichiometric subspace along with the autocatalytic productive cones for the L = 3 complete 1-constituent CCRN are shown.The 6 rays emanating from the origin, labelled by their directions in the 3−dimensional space, form the edges of productive cones where one species is consumed and another is produced.For instance, the ray [−3, 0, 1] corresponds to the direction in the stoichiometric space where 1 is consumed, 3 is produced, and 2 remains constant.

z
i ≥ z r , ∀r ∈ R, (D6) i∈S: S r i <0 y i ≥ z r , ∀r ∈ R, r ≥ y i , ∀i ∈ S, (D10) r∈R S r i v r ≥ ε − ∆ i (1 − y i ), ∀i ∈ S,(D11)y i , q i ∈ {0, 1}, ∀i ∈ S, (D12) v r ∈ [0, 1], z r ∈ {0, 1}∀r ∈ R. (D13) ∈ R : z ℓ r = 1}| − 1 (D14)ℓ + 1 ← ℓ. end end Result: Set of MASs of a CRN: Mas 1 , . . ., Mas ℓ .Algorithm 1: Optimization-based approach to enumerate MASs.One of the advantages of using mathematical optimization-based approaches to determine special sets from an underlying set (as MASs from a CRN) is its flexibility to be adapted to different features.For instance, one can easily filter the obtained MASs to have at most a number of reactions, MaxReactions, involved by adding the following constraint to the model: r∈R z r ≤ MaxReactions.

Table I :
The complete 1-constituent CCRN of length L and order two is considered for L from 3 − 7. The topological properties of the CCRN, such as number of linkage classes ℓ and deficiency δ, are shown in the left half (for calculations, see Appendix B).The number of MASs for a given number of reactions is shown in the right half (for a visual representation, see Fig.12).