Amplification and errordependent decay for learning
I attempt to clarify the processes through which factors autonomously reach their target ratio without individual commands by using a simple simulation model with two factors (Table 1). Here, the nonnegative integer values of two factors, xA and xB, change by 104–105 repeats of stochastic processes of increase and decrease from 1, which is set as the initial value. The target ratio of TA: TB is set to 1: 2. In the increase process, which proceeds at a probability of αinc = 0.1, either A or B is selected, and the value of the selected factor increases by one. xA and xB stochastically decay at a probability αdec ε, where αdec = 0.1. Thus, after a decrease process, xA decays to a value selected from a binomial distribution with the number of trials xA and the probability (1 – 0.1 ε). In this text, the assumption or settings are written in present tense, whereas the results of simulation are written in past tense.
Table 1
Variables and parameters in the models
Indicator

Meaning

Values

Comments

A, B

Identifier of two factors

A: B indicates the ratio of selection in increase process

xA, xB

Value of each factor

Nonnegative integer variables

Changing by increase and decrease

TA: TB

Target ratio of each factor

1: 2 (Figure 1, except for Figure 1g after 105 repeats)

Calculated from RNAseq data (Figures 2k–5)

αinc

Probability to enter increase process at each repetition

Constant value 0.1 in Figure 1–2.
Variable (0.01–0.101) depending on the coverage of the pair in the whole in Figure 3c–f.

αinc is replaced by the Monte Carlo tree search in the model with an mRNA in Figure 4–5

αdec

Constant coefficient of decay probability

0.1

Applied in Figures 1–2

Probability to enter decrease process at each repetition

Applied in Figures 3–5

βA, βB

Bias.
In amplification, select either
at a (xA + βA): (xB + βB) ratio

1 in Figures 1e–h, 2–3, 4a–f.
10−7 in Figures 1k–m, 4g–i.

Constant value to increase additively and to avoid extinction in amplification

MSE

Mean squared error between current and target ratios

\({\left({x}_{A}/\left({x}_{A}+ {x}_{B}\right) {T}_{A}/\left({T}_{A}+ {T}_{B}\right)\right)}^{2}\)

\({= \left({x}_{B}/\left({x}_{A}+ {x}_{B}\right) {T}_{B}/\left({T}_{A}+ {T}_{B}\right)\right)}^{2}\)
Same value for A and B

ε

Error
A parameter of
decay probability

Constant (Figure 1a–c, e–f)

xA values after a decrease process is randomly selected from binomial distribution with the number of trials xA and the probability
(1 – αdec ε) in Figures 1–2 or
(1 – ε(x)) in Figures 3–5.
4step error is applied in Figure 5.

ε(x)

MSE (Figure 1d, g–m, 2b–c, i)

MSE in Figures 2–5 is rounded
to 10−1, 10−2, 10−3, …,10−i in stepwise,
to 10−1, 10−2, 10−3 in 3step error, and
to 10−1, 10−2, 10−3, 10−4 in 4step error

γ

Probability to choose additive increase among increase processes

0 or an indicated constant in range from 0 to 1

This γ is used only in Figure 1l–m.
0 in other Figures.

Initial ratio

Ratio of each factor in the total at the initial setting

Even distribution in Figures 1–2.
Expression ratio of genes from RNAseq data in Figures 3–5.

Initial value of a branch in a pair

1 in Figures 1–2.
Initial ratios are summed for genes in the branch of the pair, multiplied by the number of genes (11,281), and rounded to make an integer value, in Figure 5.

In the learning pair model, the values of two factors, xA and xB, repeat stochastic processes of increase and decrease. In the increase process, either A or B is selected, and the value of the selected factor increases by one. In the increase process, competitive amplification or additive increase is chosen at (1  γ): γ ratio. In competitive amplification, A or B is selected at the (xA + βA): (xB + βB) ratio. In the additive increase, A or B is selected at a 1: 1 ratio. In the decrease process, x decreases by decay at a probability depending on the error value ε.
In the first model, xA and xB are assumed to change at a fixed probability (Figure 1a–c). Either xA or xB is selected at a 1: 1 ratio for the increase, and the decay probability is fixed at ε = 0.1 or 0.01. Simulation results showed the similar values in xA and xB (Figure 1a). If the probability of increase in xB is twofold of that in xA (Figure 1b) or if the decay probability in xA is twofold of that in xB (Figure 1c), the xA/xB ratio approached the target ratio, 0.5. However, these conventional models with individualized probabilities require something that determines the appropriate parametersetting.
In the second model (Figure 1d), the decay probability is the same for xA and xB but changes over time, taking a value that is the mean squared error (MSE) between the current and target ratios: \({\epsilon }_{\left(x\right)}=MSE= {\left({x}_{A}/\left({x}_{A}+ {x}_{B}\right) {T}_{A}/\left({T}_{A}+ {T}_{B}\right)\right)}^{2}\). Regarding the increase, either xA or xB is selected at a 1: 1 ratio. The dynamics of xA and xB exhibited a pattern similar to predatorprey in ecology, in which the fluctuation in the number of prey xB slightly preceded that of predator xA.
In the third model (Figure 1e–f), xA and xB are assumed to increase by competitive amplification, in which either xA or xB is selected at a ratio of (xA + βA): (xB + βB) to increase by one, where bias βA = βB = 1. When the decay probability ε = 0.1 or 0.01, xA and xB fluctuated with a switching pattern in which either A or B dominated transiently (Figure 1e). When ε is as low as 0.001, the xA/xB ratio persisted at a certain value that was stochastically determined at early time points (Figure 1f).
In the fourth model (Figure 1g–h), xA and xB are assumed to increase by competitive amplification as in the third model, and to decrease by decay with a probability of MSE between the current and target ratios as in the second model. The simulation results showed that the xA/xB ratio approached the target ratio of 0.5. Some deviations observed at 104 repeats were reduced after 105 repeats of stochastic processes (Figure 1h). This model was applicable for other target ratios (Figure 1g) without tuning parameters. Repeating stochastic processes of competitive amplification and MSEdependent decay is a system that autonomously learns the target ratio through trialanderror (Figure 1i). The epigenomic regulation of chromatin modification can be interpreted as a competitive amplification process (Table 2) 12.
Table 2
Assumptions in the learning hierarchicalpair model are supported by biological knowledge.
Model assumptions

Biological findings

Regulation

Competition

A transcription factor chooses a binding locus among candidates, depending on the openness ratio of the chromatin.

Epigenomic

Amplification

Transcriptional coactivators with histone acetyltransferase activity relax the chromatin structure.
Transcription opens the chromatin, and the open chromatin structure induces transcription.

Bias (no extinction)
Additive increase

Whole genome in every somatic cell.
Conventional genetic regulation of transcription.

Genetic

Error (approximated) dependent decay

Rough evaluation of the current state.
Histone deacetylases and DNA methyltransferases close the chromatin structure.
Noncoding RNAdependent cleavage.

Dependent on cell
and environment.
Feedback from the current fitness.

Hierarchicalpair architecture

Signal transduction cascades for gene expression

Genetic

Competitive amplification

Active and expressed cascades are preferentially selected and activated.
Kinase is activated by phosphorylation at multiple sites.

Celltype dependent
Posttranslational

Errordependent decay

Dephosphorylation.
Polyubiquitin dependent degradation.

Dependent on cell and environment

Epigenetic regulations, which are highly variable depending on cell type, can be interpreted as a process of competitive amplification. Conventional genetic regulation and transcription at a fixed rate are included in the bias term. The decay rate is roughly regulated at several levels by the fitness of the current expression pattern. Hierarchical pairs are genetically determined and consistent in all celltypes. To be noticed, the target pattern is used to passively quantify the fitness of the generated current pattern.
Amplification may induce a large difference in the value of each factor by exponential growth, making a factor all or nothing. In noncompetitive amplification, in which either A or B is selected at a 1: 1 ratio and the selected term increases by xA + 1 or xB + 1, xB reached much higher value than xA (Figure 1j). When bias β in competitive amplification is not 1 but rather 10−7 (which is almost equivalent to 0 and avoids the 0/0 error in processing), xA decreased to 0 in six of the ten tests (Figure 1k and 1l left). Interestingly, the xA/xB ratio approached the target ratio in the other four tests. Competition and the addition term of bias are required to avoid extinction in amplification.
In our previously reported immune response model, three processes were assumed to occur during changes in the interaction intensity or cell number: competitive amplification (proliferation), regulated reduction (dissociation), and additive increase (migration) 13. Based on the model, a process of additive increase, in which either xA or xB is selected at a 1: 1 ratio to increases by one, is chosen at a probability γ in an increase process (Figure 1l). The condition γ = 0 is equivalent to that in Figure 1k, whereas γ = 1 is equivalent to that in Figure 1d. As γ is set to a lower value, the xA/xB ratio after 105 repeats became skewed from 1 to 0.5 (target ratio). When γ is negligibly low, xA sometimes disappeared. When the additive increase is chosen at low probabilities (γ = 0.01, 0.1), the xA/xB ratio approached the target ratio (Figure 1l–m).
The learning process can be explained as follows. While MSE and decay probability are large, the xA/xB ratio fluctuates in full range by avoiding the extinction using the bias or additive increase (Figure 1e). The xA/xB ratio is improved on average by the errordependent decay, which is a random walk with smaller stepsize as it gets closer to the target. When the xA/xB ratio approaches the target ratio, the ratio persists because A or B increases at an almost ideal ratio as set in Figure 1b. In the main simulation hereafter, competitive amplification implies selecting A or B at a (xA + 1): (xB + 1) ratio to increase by one with β = 1, γ = 0. This is designated as a learning pair process.
Hierarchical pairs and approximated MSE
To regulate gene expression, more than two factors must be controlled. When the ratios of four or eight factors are examined to be controlled by competitive amplification and MSEdependent decay, the value ratios of eight factors in a single list failed to approach the target ratios (Figure 2a–b). The eight factors can be divided into seven pairs in three layers (Figure 2c–d). The fraction of each factor in total is calculated as an infinite product of all ratios in the pairs that include the factor. When the values in each pair independently change by the stochastic learning pair process, eight factors successfully approached the target ratio after 105 repeats (Figure 2c).
Next, the required accuracy of MSE is tested because accurate detection of errors is difficult in vivo. When the MSE between the current and target ratios is accurately calculated, 64 factors approached the target ratio, which is set as a linear distribution in the range of 1–64 (correlation coefficient between the target and result ratios after 105 repeats, r, was 0.99, Figure 2e–f). As an approximation of MSE, the calculated MSE is rounded to 10−1, 10−2, 10−3, … ,10−i, where i is a natural integer, in stepwise error. When this approximated error is used, the correlation between the result and target ratios decreased but remained high (r = 0.95, Figure 2e–f). By setting a maximum value for i that indicates the lower limit of the stepwise error, five additional types of approximated MSE are compared (6, 5, 4, 3, and 2step error in Figure 2e, g). The results indicated that 3step error was required for learning (median r = 0.89) and that the stepwise error was almost equivalent to 5step error (median r = 0.95). The approximation of MSE decreased the learning accuracy, but multiple factors in the model using stepwise detection of error approached the target ratio to an acceptable level.
Next, the relationship between indexes for making pairs and targets is randomly shuffled. The ratios of each factor after 105 repeats approached the target ratios to an almost equivalent level to that without shuffling (Figure 2e, h). Furthermore, 4,096 = 212 factors approached the target that is set to shuffled values ranging from 1 to 4,096 (r = 0.97–0.98 with accurate MSE and r = 0.84–0.91 with stepwise error in five tests, Figure 2i–j).
When the gene expression data from bacteria without antibiotics (GSM2538622 RNAseq dataset) 14 are used as the target ratio, the ratio of 4,096 factors changed from the initial evendistribution to the expression pattern after 105 repeats of stochastic processes with stepwise error (r = 0.98, Figure 2k). Subsequently, when the target ratios are reset to the gene expression pattern observed in the presence of antibiotics (Figure 2l) 14, the ratio of 4,096 factors changed from the pattern without antibiotics to the new target pattern (Figure 2m). Thus, bacteria may autonomously produce proper gene expression patterns by reducing error caused by antibiotics.
Hierarchical clustering of human genes
I next apply the learning hierarchicalpair process to human gene expression. In advance, it is necessary to set the genes that are paired. Six hierarchical clustering analysis
methods (Figure 3a–c), which are Ward, WCO, Single, and three newlydeveloped methods (AreaSum, CvSum and Cvarea), are applied to a total of 16,921 genes in 20 differently labeled cells from preimplantation human embryos, human embryonic stem cells, and downstream early mesoderm and endoderm progenitors (scRNAseq datasets EMTAB3929, GSM2257302, and GSE75748) 15–17. The number of layers in hierarchical pairs generated by the AreaSum method was the smallest (27), whereas that by the Single method was largest (10,796) (Figure 3c). In the AreaSum method, the area formed by two vectors from the origin is calculated as the distance between two genes and the total gene expression level is used as the representative value of the cluster (Figure 3a–b).
As another modification, the probability of entering a process of competitive amplification, αinc, is set as a variable in the range of 0.001–0.101 depending on the coverage of the pair. This assumption is further modified in another model with an mRNA pool. For test data, another scRNAseq dataset from human preimplantation embryos (GSE36552) is used 18. The initial and target ratios in each pair are set with the data of a zygote and a cell at 4cell stage, respectively. The correlation coefficient between the initial and target ratios is a median r = 0.78 (range 0.67–0.84) in 12 tests (Figure 3d). For each pair, the stochastic processes of competitive amplification and decay using the stepwiseapproximated MSE are repeated 105 times.
The learning efficiency was compared among the six different hierarchicalpair architectures. The expression ratio most closely approached the target ratio when hierarchical pairs generated by the AreaSum method are used (r = 0.98, Figure 3c–d), with even a closer correlation than another 4cell data in scRNAseq (Figure 3e) 18. In pairs generated by the Single method, the expression ratio did not approach the target ratio (Figure 3c). These results indicate that the architecture of hierarchical pairs affects the ability to approach the target ratio. In contrast, even when the initial and target patterns are independently shuffled to test noncorrelated artificial patterns, the expression ratio approached the target ratio (median r = 0.98, range 0.94–0.99 in six tests) in the hierarchical pairs generated by the AreaSum method (Figure 3f). Owing to the high adaptability of this learning process, it was difficult to validate the accuracy of gene pairing.
A model with a signal transduction cascade and an mRNA pool
I assume that the hierarchicalpair architecture is a signal transduction cascade to select a gene for transcription in a model with an mRNA pool. Rather than using parameter αinc, a pair is stochastically selected at each repetition among pairs in the top seven layers depending on the coverage of the pair. In the selected pair, the competitive amplification is performed; branch A or B is selected at a ratio (xA + β): (xB + β), where β = 1, and the value of the selected branch, xA or xB, increases by one. Additionally, the downstream pair of the selected branch enters the process of competitive amplification until the selected branch is a leaf indicating a single gene. In an mRNA pool, mRNA of the selected gene increases by one, with randomly replacing one mRNA. Initially, 360,000 mRNAs in the mRNA pool are set based on the initial ratio (zygote). In addition to the mRNA, the expression probability is calculated as an infinite product of ratios in pairs including the gene, which is equivalent to the expression ratio in the previous model without an mRNA pool. The ratios of mRNA and expressionprobability approached the target ratio (4cell) after 5 × 105 repeats (r = 0.95–0.97 and r = 0.97–0.99, respectively, in six tests), although genes with 0 to several mRNAs were plotted discretely in mRNA ratios (Figure 4a). Furthermore, even when decay probability or MSE is approximated to three different values, 0.1, 0.01, and 0.001 (3step error as in Figure 2e) similar changes approaching the target ratio were observed with setting 4cell as the targets (r = 0.94–0.98 in 12 tests, Figure 4b) and with shuffling the targets (r = 0.92–0.94 in six test, Figure 4c).
To analyze the dynamics in the simulation, when the initial and target ratios are set with zygote and blastocyst data, the mRNA ratio gradually approached the target blastocyst pattern over 106 repeats, but not via the patterns of the 2cell, 8cell, or morula stages (Figure 4d). The expression probability more quickly and directly reached near the target ratio within 104 repeats (Figure 4e), and then the similar correlation levels persisted during 104–106 repeats. The mRNA levels of each gene approached the target level with fluctuations (Figure 4f). In a simulation, the dynamics of GATA3 were more similar to those of GATA2 than to those of DAB2, although the initial and target values of GATA3 and DAB2 are closer than those of GATA2. The higher correlation during stochastic fluctuations is explained by the hierarchicalpair architecture, where GATA2 and GATA3 are paired in the 7th layer from the top, whereas they are separated from DAB2 in the 2nd layer.
For gene regulation during homeostatic state, the bias term β may not be required because MSE or decay rate can be kept low. When both the initial and target ratios are set with the same 4cell data, the mRNA ratio deviated from the pattern during 5 × 105 repeats in the model with β = 10−7 and 3step error (r = 0.66–0.88 in six tests, Figure 4g). In contrast, the mRNA ratio maintained the set pattern in the model with 4step error at least for 5 × 105 repeats (r = 0.98–0.99 in six tests), while the correlation between expression probability and the target ratio gradually decreased (r = 0.76–0.87, Figure 4h). When the initial state is set with a zygote and the target ratio is set with scRNAseq data of 2cell stage, which has a highlysimilar expression pattern to a zygote (r = 0.94–0.96) 18, the mRNA ratios approached the target ratio, except for one case in six tests (median r = 0.98, range 0.51–0.98, Figure 4i). However, the change from a zygote to the 4cell stage was poorly reproducible in the model with β = 10−7 (median r = 0.88, range 0.85–0.94 in six tests). In the absence of a bias term or additive increase, a homeostatic state with a similar expression pattern was maintained while allowing some limited changes in differentiation.
A common model for human gene expression
Based on these findings, I propose that a single model can control whole gene expression during any differentiation processes in human cells and evaluate this in early embryogenesis and hematopoiesis. To generate hierarchical pairs, I collect 13 scRNAseq datasets from human tissues 15–17,19−28, in which 11,281 gene names were commonly labeled in 11,803 cells. Using the relative expression ratio of these 11,281 genes in each cell, a hierarchicalpair architecture was generated using the AreaSum clustering method. This architecture contained 11,280 pairs in 22 layers (Supplementary Table 1).
The model with an mRNA pool, 4step approximated error, and this hierarchicalpair architecture is applied to the regulation of 11,281 genes, setting bias β = 10−7 or 1 depending on the situation. When the initial state of pairs and mRNA pool is set with a zygote scRNAseq data and the target ratio is changed in the order of zygote, 2cell, 4cell, 8cell, morula, and blastocyst stages every 5 × 105 repeats, the ratio in the mRNA pool dynamically approached the target ratios until the 4cell stage in the model with β = 10−7 (Figure 5a–b). When the model with β = 1 is applied after 1.5 × 106 repeats, the gene expression patterns sequentially approached the 4cell, 8cell, morula, and blastocyst patterns with a correlation coefficient of more than 0.95 at the peaks (Figure 5c–d).
In hematopoiesis, multilymphoid progenitors (MLPs) differentiate into B cells or T cells in peripheral blood mononuclear cells (PBMCs), whereas granulocytemacrophage progenitors (GMPs) differentiate into myeloid cells 29. When the initial state and target ratio are set with a progenitor, the mRNA ratios were maintained during 5 × 105 repeats in the model with β = 10−7 (r = 0.97–1.0 in six tests, Figure 5e–g). The expression patterns in progenitors are largely different from those in PBMCs 30 (median r = 0.29, range 0.15–0.57 in 9 tests, Figure 5h). When the target ratio is changed to a PBMC pattern and β is set to 1, the mRNA ratio approached the target ratio during the next 5 × 105 repeats (r = 0.86–0.97), with more rapid adaptation in the expression probability (Figure 5e–f, i). The mRNA ratio, but not the expression probability, further approached the target ratio during the following 5 × 105 repeats in the model with β = 10−7 (r = 0.97–0.99, Figure 5e–f, j). These results demonstrate that the learning hierarchicalpair model using one common architecture can reproduce various differentiations and notimmortal homeostasis by adding bias terms in the former.