Singlecell RNASeq datasets
We here analyzed the following scRNASeq datasets:
Tabula Muris Senis (TMS): This mouse scRNASeq dataset [15] encompasses many different tissuetypes with samples collected at 6 different ages: 1, 3, 18, 21, 24 and 30 months. Data object files were downloaded from figshare https://doi.org/10.6084/m9.figshare.8273102.v2
We used the normalized data as provided in the h5ad files. We focused on the lungtissue 10X dataset, because it contained one of the largest numbers of immunecell subtypes with good representation across agegroups including multiple mouse replicates.
Colon cancer development: This is a human snRNASeq dataset [18] encompassing colon samples collected from healthy individuals, normal samples from unaffected individuals with FAP, polyps from FAP and nonFAP cases, and colorectal adenomas. We analysed both the processed snRNAseq data available from GitHub (https://github.com/winstonbecker/scCRC_continuum), as well as the full unprocessed data available from GEO (GSE201348). Processed data were stored as Seurat objects that included donor, disease stage and celltype annotation information.
The EVA algorithm
Here we describe the recently published algorithm (EVA: Extended LouVain Algorithm) to identify homogeneous communities in a network with node attributes [26]. Let \(G=(V, E, A)\) denote a graph where \(V\), \(E\) and \(A\) are the set of vertices, edges and node (cell) attributes, respectively. Node/cell attributes can be categorical or numerical such that \(A\left(v\right)\), with \(v\in V\), identifies the set of cell attribute values associated with cell \(v\). In our applications, we will mostly consider one cell attribute (typically the cell’s age or perturbation state) but below we formulate the model for any number of node attributes. With EVA, the goal is to identify a network partition, i.e. a mutually exclusive set of communities/clusters, that maximizes a topological clustering criterion as well as node label homogeneity within each community. Thus, the measure we wish to maximize consists of two components: the modularity Q that measures the extent to which the partitioning captures clusters of highedge density, and a purity index P that measures the homogeneity of the communities in relation to node attributes.
In more detail, the modularity Q of a partition quantifies the edge density within communities relative to that expected under an appropriate null distribution [27], and is defined by
$$Q=\frac{1}{2m}{\sum }_{vw }\left[{A}_{vw}\gamma \frac{{k}_{v}{k}_{w}}{2m}\right]\delta \left({c}_{v, }{c}_{w}\right) \left(1\right)$$
where m is the number of edges, \({A}_{vw}\) represents the edge weight between nodes \(v\) and \(w\), \({k}_{v}=\sum _{w}{A}_{vw}\) and \({k}_{w}=\sum _{v}{A}_{vw}\) are the sum of weights of the edges linked to nodes \(v\) and \(w\), respectively. The Kronecker \(\delta \left({c}_{v, }{c}_{w}\right)\) function is 1 when nodes \(v\) and \(w\) are in the same community (\(c\)) and 0 otherwise. The resolution parameter \(\gamma\), which typically takes values in the range \(0<\gamma \le 1 ,\)controls the number and size of inferred communities with higher resolution values leading to a greater number of smaller communities. For \(\gamma =1\), Q can take values between 0 and 0.5 [12].
The purity index is defined by the average node label homogeneity (i.e. purity) over all inferred communities [26]:
$$P=\frac{1}{\leftc\right}\sum _{c\in C}{P}_{c} \left(2\right)$$
where \({P}_{c}\) represents the purity of community \(c\). The purity of a given community (\(c\)) is defined as the product of frequencies of the most frequent node attribute values within the\(c\)
$${P}_{c}=\prod _{A}\frac{{max}_{a}\left(\sum _{v\in c}a\left(v\right)\right)}{\leftc\right} \left(3\right)$$
where \(A\) is the set of node attributes, and \(a\left(v\right)\) is an indicator function for node \(v\) taking value 1 if \(a=A\left(v\right)\), 0 otherwise. \({P}_{c}\) attains maximum purity 1 when all nodes in the community (\(c\)) have same attribute value sequence. In the case of just one cell attribute, this corresponds to the case where all cells in the community have the same attribute value. Of note, P takes values in the range 0 to 1.
Finally, the EVA algorithm is defined by the optimization of a generalized modularity function Z
$$Z=\alpha P+\left(1\alpha \right)Q \left(4\right)$$
where \(\alpha\) is the purity index parameter taking values also in the range 0 to 1. Of note, for \(\alpha =0\), Z = Q and we recover the Louvain algorithm if \(\gamma =1\) [13] (or a modified Louvain algorithm if \(\gamma <1)\). At the other extreme (\(\alpha =1)\), Z = P, and the clustering algorithm only cares about maximizing purity subject to network connectivity constraints. For a given \(\alpha\) and \(\gamma\), we optimize Z following the algorithmic implementation of Citraro and Rossetti [12].
The ELVAR algorithm
Building on EVA, we developed an algorithm and Rpackage called ELVAR for differential abundance (DA) testing in scRNASeq data. Specifically, the biological question being addressed by ELVAR is whether the proportion of a given cellstate (or cell subtype) changes in relation to some other factor or cell attribute (e.g. age, disease stage). Given the potentially high technical and biological variability, such DAtesting should ideally be carried out in scenarios where multiple replicates are available [4]. ELVAR is designed for complex scenarios where (i) the source of variation associated with the cellular states or subtypes is relatively small and (ii) where sample replicates are available. The ELVAR algorithm consists of 4 main steps that we now describe:

Construction of the k nearestneighbor (knn) cellcell graph: As input, the EVA algorithm requires a connected nearest neighbor cellcell graph, as generated for instance using Seurat’s FindNeighbors function. The number of nearest neighbors k should be chosen sensibly in relation to the total number of cells ntot. For instance, we aim for a ratio ntot/k ~ 50, so that for a 1000 cell graph, the number of neighbors is 20. To run EVA, the input graph must also have at least one vertex/node (i.e. cell) attribute, which will be used when inferring communities in the graph. In our applications this main cell attribute will be age or disease status. In general, cells also have other attributes besides the one being used in the communityinference process. For instance, these additional attributes could be the sample replicate (individual/mouse) from which the cell derives, or a particular cellstate or subtype. To avoid confusion, we will call the attribute used in the clustering or community inference as the “main attribute”, whilst the attribute being interrogated in DAtesting as the secondary attribute.

Selection of purity index parameter a: Another important input to the EVA algorithm is the value of the purity index parameter a (called \(\alpha\) in previous section), which controls the relative importance of purity P over modularity Q when optimizing Z. Typically, we recommend running the EVA algorithm 50 to 100 times for different a parameter values ranging from 0.1 to 0.9 (the extremes a = 0 and a = 1 are not interesting), in order to assess how P, Q and Z vary as a function of a. It is also important to record the number of inferred communities, as this also depends on the value of a. Following the recommendation of Citraro et al [12], we choose a such that when compared to Louvain (a = 0) there is a clear increase in the purity P without much degradation to the modularity Q. Ideally, the number of inferred communities must also be higher compared to Louvain, although this is not absolutely necessary. Thus, although for a fixed a EVA works by optimizing Z, we do not choose the a value which maximizes Z, because for most realworld networks, Z will increase with a and will be maximal when a = 1. For typical knn cellcell networks, the optimal a value is generally in the range 0.7 to 0.9. As far as the resolution parameter is concerned, we generally consider \(\gamma =1\), as this allows direct benchmarking to the original Louvain algorithm which is also defined for \(\gamma =1\).

Inference of enriched communities with EVA: Having inferred the optimal parameter a value, we now rerun EVA with this input parameter a value, to infer communities. The next step is to then identify those communities that are enriched for specific main attribute values. This is done for each main attribute value in turn, using a Binomial test with a stringent Bonferroniadjusted Pvalue < 0.05/(number of communities * number of attribute values) threshold. Only communities enriched for specific attribute values are taken forward for further analysis. Thus, the purpose of this important step is to remove cells that don’t clearly define cellstates associated with the attribute value. Having found all enriched communities for a given attribute value, we then group together all cells from these enriched communities to define a cellgroup per attribute value. The cells within a cellgroup may derive from distinct sample replicates.

DA testing for secondary attributes: For each cell group associated with a main attribute value v, we next count the number of cells with any given secondary attribute value contributed by each replicate r, whilst also recording the total number of cells contributed by that sample replicate. For each secondary attribute value, we then run a negative binomial regression (NBR) of the cell count against the main attribute value (here assumed ordinal e.g. agegroup or disease stage) with the sample replicate’s total cell count being the normalization factor. Mathematically, we model the cell count \({n}_{st}\) of secondary attribute value t and sample s, as a negative binomial (NB)
$${n}_{st}∼NB({\mu }_{st},{\phi }_{t})$$
where \({\mu }_{st}\) is the mean number and \({\phi }_{t}\) is the dispersion parameter. We further assume that \({\mu }_{st}={\mu }_{vrt}={f}_{vt}{n}_{vr}\), where \({f}_{vt}\) is the fraction of cells of type t when they have main attribute value v, and where \({n}_{vr}\) is the total number of cells derived from sample s (which has main attribute value v and r is the replication index). We next assume that the log of \({f}_{vt}\) is a linear function of v, so that the final regression is of the form
$$\text{log}{\mu }_{vrt}={\alpha }_{t}+{\beta }_{t}\text{log}{n}_{vr}+{\gamma }_{t}{v}_{vr}$$
$$\text{log}{\mu }_{st}={{\alpha }_{t}+\beta }_{t}\text{log}{n}_{s}+{\gamma }_{t}{v}_{s}$$
Thus, given the cell counts \({n}_{st}\) we simply run a negative binomial regression (NBR) against the two covariates \({v}_{s}\) and \(\text{log}{n}_{s}\), to find out if the cell counts vary significantly with the main attribute value v. The covariate \(\text{log}{n}_{s}\) plays the role of a normalization factor, accounting for the total of number of cells contributed by sample s. Wald zstatistics and Pvalues of association are obtained from this NBR.
Benchmarking against Louvain is done by direct comparison of these statistics. Because the original Louvain algorithm is deterministic, whilst ELVAR is not (in ELVAR optimization is performed in a nonsequential random manner), for benchmarking we perform 100 distinct ELVAR runs, comparing the distribution of Waldtest statistics to the Louvainderived one with a onesided Wilcoxon rank sum test. Of note, further below we also describe how we benchmark ELVAR to a nonsequential randomized version of Louvain, where the Louvain output may also differ between runs.
ELVAR pseudocode
Below is an outline of pseudocode for running ELVAR. We assume the scRNASeq data is encoded in a Seurat object seu.o with all the required metainformation, including a main cell attribute to be used in the clustering (clustering attribute “CA”), and a secondary cell attribute for DAtesting that we call “SA”. The first step is to normalize the data and to build the knearest neighbor cellcell graph:
Step1 (Normalization and construction of cellcell graph):

seu.o < FindVariableFeatures(seu.o,selection.method=”vst”);

seu.o < ScaleData(seu.o,features = rownames(seu.o));

seu.o < RunPCA(seu.o);

Elbowplot(seu.o); ### to determine number of significant PCs: topPC
The choice of k in specifying the number of nearest neighbors should be chosen sensibly. Typically the ratio of number of cells/k should be around 50, so assuming 1000 cells, k should be around 20:

seu.o < FindNeighbors(seu.o,dims = 1:topPC,k.param = 20);

adj.m < as.matrix(seu.o@graphs$RNA_nn); diag(adj.m) < 0;

gr.o < graph.adjacency(adj.m,mode=”undirected”);

vertexN.v < names(V(gr.o));

vertex_attr(gr.o,name=”CA”) < [email protected]$CA;

is.connected(gr.o); ### check graph is connected (if not, incrementally increase k).
Step2 (Estimate optimal purity index parameter a):
Step3 (Inference of enriched communities with EVA):

eva.o < Eva_partitions(gr.o,alpha = aOPT,Vattr.name=”CA”);

comm.o < ProcessEVA(eva.o,seu.o);
Step4 (Do DAtesting of a secondary attribute with negative binomial regressions):
The object nbr.o is typically the output of the glm.nb function from the MASS Rpackage, and statistics of association between the SA (e.g. Cd4t activation status) and CA (e.g. age) attributes can be extracted using summary(nbr.o)$coeff .
Louvain with random nonsequential node selection (LouvainRND)
Because EVA is a natural extension of Louvain, it is natural to benchmark it against the ordinary Louvain algorithm as implemented by Blondel et al [13], which is also implemented in the igraph R package. The ordinary Louvain algorithm gives a deterministic network partition output, i.e. every run of the Louvain algorithm results in the same partition. This is because during an optimization run nodes are visited and assessed for local moving sequentially. On the other hand, EVA builds upon algorithmic details implemented in the Leiden algorithm [28], which results in potentially different partitions every time EVA is run. Specifically, as with the Leiden algorithm, during an optimization run in EVA, we consider random (nonsequential) selection of nodes, which can therefore result in a different partition every time EVA is run. Thus, when benchmarking EVA against Louvain, it is important to account for these implementation differences. We do this by also benchmarking EVA against a modified version of Louvain (called LouvainRND), where during an optimization run, nodes are visited randomly, which may also result in different partitions for different runs.
Simulation Models
The analysis benchmarking EVA against Louvain, was performed in the context of two simulation models. In the first case, we selected 200 classical monocyte cells from the TMS lung tissue scRNASeq 10X dataset, ensuring all cells derive from the same mouse (mouseID = 19) and thus from the same age. For 100 of these cells, we then modified their scRNASeq profiles, simulating a “perturbed” cellstate, as follows. We randomly selected 50 genes among all genes not expressed in any of the 200 cells. For each perturbed cell, we then randomly subselected 20 genes from these 50, whose values were then altered in the cell, by randomly drawing 20 nonzero expression values from the distribution of nonzero expression values of the whole data matrix. Thus, this procedure generates a weak but significant coexpression structure among the 100 perturbed cells. Seurat was then applied to the 20138 gene x 200 cell scRNASeq data matrix, with VST feature selection followed by PCA. Top8 PCs were selected to build the knearest neighbor cellcell graph using k = 6. Louvain clustering algorithm as implemented in igraph was used to infer communities. EVA was run on the same cellcell graph using a cell’s perturbation state as cellattribute. Since the EVA result depends on initialization, we performed a total of 100 runs for each of nine choices of purity index parameter a (a = 0.1, 0.2, …, 0.8, 0.9). The final value of a was chosen heuristically as the value at which purity increased compared to the Louvain solution (a = 0) without compromising modularity too much. The quality of the EVA and Louvain clustering was assessed using the Adjusted Rand Index against the cell’s perturbation state, as well as using ChiSquare statistics.
For the second simulation model, we selected all classical monocytes from mice with mouseIDs 0 and 1 representing 1 month old mice (201 & 284 cells), mouse IDs 2 and 3 representing 3 month old mice (51 & 60 cells), mouse IDs 13 and 14 representing 21 month old mice (104 & 138 cells) and mouse IDs 21 and 22 representing 30 month old mice (94 & 61 cells). For young mice (1 and 3 month old mice), 25% of cells were perturbed using the same procedure described previously. For old mice (21 and 30 months), the frequency of perturbed cells was increased to 50%. Thus, this model simulates an agerelated increased in a perturbed state. The cellcell graph was derived as before, this time using the top 10 PCs and k = 20, i.e k was increased in line with the larger number of cells (n = 993). Louvain and EVA were run on this cellcell graph, in this case using age as the cellattribute information. As before, EVA was initially run a 100 times for each of nine choices of a parameter, in order to select an optimal a based on overall purity and modularity values. Using the optimal a value, EVA was then compared to Louvain in its ability to predict the increased frequency of the perturbed cellstate in the older mice.
Application of ELVAR to detecting shifts in lung tissue Cd4 + Tcell subtypes
As part of the 10X lungtissue TMS set, a total of 551 Cd4 + Tcells were profiled to allow testing of a shift in naïve to mature subtypes with age. We removed cells from 4 mice each contributing less than 10 cells, leaving a total of 537 cells from 11 mice representing five agegroups: 143 (1m), 122 (3m), 67 (18m), 107 (21m) and 98 (30m). Cells expressing Lef1, a wellknown marker of naïve Cd4 + Tcells [16], were defined as naïve (n = 186), the rest as mature (n = 351). ELVAR was then applied to determine if the naïve/mature proportions change with age. Cellcell graph was constructed using Seurat with VST and 8 top PCs and k = 10. EVA was run a total of 100 times with a = 0.8 (the optimal value in this dataset).
Application of ELVAR to M1/M2 polarization analysis in lung alveolar macrophages
As part of the 10X lungtissue TMS set, lung alveolar macrophages were abundantly profiled (n = 1261) to allow testing of a shift in M1/M2 macrophage polarization with age. We removed cells from mice displaying batch effects, leaving a total of 1124 cells from 15 mice representing five agegroups: 517 (1m), 184 (3m), 193 (18m), 91 (21m) and 139 (30m). In order to annotate these 1124 lung alveolar macrophages into M1/M2 subtypes, we first identified 5 robust murine M1 (Cd80, Cd86, Fpr2, Tlr2, Cd40) and 5 robust M2 markers (Egr2, Myc, Arg1, Mrc1, Cd163) from the literature [29]. In an initial annotation, we declared cells as M1 if they coexpressed at least 2 of the 5 M1 markers, and similarly for M2. Cells annotated to both M1 and M2 subtypes were reassigned an undetermined (UD) category alongside all other cells not annotated to either M1 and M2, resulting in 308 M1, 195 M2 and 621 UDcells. We reasoned that UDcells clustering predominantly with either M1 or M2 cells could be reassigned to M1/M2 subtypes. To this end, we developed an iterative algorithm that reassigns the status of UDcells to either M1 and M2, depending on their relative proportions among the neighbors of a given UDcell. In more detail, we used the cellcell graph as inferred using Seurat, and a multinomial test with P < 0.05 threshold to identify the UDcells whose polarization status could be reassigned to either M1 or M2 status. We also required the absolute difference between the proportion of M1 and M2 neighbors of a given UDcell to be larger than 0.2. This procedure was iterated 20 times, but numbers already converged after 7 iterations, resulting in 464 M1, 214 M2 and 446 UDcells. ELVAR was then applied to determine if the M1/M2 proportions change with age. Cellcell graph was constructed using Seurat with VST and 9 top PCs and k = 20. EVA was run a total of 100 times with a = 0.7 (the optimal value in this dataset).
Application of ELVAR to colon cancer progression
We applied ELVAR to explore if the fractions of epithelial stemcells and Tregulatory cells changes with disease progression. The analysis was performed in two ways. In the first approach, we downloaded Seurat objects from https://github.com/winstonbecker/scCRC_continuum which contain QCprocessed data and celltype annotations for a subset of samples. In the case of epithelial cells, the analysis was performed on a subset of the data consisting of stemcells, TA2 & TA1 transit amplifying progenitors, enterocyte progenitors, immature enterocytes and differentiated enterocytes cell states. We randomly picked 1000 cells from each cell state (thus a total of 6000 cells) in order to reduce the computational runtime because two cell states contained ≥ 30k cells. Next, we removed cells from 2 donors each contributing less than 10 cells and cells from 1 donor displaying a batch effect, leaving a total of 5810 cells from 11 donors representing four disease stages: 1153 (Normal), 1672 (Unaffected), 2911 (Polyp) and 74 (Adenocarcinoma). The distribution of cellstates was 843 stemcells, 971 TA2, 1000 TA1, 999 enterocyte progenitors, 998 immature enterocytes and 999 enterocytes number of cells. A cellcell graph was constructed using Seurat with variance stabilization for feature selection, and selecting the 15 top PCs with k = 20. ELVAR was run a total of 100 times with a = 0.8 and disease stage as the main cellattribute for community detection. For the analysis of Tregulatory (Tregs) cells we focused on the subset of data consisting of Tregs, NK, Naïve T, CD4 + and CD8 + cells. We removed cells from 3 donors each contributing less than 10 cells and cells from 1 donor displaying a batch effect, resulting in a total of 6171 cells from 8 donors representing four disease stage groups: 381 (Normal), 3721 (Unaffected), 1900 (Polyp) and 169 (Adenocarcinoma). The distribution of cells across celltypes was: 472 Tregs, 245 NK, 1042 Naïve T, 3063 CD4 + and 1349 CD8 + number of cells. The cellcell similarity graph was constructed using Seurat with variance stabilization for feature selection, selecting the 15 top PCs and number of nearest neighbors k = 20. ELVAR was run a total of 100 times with a = 0.8 with disease stage as the main clustering attribute.
In the second approach, we downloaded the raw count snRNASeq matrices from GEO (GSE201348). Data was normalized using Seurat with variance stabilization for feature selection, leaving a total of 380,527 cells. Because the celltype annotation for the full dataset was not provided, we applied dimensional reduction, clustering, UMAP visualization and wellknown marker genes from Becker et al [18] to the cells from the normal samples only, to annotate well separated cell clusters into enterocyte, goblet, immunecell, stromal and endothelial cell categories. We then used Wilcoxon tests and markerspecificity scores [20, 21] to build an mRNA expression reference matrix for these 5 broad cell categories. With this mRNA expression reference matrix in place, we then used our robust partial correlation framework [21] to estimate celltype probabilities for all cells from all disease stages. Using a probability threshold of > 0.7, we were thus able to confidently annotate 1866 endothelial cells, 104009 enterocyte cells, 78421 goblet cells, 24973 immunecells and 7941 stromal cells. Because of the very high sparsity of the snRNASeq data, in order to confidently identify stemlike cells among the 104,009 enterocytes, we applied our validated CancerStemID algorithm [22, 23] which approximates stemness of singlecells from the estimated differentiation activities of tissuespecific transcription factors. In this instance, we used a set of 56 colonspecific TFs and their associated regulons, already validated by us previously [23]. The regulons were applied to the snRNASeq data, to estimate transcription factor differentiation activity (TFA) for each of the 56 TFs across each of the 104,009 enterocyte cells. We then declared stemcell like cells as those displaying average TFA levels over the 56 TFs less than a threshold given by the 5% quantile of the average TFA distribution defined over the normal cells only. For ELVAR analysis, we only retained samples contributing at least 50 cells. For all other samples, all cells up to a maximum of 500 randomly selected cells were chosen, resulting in a total of 31,385 cells, drawn from 69 samples (8 normal, 16 unaffected FAPs, 41 polyps and 4 CRCs), encompassing 3761 normal, 7443 unaffected, 18558 polyp and 1623 CRC cells. The cellcell k = 50 nearest neighbor graph was constructed using Seurat, and ELVAR run a 100 times with a = 0.8, using disease stage as main cell attribute.
Comparison of ELVAR to DAseq and MiloR
We compare the three algorithms in their ability to detect proportions of cellstates as a function of a main cellattribute. In our context this main cellattribute could be age or diseasestatus, and cellstates could refer to Cd4t activation status, or differentiation stage. What the three algorithms have in common is the inference of groups of cells that display differential abundance relative to the main cellattribute. The methods differ in how these groups of cells are inferred. In ELVAR, we use the main cellattribute information when inferring cellular communities from the nearest neighbor cellcell graph, subsequently identifying those that display enrichment for any specific attribute value (e.g. agegroup). In contrast, DAseq [7] and MiloR [4] infer local regions, or potentially overlapping cellular neighborhoods, displaying significant DA of agegroups. Thus, one way to compare all three algorithms is by selecting the cells that appear in these significant communities/regions/neighborhoods, and subsequently running negative binomial regressions of cellstate counts vs agegroup, taking biological replicates into account and normalizing for the total number of cells that each replicate sample contributes to each agegroup, as described earlier for ELVAR.
To understand the difference in performance between methods, we developed the following metric. Methods may display different sensitivity to detect DA of a secondary attribute relative to the main one because the significantly associated cell groups derived from each method (i.e. agegroup enriched communities in the case of ELVAR, ageassociated neighborhoods/regions in the case of MiloR/DAseq) may capture different numbers of cells. To make this clear, consider a scenario where one of the methods (call it “X”) can’t detect a cell group with sufficient numbers of old cells, say it detects a cellgroup with at most 10 old cells, with 6 of these belonging to one cellstate “A”, with the remaining 4 belonging to another cellstate “B”. In contrast, another method “Y” does infer a large enough cellgroup consisting of oldcells, say 30 cells with 15 belong to state “A” and 15 belonging to state “B”. As far as young cells are concerned, all methods are able to infer a cellgroup with a considerable number of cells, say 50 young cells, with 40 belonging to state “A” and 10 belonging to state “B”. Because method "X” was not able to identify a cellgroup with sufficient numbers of oldcells, it lacks power to detect the relative decrease of state “A” with age (twotailed Fishertest P = 0.22), whilst method “Y” has the power to detect it (twotailed Fishertest, P = 0.006). Whilst this hypothetical example ignores the variation due to sample replicates or variations due to replicate cell numbers, it clearly illustrates that the fraction of captured cells (fCaptCells) per main attribute value will strongly influence a method’s power to detect DA of an underlying cellstate with respect to this main cell attribute. Mathematically, we define the fraction of captured cells per attribute value a and from method m by:
$${fCaptCells}_{ma}=\frac{\left\left(\#Cells with attribute value=a\right)\cap (\#Cells in groups from method m)\right}{(\#Cells with attribute value=a)}$$
Specifically, a method that attains higher fCaptCellsma across the whole range of attribute values a, including the extremes if the attribute is ordinal, will display higher power.