Real-Time Online Unsupervised Domain Adaptation for Real-World Person Re-identification

Following the popularity of Unsupervised Domain Adaptation (UDA) in person re-identification, the recently proposed setting of Online Unsupervised Domain Adaptation (OUDA) attempts to bridge the gap towards practical applications by introducing a consideration of streaming data. However, this still falls short of truly representing real-world applications. This paper defines the setting of Real-world Real-time Online Unsupervised Domain Adaptation (R$^2$OUDA) for Person Re-identification. The R$^2$OUDA setting sets the stage for true real-world real-time OUDA, bringing to light four major limitations found in real-world applications that are often neglected in current research: system generated person images, subset distribution selection, time-based data stream segmentation, and a segment-based time constraint. To address all aspects of this new R$^2$OUDA setting, this paper further proposes Real-World Real-Time Online Streaming Mutual Mean-Teaching (R$^2$MMT), a novel multi-camera system for real-world person re-identification. Taking a popular person re-identification dataset, R$^2$MMT was used to construct over 100 data subsets and train more than 3000 models, exploring the breadth of the R$^2$OUDA setting to understand the training time and accuracy trade-offs and limitations for real-world applications. R$^2$MMT, a real-world system able to respect the strict constraints of the proposed R$^2$OUDA setting, achieves accuracies within 0.1% of comparable OUDA methods that cannot be applied directly to real-world applications.


Introduction
Person re-identification (ReID) is the task of matching a person in an image with other instances of that person in other images, either from the same camera or a different one. More specifically, it is associating a person's query with its match in a gallery of persons [46]. Person ReID is a common task in many real-world applications. Such applications include video surveillance (e.g. determining when unauthorized people are present in an area), public safety (e.g. understanding pedestrian motion to avoid accidents), and smart health (e.g. mobility assessment and fall detection for seniors needing assistance). Thus, achieving accurate and robust person ReID for any environment is an important research goal for the community.
Many methods have been developed for person ReID [18,41,52,53], and many high quality datasets have been created for the task [25,35,42,49,50]. Deep learning approaches have been able to achieve incredible accuracies, nearly reaching saturation in some cases [32,40,43,55]. However, person ReID is a highly context-specific task, and models trained on one dataset often fail to perform well on others [46]. Unsupervised Domain Adaptation (UDA) has been studied to combat this domain shift [2,8,28,36,42,46]. In UDA, initial training is performed on the labeled data of the source domain, and then inference is done in a different target domain. UDA methods generally achieve lower accuracies than State-of-the-Art (SotA) deep learning approaches that train directly on the target domain.
One common thread among these approaches is the reliance on having the entirety of the target domain available at training time. While this is convenient for research, many practical applications do not have unrestricted access to the entire target domain. Recently, [33] introduced the setting of Online Unsupervised Domain Adaptation (OUDA). OUDA specifies that data from the target domain can only be accessed through a data stream, bringing research more in line with real-world applications. OUDA adopts a batch-based relaxation [9] where different identities are separated among batches to simulate streaming data. OUDA also argues that confidentiality regulations make it such that many real-world applications can only store data for a limited amount of time, applying a restriction that image data cannot be stored beyond the batch in which it was collected.
Table 1 shows the challenges of real-world applications, and how UDA and OUDA fail to fully address them. Like UDA before it, OUDA uses hand-crafted person ReID datasets for the target domain. Not only is the data stream only simulated, but the provided person images were hand selected by the creators of the dataset. In a real-world system, person images need to be generated by the system itself, creating a layer of noise not present in hand-crafted datasets. Further, by using hand-crafted datasets, the distribution of person images is guaranteed to be suitable for training. Specifically, most person ReID datasets tend to have a fairly uniform distribution, having around the same number of person images for each identity [27]. However, in real-world applications, there is no guarantee that person images generated from streaming data will form a uniform distribution in identities. There is also no guarantee that every identity in the dataset will be available for training.
To bring the field closer to the real world, this paper proposes Real-World Real-Time Online Unsupervised Domain Adaptation (R$^2$OUDA), a setting designed to address the challenges found in real-world applications, as seen in Table 1. R$^2$OUDA defines four major considerations beyond the OUDA setting needed to develop systems for the real world. First, R$^2$OUDA considers that person images must be generated algorithmically from streaming data. Second, the distribution of data to be used in training must also be determined algorithmically. Third, R$^2$OUDA expands the batch-based relaxation [9] of online learning to use time segments, relating the conceptual mini-batch to the real-world notion of time inherent in streaming data. Fourth, R$^2$OUDA defines a time constraint such that the time spent training a single time segment cannot interfere with the training for subsequent time segments.
To address all aspects of the new R$^2$OUDA setting, this paper further proposes Real-World Real-Time Online Streaming Mutual Mean-Teaching (R$^2$MMT). R$^2$MMT is an end-to-end multi-camera system designed for real-world person ReID. Using object detection, pedestrian tracking, human pose estimation, and a novel approach for Subset Distribution Selection (SDS), R$^2$MMT is able to generate person crops directly from a data stream, filter them based on representation quality, and create a subset for training with a suitable distribution. To show the viability of R$^2$MMT to meet the challenges of real-world applications, and to explore the breadth of the R$^2$OUDA setting, an exhaustive set of experiments was conducted on the popular and challenging DukeMTMC dataset [35]. Using R$^2$MMT, over 100 data subsets were created and more than 3000 models were trained, capturing the trade-offs and limitations of real-world applications and the R$^2$OUDA setting. R$^2$MMT is a real-world system that can meet the demanding requirements of the proposed R$^2$OUDA setting, and is able to achieve over 73% Top-1 accuracy on DukeMTMC-reid, within 0.1% of comparable OUDA methods that cannot be directly applied for real-world applications.
To summarize, this paper's contributions are as follows:
- We define the setting of Real-World Real-Time Online Unsupervised Domain Adaptation, accounting for the challenges of real-world applications and bridging the gap between research and application.
- We propose Real-World Real-Time Online Streaming Mutual Mean-Teaching, a novel end-to-end multi-camera person ReID system designed to meet the challenges of R$^2$OUDA and real-world applications.
- We perform exhaustive experimentation, creating over 100 data subsets and training over 3000 models, to explore the breadth of the R$^2$OUDA setting and understand the trade-offs and limitations of real-world applications.

Related Work
The UDA setting for person ReID has been extensively explored by the research community [24,36,46,51]. In general, there are two main categories of algorithms used to perform UDA for person ReID: style transfer methods and target domain clustering methods.

Style Transfer
Style transfer based methods generally use Generative Adversarial Networks (GANs) [15] to perform image-to-image translation [20], modifying images from the source domain to look like the target domain without affecting the context of the original images. [4] uses self-similarity and domain-dissimilarity to ensure transferred images maintain cues to the original identity without matching to other identities in the target domain, while [14] introduces an online relation-consistency regularization term to ensure relations of the source domain are kept after transfer to the target domain. [28] separates transfers into factor-wise sub-transfers, across illumination, resolution, and camera view, to better fit the source images into the target domain. [2] uses a dual conditional GAN to transfer source domain images to multiple styles in the target domain, creating a multitude of training instances for each source identity. [42] uses a cycle consistent loss [54] with an emphasis on the foreground to better maintain identities between styles. [19] looks at domain shift as background shift and uses a GAN to remove backgrounds without damaging foregrounds, while a densely associated 2-stream network integrates identity related cues present in backgrounds.

Target Domain Clustering
Target domain clustering approaches focus on using clustering algorithms to group features of the target domain for use as labels to fine tune a neural network pre-trained on the source domain [7]. This is usually done in an iterative fashion, where clustering is performed between training epochs to update the group labels as the model learns. [45] proposes using a dynamic graph matching framework to better handle large cross-camera variations. [10] introduces a self-similarity group to leverage part-based similarity to build clusters from different camera views. [27] utilizes a diversity regularization term to enforce a uniform distribution among the sizes of clusters. [13] introduces hybrid memory to dynamically generate instance-level supervisory signal for feature representation learning. [11] builds on [38], using two teacher models and their temporally averaged weights to produce soft pseudo labels for target domain clustering. [3] utilizes both target domain clustering and adversarial learning to create camera invariant features and improve target domain feature learning.

Online Unsupervised Domain Adaptation
While Online Unsupervised Domain Adaptation has been explored for other AI tasks [6,17,23,29,30,39,47], it was first defined for the field of person ReID in [33].
OUDA for Person ReID aims to create a practical online setting similar to that found in practical applications. OUDA builds upon the UDA setting by adding two considerations. First, data from the target domain is accessed via a data stream and not available all at once. Second, due to confidentiality concerns common in many countries, data from the target domain can only be stored for a limited time and only model parameters trained on that data may be persistent.
The proposed setting of Real-World Real-Time Online Unsupervised Domain Adaptation, building off OUDA [33], considers that we have access to a completely annotated source dataset $D_S$ as well as partial access to an unlabeled target dataset $D_T$ in the domain of our target application. In contrast to standard UDA, in both OUDA and R$^2$OUDA the data from $D_T$ is only accessible as an online stream of data. Whereas both UDA and OUDA use person crops from hand crafted datasets, R$^2$OUDA specifies that person crops from $D_T$ must be generated algorithmically from the data stream. This reflects how data is gathered in the real world. Where hand selected crops from datasets are generally highly representative, crops generated from a data stream will have varying levels of quality. This introduces noise in $D_T$, both in quality and in the inevitable missed detections, which needs to be accounted for.
Additionally, hand crafted datasets choose person images to fit a distribution suitable for training. However, since crops in R$^2$OUDA are generated from streaming data, such a distribution cannot be assumed. This leads to the second consideration of R$^2$OUDA, that the distribution of data to be used in training must be determined algorithmically. Instead of relying on a predefined set of person images, systems must generate their own data subset, determining its size and distribution appropriately. This also reflects the real world, as the amount and distribution of person crops that an application will collect are rarely known beforehand.
Continuing with the batch-based relaxation [9] of the online learning scenario proposed in [33], we further introduce a time constraint for R$^2$OUDA. First, instead of separating our "mini-batches" ("tasks" as defined in [33]) across identities, since R$^2$OUDA requires actual streaming data, the data stream is separated into discrete time segments. We consider that for a chosen time segment length τ, the streaming data will be divided into equal, non-overlapping time segments of length τ whose combined contents are equivalent to the original data stream.
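The time segmentation described above can be sketched in a few lines of Python. This is only an illustration: the function name `segment_stream`, the `(timestamp, item)` event format, and the choice of minutes as the time unit are assumptions made for the example, not part of the setting's definition.

```python
from collections import defaultdict


def segment_stream(events, tau):
    """Group (timestamp, item) pairs into non-overlapping segments of length tau.

    Segment i covers the half-open interval [i * tau, (i + 1) * tau), so every
    event lands in exactly one segment and the union of all segments is the
    original stream.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    segments = defaultdict(list)
    for timestamp, item in events:
        segments[int(timestamp // tau)].append(item)
    return dict(segments)


# Hypothetical stream: timestamps in minutes, items are crop identifiers.
events = [(0.5, "a"), (14.9, "b"), (15.0, "c"), (31.2, "d"), (59.9, "e")]
segments = segment_stream(events, tau=15)
```

With τ = 15 the five example events fall into four segments, and no event is dropped or duplicated.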
For R$^2$OUDA, we must account both for applications that run continuously (i.e. the total length of the data stream is infinite) and the fact that, in the real world, computation resources are not unlimited. This leads to the necessity of a time constraint, but one that is not simple to define. Training time is inherently linked to hardware, and there are many techniques to hide latency or increase throughput in system design. As such, we simply define the time constraint such that, for any time segment $\tau_i$, the length of time spent training on data collected during $\tau_i$ must be such to not interfere with the training for the data collected during $\tau_{i+1}$. This is to prevent the training time deficit from increasing infinitely as i increases.
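To make the growing training deficit concrete, the following Python sketch simulates a single training worker under a deliberately simplified model. It assumes that the job for segment i becomes available when the segment ends, that jobs run one at a time in order, and that any separate selection stage is ignored. The function names are hypothetical.

```python
def training_backlog(train_minutes, tau):
    """Return the backlog (in minutes) left after each segment's training job.

    Job i becomes available at (i + 1) * tau, when segment i ends. The backlog
    of job i is how far its finish time runs past (i + 2) * tau, the moment
    the next job becomes available.
    """
    finish = 0.0
    backlog = []
    for i, duration in enumerate(train_minutes):
        start = max(finish, (i + 1) * tau)
        finish = start + duration
        backlog.append(max(0.0, finish - (i + 2) * tau))
    return backlog


def meets_time_constraint(train_minutes, tau):
    """True when no training job delays the job of the following segment."""
    return all(b == 0.0 for b in training_backlog(train_minutes, tau))


on_time = training_backlog([18, 19, 20], tau=20)
too_slow = training_backlog([25, 25, 25], tau=20)
```

In the second example every job takes five minutes longer than a segment, so the backlog grows by five minutes per segment and never recovers, which is the situation the time constraint is meant to rule out.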
In summary, R$^2$OUDA introduces four new considerations to better match real-world applications:
- Person crops must be generated algorithmically from the data stream.
- The distribution of data used in training must be determined algorithmically.
- The data stream is divided into discrete time segments of length τ.
- The time spent training on one time segment must not interfere with training for subsequent time segments.

On the Local Node, YOLOv5 [22] is used as an object detector to find people in the video stream. Image crops are created for each person and sent to both a pose estimator (HRNet [37]) and a ReID feature extractor (ResNet-50 [16]). Coordinates for each person and features generated by the feature extractor are sent to a tracker [44] for local ReID. Afterward, feature and crop selection are performed to ensure that features and person crops sent to the Global Node for global ReID and crop collection are highly representative. This process utilizes person bounding box coordinates from the tracker to filter out any persons that have significant overlap (IoU ≥ 0.3) with other persons. This limits the number of crops used for training and features used for ReID that contain multiple persons. The pose estimator is used to determine the quality of the features themselves. We reason that if a highly representative feature is present, then poses generated from the person crop should be of high confidence, while the number of keypoints present can help determine if there is significant occlusion or cutoff. Only crops and features with poses containing 15 or more keypoints (out of 17 total [26]) with at least 50% confidence are sent to the Global Node.
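The two selection rules above can be illustrated with a minimal Python sketch. The detection format (a dict with a `box` and a list of 17 per-keypoint confidences) and the function names are assumptions for the example. The sketch also assumes one reading of the keypoint rule, namely that at least 15 keypoints must each have confidence of 0.5 or more.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def select_crops(detections, iou_limit=0.3, min_keypoints=15, min_conf=0.5):
    """Return indices of detections that pass the overlap and pose filters."""
    kept = []
    for i, det in enumerate(detections):
        # Reject persons that overlap significantly with any other person.
        overlaps = any(
            iou(det["box"], other["box"]) >= iou_limit
            for j, other in enumerate(detections)
            if j != i
        )
        # Count keypoints whose confidence reaches the threshold.
        confident = sum(c >= min_conf for c in det["keypoint_conf"])
        if not overlaps and confident >= min_keypoints:
            kept.append(i)
    return kept


# Hypothetical frame with four detections.
detections = [
    {"box": (0, 0, 10, 20), "keypoint_conf": [0.9] * 17},
    {"box": (100, 0, 110, 20), "keypoint_conf": [0.9] * 14 + [0.2] * 3},
    {"box": (200, 0, 210, 20), "keypoint_conf": [0.9] * 17},
    {"box": (204, 0, 214, 20), "keypoint_conf": [0.9] * 17},
]
kept = select_crops(detections)
```

Only the first detection survives: the second has just 14 confident keypoints, and the last two overlap each other with an IoU of about 0.43.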
On the Global Node, local identities and features are received from the Local Nodes and sent to a matching algorithm. This matching algorithm, as described in [31], performs global (i.e. multi-camera) ReID. Concurrently, person crops from all cameras are collected for a single time segment. Generally, far more features will be collected than can reasonably be used during training. For instance, when DukeMTMC-Video [35] is sampled every frame, the system produces over 4 million crops that pass feature selection. To reduce redundancy and computation, R$^2$MMT samples crops for selection once every 60 frames.
After all person crops from a single time segment are collected, the Subset Distribution Selection algorithm is used to create a subset that maintains a distribution and number of crops suitable for training. R$^2$MMT uses an SDS algorithm based on the metric facility location problem [34]. Given a number of features in a metric space, we wish to find a subset of k features such that the minimum distance between any two features within the subset is maximized. However, this problem is known to be NP-hard [21], making it unsuitable for our real-world applications. R$^2$MMT instead uses a greedy implementation of the algorithm proven to be Ω(log k)-competitive with the optimal solution while being significantly faster, especially for larger sets of data [1]. For ease of readability, we adopt the nomenclature of K to mean the number of instances per identity. Therefore the total number of person crops in a subset, k, is equal to the number of identities in the dataset times K. To further reduce complexity, SDS is performed on the data from each camera individually, and their results are combined to form the complete subset.
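As a rough illustration of greedy selection for this max-min objective, the sketch below uses farthest-first traversal, a common greedy heuristic for dispersion problems. Whether this matches the greedy algorithm of [1] is an assumption. The function name, the Euclidean metric, and the choice of index 0 as the starting point are also illustrative.

```python
import math


def greedy_subset(features, k):
    """Pick k feature indices by farthest-first traversal.

    Starts from index 0, then repeatedly adds the unselected feature whose
    distance to its nearest already-selected feature is largest.
    """
    if k <= 0 or not features:
        return []
    k = min(k, len(features))
    selected = [0]
    chosen = {0}
    # nearest[i] is the distance from feature i to the closest selected feature.
    nearest = [math.dist(f, features[0]) for f in features]
    while len(selected) < k:
        candidate = max(
            (i for i in range(len(features)) if i not in chosen),
            key=nearest.__getitem__,
        )
        selected.append(candidate)
        chosen.add(candidate)
        for i, f in enumerate(features):
            nearest[i] = min(nearest[i], math.dist(f, features[candidate]))
    return selected


# Hypothetical 2-D features: two tight pairs and one isolated point.
features = [(0, 0), (0.1, 0), (10, 0), (10, 0.1), (5, 5)]
subset = greedy_subset(features, 3)
```

Each round costs one pass over the features, so selecting k of n features takes O(nk) distance evaluations. In the example the selection keeps one point from each tight pair plus the isolated point, skipping the near-duplicates.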
Once the training subset is created, domain adaptation is performed using Mutual-Mean Teaching (MMT) [11]. R$^2$MMT follows the training methodology described in [11], except that epochs and iterations are variable. Clustering is done using DBSCAN [5], as GPU acceleration allows it to perform much faster than CPU based approaches. Exact training parameters, both for pretraining on the source domain and domain transfer on the target domain, are as detailed in [11] unless otherwise noted.
Both SDS and training are time consuming, particularly when dealing with large amounts of data. To meet the time constraint of the R$^2$OUDA setting, R$^2$MMT utilizes a pipelined processing model, taking advantage of parallel computing resources while hiding the latency of the aforementioned tasks. An illustration of this pipelined approach can be seen in Fig. 2. Crop collection, SDS, and training are separated into their own pipeline stages. This means that while a model collects data for the current time segment, SDS on that data will occur the following time segment, and the training for that subset will occur the time segment after that. More formally, during a single time segment $T_N$, a model trained on data from $T_{N-3}$ is used to collect data from time segment $T_N$, while subset distribution selection is performed on data collected during $T_{N-1}$ and another network is being trained on a subset created from data from $T_{N-2}$. All of these processes will finish before $T_{N+1}$. This means there will always be a latency of two time segments between collection and inference for a single time segment. However, due to the pipeline structure, training throughput remains at a rate of one time segment per time segment. This satisfies the time constraint of R$^2$OUDA.
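The stage offsets described above can be written down as a small Python sketch. The function name and the dictionary keys are hypothetical. `None` marks a stage that is still idle while the pipeline fills.

```python
def pipeline_schedule(n):
    """Return which segment's data each stage handles during segment n."""

    def or_none(index):
        # Stages have nothing to process until enough segments have elapsed.
        return index if index >= 0 else None

    return {
        "collect": n,                       # crops gathered during T_N
        "sds": or_none(n - 1),              # subset selection on T_{N-1} data
        "train": or_none(n - 2),            # training on the T_{N-2} subset
        "inference_model": or_none(n - 3),  # deployed model was trained on T_{N-3} data
    }


first = pipeline_schedule(0)
steady = pipeline_schedule(5)
```

In steady state all stages are busy on different segments at once, which is why throughput stays at one time segment per time segment even though each segment's data takes several segments to reach the deployed model.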

Experimental Results
To explore the setting of R$^2$OUDA, we select the Market-1501 dataset [50] as the source domain and the DukeMTMC dataset [35] as the target domain. The DukeMTMC dataset is desirable as a target domain because it has both a video dataset (DukeMTMC-video) and a hand crafted person ReID dataset (DukeMTMC-reid), both in the same domain. The video dataset is required in order to satisfy the streaming data constraint of the R$^2$OUDA setting. The hand crafted ReID dataset brings two benefits. First, it allows us to directly observe the effect of noisy system generated crops compared to hand selected person images when used for training. Second, testing on the ReID dataset allows direct comparison with works done in the UDA and OUDA space. As such, all our Top-1 accuracies are reported on the DukeMTMC-reid dataset. Similarly, when determining subset size, we treat the number of identities for both DukeMTMC-reid and DukeMTMC-video to be 702, as described in [35]. The number of person crops in a subset, k, is always equal to K × 702.
For all experiments, R$^2$MMT is used to perform domain adaptation. Parameters in all experiments are the same as in [11], except where noted otherwise. All Local Nodes are run on a single server with two AMD EPYC 7513 CPUs, 256 GB of RAM, and three Nvidia V100 GPUs. The Global Node is run on a workstation with an AMD Threadripper Pro 3975WX CPU, 256 GB RAM, and three Nvidia RTX A6000 GPUs. All timing results presented in this section are using this Global Node.

Subset Distribution Selection
We first explore the effect of using our baseline Subset Distribution Selection algorithm for training on the DukeMTMC-reid dataset. By using hand selected person crops from the dataset, we remove the effect of noise generated by our system and single out the impact of our SDS algorithm and the reduction in amount of data on domain adaptation. We vary the number of person images per identity K, iterations per epoch I, and total epochs E as shown below. Note that using the entire DukeMTMC-reid dataset would be equivalent to K = 25.
K ∈ [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
I ∈ [100, 250, 500, 750, 1000, 1500]

These variable ranges lead to 240 training permutations, which is difficult to list in a single table. Instead, the results are plotted in a three-dimensional space and can be seen in Fig. 3. Training Time and Top-1 make up the x and y axes, Epochs are the z axis, Iterations are noted by color, and k is indicated by size, with bigger circles representing higher values of k. As the purpose of these experiments is to focus on the effects of our SDS algorithm, the system pipeline described in Section 4 is ignored and timing results count SDS and training sequentially. More detailed information on these experiments can be found in the supplementary materials.
From these graphs, we can understand the general trend of the data. Intuitively, we see a fairly linear trend where more data generally results in higher Top-1 accuracy. Likewise, more iterations per epoch and more epochs also tend to result in higher accuracy. Interestingly, with lower values of k we see the reverse effect; more time spent training results in decreased accuracy, sometimes even below the pre-trained accuracy of 42.0%. In general, at least 6 person images per identity are needed to consistently learn, while we start to see diminishing returns at around 16 person images per identity. The top result occurs when K = 20, I = 1500, and E = 5, achieving a Top-1 accuracy of 74.55% with a training time of 82 minutes. This is only 3.5% less than what comparable algorithms are able to achieve in the UDA setting [11] and over 2% greater than the same algorithm in the OUDA setting [33]. When using the same hardware, R$^2$MMT is 2.6× faster than its UDA counterpart.

System Generated Data
As explained in Section 3, one of the requirements of the R$^2$OUDA setting is that person crops must be generated algorithmically from a data stream. As such, it is necessary to explore the effects of the noise this introduces. The structure of these experiments is exactly the same as in Section 5.1, except that instead of using DukeMTMC-reid, R$^2$MMT generates data from the DukeMTMC-video dataset. Similar to Section 5.1, we ignore the system pipeline and focus on the effects of the generated data. Based on the larger amount of data available in DukeMTMC-video, the ranges for our experimental variables are adjusted as shown below. Using all generated data would be equivalent to K = 99.

K ∈ [16, 18, 20, 25, 30, 40]
I ∈ [100, 250, 500, 1000, 1500]

The results of this exploration can be seen in Fig. 4, with more details available in the supplementary materials. Axes are identical to Fig. 3, with color and size representing iterations and k respectively. These graphs show a somewhat similar trend as in Section 5.1 with some interesting deviations. While the trend starts off with accuracy increasing as k gets larger, there is a sharp decrease in accuracy when k increases beyond a certain point. The scale of the decrease, as well as how early it occurs, lessens with both iterations and epochs. This is likely a byproduct of how many identities are present in DukeMTMC-video. While DukeMTMC only labels a total of 1404 identities, our system is able to detect far more. Increasing iterations has such a drastic effect here because it determines how many of these identities are seen during an epoch, and how often. Further increasing iterations and epochs could help mitigate this, but would also increase overall training time. This, combined with the fact that more epochs and more iterations always result in higher accuracy, suggests that accuracy saturation has not been reached here, and the main limiting factor is training time. The highest accuracy achieved on this noisy data was a Top-1 of 69.34%, with K = 20, I = 1500, E = 5, and a total training time of just under 57 minutes. This is notably worse than both the 74.55% achieved in Section 5.1 and the 72.3% MMT achieves in the OUDA setting [33]. This demonstrates the extreme impact noisy data can have on unsupervised domain adaptation, and why the extra considerations of the R$^2$OUDA setting are a necessity when designing algorithms for real-world applications.

R 2 MMT
Finally, we make the first attempt at addressing the R$^2$OUDA setting. An exhaustive set of experiments is conducted with R$^2$MMT, producing a fully functional, end-to-end system that meets all the requirements of the R$^2$OUDA setting. R$^2$MMT generates person crops from a stream of data, uses SDS to construct training subsets, operates on the notion of time segments, and must adhere to the strict time constraint outlined in Section 3. A successful implementation will conform to all of those standards while achieving the highest accuracy possible, ideally within range of what was seen in Section 5.1.
One hour of DukeMTMC-video is used as the data stream, split into equal sized continuous segments of size τ. SDS is performed at each time segment on each camera individually, and k refers to the total number of person crops across all training subsets for the full hour. Two methods are used to determine the number of crops needed at each time segment. In the standard method, only data collected in a time segment may be used for training related to that time segment. The second method uses a form of memory, allowing the use of data from the current time segment and previous time segments still in memory. For these experiments, we assume a memory length of up to 60 minutes. Equation 3 and Equation 4 are used to calculate the number of person crops needed from each camera at each time segment, for the standard and memory based methods respectively.
where k is the total number of person crops desired for the training subset over an hour of video stream, $\tau_t$ is a time segment of length τ minutes that begins at τ × t minutes, $C_i$ is the i-th camera, $P(C_i)$ is the percentage of total person crops received from $C_i$ when compared to all cameras over an hour of video, and $P(C_i \cap \tau_t)$ is the percentage of person crops received during $\tau_t$ for $C_i$ compared to all person crops received from $C_i$ over an hour of video.
This ensures the number of person crops selected for a subset from each camera at each time segment is proportional to the number of person crops received. The variable ranges used in these experiments are shown below.
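The proportional allocation can be illustrated for the standard (no memory) method with the Python sketch below. It assumes the per-camera, per-segment count is k multiplied by the two percentages defined above, which simplifies to k times the share of all crops that arrived from that camera in that segment. This assumed form, the function name, the rounding down, and the example counts are all illustrative, and the memory based method is not covered.

```python
def crops_per_camera_segment(counts, k):
    """Allocate k crops across cameras and segments in proportion to arrivals.

    counts[camera][t] is the number of crops received from `camera` during
    segment t. Under the assumed form k * P(C_i) * P(C_i ∩ tau_t), the two
    percentages multiply out to counts[camera][t] / total, so integer
    arithmetic is used to avoid floating-point rounding.
    """
    total = sum(sum(per_segment) for per_segment in counts.values())
    if total == 0:
        return {camera: [0] * len(per_segment) for camera, per_segment in counts.items()}
    return {
        camera: [k * n // total for n in per_segment]
        for camera, per_segment in counts.items()
    }


# Hypothetical crop counts for two cameras over two time segments.
counts = {"cam1": [300, 100], "cam2": [400, 200]}
allocation = crops_per_camera_segment(counts, k=100)
```

Because the allocation rounds down, the selected crops can total slightly fewer than k when the counts do not divide evenly.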
K ∈ [18, 20, 25, 30, 40, 50]
I ∈ [100, 250, 500, 750, 1000, 1500]

This creates over 2500 data points across the two methods, becoming difficult to visualize even in three dimensional space. Fig. 5 displays the distribution of training accuracies for each τ at each time segment. Out of the 864 configurations tested, more than half of them failed to consistently meet the time requirement of R$^2$OUDA and are not included in the statistics. Most notably, all configurations that used memory failed to consistently meet the time requirement when given a τ of 15. When memory is utilized, the time required for SDS greatly increases for successive time segments as more images accumulate. This limits how large k can be, restricting K to 20 or below when τ = 20 and 30 or below when τ = 30. Even without memory, the time constraint proves very limiting. Only when τ = 20 is the entire range of K able to be utilized. For a more fine-grained look at all 2500+ data points in this experiment, please see the supplementary materials.
The data in general follows similar trends as seen in Section 5.1 and Section 5.2, but to more of an extreme. In addition to disqualifying several configurations outright, the segmented data stream and time constraint generally mean R$^2$MMT has less data to work with during any given training. Unlike in the previous experiments, the time constraint prevents the system from just throwing more data and more training at the problem. Instead, a balance must be found. We see an overall increase in top accuracies when τ increases, both in standard and memory configurations. Top accuracies also increase over time, with one notable exception. When τ = 15, accuracy actually drops in the final time segment. This is due to the extremely low amount of data available in that particular time segment.
Another interesting observation can be made by looking at τ = 20 both with and without memory. While the standard R$^2$MMT achieves higher overall top accuracies, the distribution is a lot more varied when compared to R$^2$MMT with memory. Many configurations actually lose accuracy, far more than when memory is present. This suggests that while memory is limiting, it may add stability to training over time. This is further demonstrated when τ = 30. When memory is used the maximum accuracy is lower in the first time segment, being restricted to a lower value of K, but is higher in the second time segment due to the increased range of available data. Fig. 6 shows the best configurations of R$^2$MMT, both with and without memory, for each τ. The overall highest accuracy is achieved with memory when τ = 30, K = 30, E = 5, and I = 500, reaching an impressive 73.2% Top-1. Despite the much harsher requirements of the R$^2$OUDA setting, this is within 0.1% of the best possible accuracy using MMT in the OUDA setting [33]. However, with a τ of 30 it also has a latency of 60 minutes between collecting data and running inference with a model trained on that data. This can be reduced to 30 minutes by changing τ to 15, but then accuracy drops to a disappointing 58.08%. A τ of 20 splits the difference, achieving a final Top-1 of 69.97% while reducing the inference latency to 40 minutes. This is within 4% of our best overall result, and reduces the delay by over 30%.
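The latency and accuracy trade-off quoted above follows from the two-segment pipeline latency, and the arithmetic can be checked with a short Python sketch. The function name is hypothetical, and the accuracy values are the ones reported in the text.

```python
def inference_latency(tau, segments_between=2):
    """Minutes between collecting data and inference with a model trained on it."""
    return segments_between * tau


# Latency for each segment length explored in the experiments.
latencies = {tau: inference_latency(tau) for tau in (15, 20, 30)}

# Relative reduction in latency when moving from tau = 30 to tau = 20.
latency_reduction = 1 - latencies[20] / latencies[30]

# Accuracy given up for that reduction, in percentage points.
accuracy_gap = 73.2 - 69.97
```

Moving from τ = 30 to τ = 20 cuts the latency by one third at a cost of about 3.2 percentage points of Top-1 accuracy.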
The strict time constraint disqualified many of the configurations in Section 5.3. However, if we ignore the time constraint for a moment we see accuracies reaching up to 76.53% when τ = 15, K = 40, E = 5, and I = 1500 in a system with memory, putting it within 1.5% of MMT in the UDA setting [11]. With further optimization or more powerful hardware, R$^2$MMT might be able to achieve higher accuracies with decreased latency between collection and inference. This shows that there is a lot of room for improvement and growth in the R$^2$OUDA setting. The explorations in this paper can serve as a guideline for future works.

Conclusion
This paper proposed the setting of R$^2$OUDA, to better represent the unique challenges of real-world applications. R$^2$MMT was introduced as the first attempt at a real-world, end-to-end system that can address all the demands of the R$^2$OUDA setting. An exhaustive set of experiments was conducted, using R$^2$MMT to create over 100 data subsets and train more than 3000 models, exploring the breadth of the R$^2$OUDA setting. While meeting the harsh requirements of R$^2$OUDA, R$^2$MMT was able to achieve over 73% Top-1 accuracy, reaching within 0.1% of comparable SotA OUDA approaches that cannot be directly applied to real-world applications.

Fig. 3: Results exploring SDS on the hand crafted DukeMTMC-reid dataset [35]. (a) and (b) show two views of the results plotted in three-dimensional space, while (c) shows a two-dimensional view when E = 5. Larger circles represent larger values of k.

Fig. 6: Best results for each system configuration. Dashed lines (--) represent standard configurations. Solid lines (-) represent configurations with memory. Green, blue, and purple denote τ values of 15, 20, and 30 respectively.

Table 1: Challenges of Real-World Applications and whether they are addressed in the UDA, OUDA, and R$^2$OUDA settings.
†Streaming data is simulated.