Orthogonal Constrained Meta Heuristic Adaptive Multi-View Clustering Over Multi Labeled Categorical Data Analysis

In data mining, clustering is a central technique for real-time data analysis, and the evaluation of attribute representations for clustering remains a key issue in artificial intelligence research. Multi-labeled data carries a large amount of valuable information, which has made attribute evaluation and representation an active topic in multi-labeled categorical data analysis. Multi-view (multi-dimensional) clustering combines complementary information from different views to produce better clustering results under varying conditions. Many multi-view clustering techniques have been proposed, but they yield only a single clustering of the input data. Because of its multiplicity, multi-view data can admit several different groupings, each reasonable from the perspective of a particular set of attributes. Finding measurable and reasonable clusterings of multi-view labeled data is therefore still a challenging task. In this paper we propose a novel approach, Orthogonal Constrained Meta Heuristic Adaptive Multi-View Clustering (OCMHAMVC), which represents the data as clusters over different categories. Given multi-labeled data, the proposed approach first derives a low-dimensional representation using an optimized matrix factorization (OMF) method and groups samples with similar labels into prototype clusters for each view. It then represents the data under a desirable orthonormality constraint, using an adaptive heuristic to combine complementary information from the different views, and we analyze the computational complexity of this representation. Experimental results on large multi-view data sets show that the proposed approach is scalable and efficient in comparison with traditional multi-view clustering approaches.


Introduction
With the rapid adoption of computing technology, huge amounts of data are collected in research areas such as image processing, computer vision with data fusion, and real-time natural language processing. Such data consist of high-dimensional features with complex structures, and different dimensions capture diverse types of features [1][2][3][4][5]. High-dimensional data are rich in information but suffer from the curse of dimensionality, so managing them at scale requires reducing the number of dimensions. Dimensionality reduction is an efficient solution: it maps the input data to a low-dimensional space and recovers a low-dimensional representation of the hidden structure in the input. Various dimensionality reduction methods have been used for multi-labeled data; among them, matrix factorization has become a research hotspot because it is theoretically well understood and easy to implement. Independent component analysis (ICA) [6], principal component analysis (PCA) [7], vector quantization (VQ) [8] and related methods are the main matrix factorization approaches. They form a low-rank approximation of the high-dimensional data matrix, from which low-dimensional attribute relations can be extracted and represented efficiently. None of these approaches constrains the sign of the matrix elements during decomposition, so the resulting low-dimensional representation may contain negative elements.
Recently, deep learning has shown excellent performance in feature representation tasks [18][19][20]. Consequently, many researchers have introduced deep learning into matrix factorization and proposed numerous deep feature representation methods [21][22][23][24][25][26][27]. J. H. Ahn et al. [21] proposed multilayer nonnegative matrix factorization (MNMF).
Unlike conventional NMF-based approaches, MNMF decomposes the coefficient matrix several times to obtain a hierarchical part-based representation that can extract deep hierarchical features from the original data. In addition, to broaden the range of applications, Trigeorgis et al. [22] combined semi-NMF with deep factorization and proposed a deep semi-non-negative matrix factorization (deep semi-NMF) method. However, both MNMF and deep semi-NMF consider only the deep decomposition of the coefficient matrix for the training data. For new data, the basis matrix is used to obtain the deep low-dimensional representation, so the basis matrix directly affects the quality of that representation. To obtain an accurate deep low-dimensional representation of the original data matrix, Zhao et al. [23] applied deep factorization to the basis matrix and proposed a deep NMF method based on basis image learning.
A related problem arises when data are collected from different domains or sources, so that the same objects must be represented in several views. Multi-labeled clustering addresses this by representing the data along different dimensions according to their features and the relations among those features. Various multi-view clustering approaches have been introduced, but they do not address optimal dimensionality reduction when representing multi-labeled data with associated features in a supervised setting. Supervised learning relies on label information attached to the features, which makes the text easy to identify; it is therefore also necessary to handle multi-label data sources in an unsupervised manner. Finding measurable and reasonable clusterings of multi-view labeled data is thus still a challenging task, and we propose a novel approach, Orthogonal Constrained Meta Heuristic Adaptive Multi-View Clustering (OCMHAMVC), to represent the data as clusters over different categories. Given multi-labeled data, the proposed approach first derives a low-dimensional representation using an optimized matrix factorization (OMF) method and groups samples with similar labels into prototype clusters of the multi-dimensional data.
The main objectives of the proposed approach are as follows: a) We propose an unsupervised multi-labeled clustering method based on orthonormality-constrained matrix factorization (the combination of a normalization constraint and an orthogonal constraint), which imposes regularity on the representations of the different views of the data. b) We formulate an objective model together with an optimization procedure that reaches minima of this model. c) We perform experiments on several real-world data sets, which demonstrate the efficiency of the proposed approach relative to traditional approaches in terms of accuracy and other measures on multi-labeled cluster data.

Review of Related Work
This section reviews traditional clustering approaches that have been applied to multi-labeled data.
Many clustering methods for single-view data have been proposed. These existing single-view clustering approaches can be broadly divided into three categories: kernel clustering approaches [22][23][24], spectral clustering approaches [25][26][27], and subspace clustering approaches [28,29]. Kernel clustering approaches typically use kernel functions to map the original inputs to a high-dimensional kernel space where clustering can be performed effectively. For example, M. Tong et al. [25] use a Gaussian kernel to map the inputs to a feature space and incorporate pairwise constraints into kernel learning to guide the clustering process. Li et al. [23] use a group of pre-specified kernels to map the input data and optimize the kernel combination locally to improve clustering performance. Y. Li et al. [29] learn an optimal neighborhood kernel from a group of pre-specified kernels and strengthen the interaction between kernel learning and clustering. Spectral clustering approaches usually construct an affinity graph on the data samples to characterize data similarity, and exploit the eigenstructure of this affinity graph to obtain the clustering result. Existing spectral clustering approaches [25][26][27] focus on how to construct this affinity graph.
For example, R. Zhang et al. [32] use pairwise constraint information to construct the intrinsic graph and the penalty graph, respectively. D.A. Spielman et al. [36] exploit a discriminative feature subspace to build a robust affinity graph that improves spectral clustering performance. D. Hidru et al. [37] build a hierarchical bipartite graph by exploiting multi-layer anchors with a pyramid-style structure. Besides the above work, the support vector machine (SVM) has also been applied effectively within the clustering framework. C. Xu, Z. Guan et al. [40] propose a twin support vector machine for clustering (TWSVC), which seeks cluster planes that are close to the points of the corresponding cluster and far from the points of the remaining clusters. Recently, many studies on multi-view clustering (MVC) have achieved promising performance based on Non-negative Matrix Factorization (NMF) and its variants, because the non-negativity constraints allow for better interpretability (Guan et al. 2020; Trigeorgis et al. 2018). The general idea is to seek a common latent factor through non-negative matrix factorization across multi-view data (Liu et al. 2018; Zhang et al. 2019). Semi Non-negative Matrix Factorization (Semi-NMF), one of the most popular variants of NMF, extends NMF by relaxing the factorized basis matrix to take real values. This relaxation gives Semi-NMF a broader range of applications than NMF. Besides exploring Semi-NMF in an MVC application for the first time, our method differs from current NMF-based MVC methods in another respect: through the deep Semi-NMF structure, we push data samples from the same class closer together layer by layer. We borrow this idea from deep learning (Bengio 2019), hence the approach has that flavor. Note that the proposed method is not the same as the existing deep auto-encoder-based MVC approaches (

Preliminaries
This section describes the preliminaries used in the proposed approach, together with the relevant operations.
Deep matrix factorization is described by Eqn. (3). For multiple views, the corresponding objective function is described by Eqn. (5).

b) In-depth factorization of deep matrix indices
Deep matrix factorization is used to explore the complex structure of the data matrix A and to eliminate irrelevant information present in it. The deep optimized matrix factorization is described by Eqn. (7), in which the factors obtained over the m layers represent either the grouped clusters or the individually identified factor matrices. This allows the same groups to be represented across different data sets, each seen from a different view of the multi-labeled data, so deep matrix factorization is applicable to multi-labeled clustering with multiple attribute relations. Based on these preliminaries, we propose a novel heuristic model for multi-labeled data clustering with an augmented matrix formulation.
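To make the layer-wise idea concrete, the following is a minimal sketch of a two-layer non-negative factorization in which the coefficient matrix of one layer is factorized again by the next. It uses the standard multiplicative updates of Lee and Seung as a stand-in and is not the OMF procedure of this paper; the function names `nmf` and `deep_nmf`, the layer ranks, and the toy matrix `X` are illustrative assumptions.

```python
import random

def matmul(A, B):
    """Plain-Python product of two list-of-lists matrices."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frob_error(X, W, H):
    """Squared Frobenius norm of X - W H."""
    R = matmul(W, H)
    return sum((x - r) ** 2 for xr, rr in zip(X, R) for x, r in zip(xr, rr))

def nmf(X, rank, iters=200, seed=0, eps=1e-9):
    """Factorize non-negative X (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H

def deep_nmf(X, ranks, iters=200):
    """Layer-wise scheme (assumed): X ~ W1 H1, then H1 ~ W2 H2, and so on."""
    bases, H = [], X
    for layer, rank in enumerate(ranks):
        W, H = nmf(H, rank, iters=iters, seed=layer)
        bases.append(W)
    return bases, H

# Toy non-negative data matrix (5 features x 6 samples), purely illustrative.
X = [[5, 4, 0, 1, 0, 0],
     [4, 5, 1, 0, 0, 1],
     [0, 1, 5, 4, 0, 0],
     [1, 0, 4, 5, 1, 0],
     [0, 0, 1, 0, 5, 4]]
bases, H_top = deep_nmf(X, ranks=[3, 2])
```

Here `bases` holds one basis matrix per layer and `H_top` is the final 2 x 6 coefficient matrix, whose columns could serve as low-dimensional sample representations. Each layer is fitted greedily; a joint fine-tuning pass over all layers, as used in deep semi-NMF, is deliberately left out of this sketch.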

Proposed Model
This section presents the proposed unsupervised clustering approach, OCMHAMVC, which combines orthogonal constraints with a co-regularization framework. We first describe the multi-objective functions of the proposed approach and then develop an optimization procedure to cluster multi-labeled data. Finally, we analyze the computational complexity of the proposed method.
Let $C$ be the label matrix of the $l$ labeled samples, whose entry $c_{ij}$ relates the $i$th sample to the $j$th class ($c_{ij} = 0$ if the sample does not belong to that class). The remaining $n - l$ unlabeled samples are associated with an $(n - l) \times (n - l)$ identity matrix. If the number of unlabeled samples exceeds the cluster threshold, the identity matrix is used to derive label constraints for the samples whose labels are unknown. For example, when sample $(l+1)$ is not assigned to any cluster, the label matrix is formed as described in Eqn.
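The block structure described above can be sketched as follows, assuming the layout commonly used in constrained matrix factorization: an indicator block for the labeled samples placed above an identity block for the unlabeled ones. The exact form of the paper's label matrix is not recoverable from the text, so the function name `label_constraint_matrix`, the use of `None` for unlabeled samples, and the requirement that labeled samples come first are assumptions.

```python
def label_constraint_matrix(labels, n_classes):
    """Build an n x (n_classes + n - l) label constraint matrix.

    labels[i] is a class index for a labeled sample and None for an
    unlabeled one; the l labeled samples are assumed to come first.
    """
    n = len(labels)
    l = sum(1 for y in labels if y is not None)
    if any(y is None for y in labels[:l]):
        raise ValueError("labeled samples must precede unlabeled ones")
    width = n_classes + (n - l)
    A = [[0] * width for _ in range(n)]
    for i, y in enumerate(labels[:l]):
        A[i][y] = 1                    # indicator block C (l x n_classes)
    for k in range(n - l):
        A[l + k][n_classes + k] = 1    # identity block ((n-l) x (n-l))
    return A

# Three labeled samples (classes 0, 1, 0) followed by two unlabeled ones.
A = label_constraint_matrix([0, 1, 0, None, None], n_classes=2)
```

Rows of labeled samples that share a class are identical, so any representation built through this matrix maps them to the same point, while each unlabeled sample keeps its own free row.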
We introduce an auxiliary matrix Z for each dimension (view) of the sample data. The multi-objective function for the orthogonal matrix formulation is described below; from it we obtain low-dimensional feature representations of the samples in each dimension. For every dimension we seek a desirable feature representation, which has two basic properties: the features should discriminate well between classes, and all selected features should be on a similar scale. To obtain such a representation, we use orthogonal constraints in multi-labeled clustering. The orthogonal constraint ties the low-dimensional feature representation to the associated cluster prototypes and enforces discrimination between classes and attributes. Combining the orthogonal constraints with the joint constraint matrix is described by Eqn. (10). The orthogonal constraint framework is then formulated, and applying the KKT conditions to the multi-labeled attributes yields Eqn. (12). The final multi-objective function is described by Eqn. (13). Based on this formulation, the optimization algorithm follows; it is iterated until the multi-dimensional data clustering converges.
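As an illustration of what an orthonormality constraint asks of a representation matrix V (orthogonal columns, each of unit length, so that V^T V = I), the sketch below orthonormalizes the columns of a small matrix with modified Gram-Schmidt. This is only one simple way to obtain such a matrix and is not the KKT-based update of the proposed method; the names `orthonormalize_columns` and `gram_matrix` are hypothetical.

```python
import math

def orthonormalize_columns(V, tol=1e-12):
    """Return a matrix whose columns form an orthonormal basis of V's columns."""
    basis = []
    for col in zip(*V):
        v = list(col)
        for q in basis:
            # Remove the component of v along the already accepted direction q.
            proj = sum(a * b for a, b in zip(q, v))
            v = [a - proj * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm < tol:
            raise ValueError("columns are linearly dependent")
        basis.append([a / norm for a in v])   # normalization constraint
    return [list(row) for row in zip(*basis)]

def gram_matrix(Q):
    """Compute Q^T Q, which equals the identity when Q is orthonormal."""
    cols = list(zip(*Q))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]

V = [[1.0, 1.0],
     [1.0, 0.0],
     [0.0, 1.0]]
Q = orthonormalize_columns(V)
G = gram_matrix(Q)   # approximately [[1, 0], [0, 1]]
```

The two properties named in the text map directly onto the two steps: subtracting projections makes the features mutually orthogonal (discriminative), and dividing by the norm puts them on the same scale.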

c) Multi labeled Dimensional Clustering
Based on the similarity measure, we formulate multi-label clustering in terms of basic cluster functions. For the first cluster function, we evaluate the average similarity weight between input documents that belong to the same cluster. The architecture for multi-dimensional document clustering is shown in Figure 1. In Eqn. (14), m is the number of documents and n is the number of class labels. We optimize this function to obtain the similarity function given in Eqn. (15). Based on the similarity underlying cluster formation across the multiple multi-label functions, the optimization of multi-labeled clustering with weighted cluster functions is given by Eqn. (19). During this optimization over the different sources, we compute the optimized cluster results and then update the cluster matrices with the similarity measure, iterating over the different labeled inputs until convergence.
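The first cluster function, the average similarity between documents in the same cluster, can be sketched as below. Because Eqns. (14) and (15) are not reproduced here, the unweighted mean over all within-cluster pairs is an assumption, as are the function name `average_intra_cluster_similarity`, the default inner-product similarity, and the convention of scoring a singleton cluster as 1.0.

```python
from itertools import combinations

def dot(u, v):
    """Inner product; equals cosine similarity when u and v have unit length."""
    return sum(a * b for a, b in zip(u, v))

def average_intra_cluster_similarity(docs, assignment, similarity=dot):
    """Map each cluster id to the mean pairwise similarity of its documents."""
    clusters = {}
    for doc, cluster_id in zip(docs, assignment):
        clusters.setdefault(cluster_id, []).append(doc)
    scores = {}
    for cluster_id, members in clusters.items():
        pairs = list(combinations(members, 2))
        if pairs:
            scores[cluster_id] = sum(similarity(u, v) for u, v in pairs) / len(pairs)
        else:
            scores[cluster_id] = 1.0   # singleton cluster (assumed convention)
    return scores

# Four unit-length document vectors, purely illustrative.
docs = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0], [0.8, 0.6]]
good = average_intra_cluster_similarity(docs, [0, 1, 1, 0])
poor = average_intra_cluster_similarity(docs, [0, 0, 1, 1])
```

On this toy input the first assignment groups the more similar documents together, so its per-cluster averages are higher than those of the second assignment; an optimizer over assignments would prefer it.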

Experimental Evaluation of OCMHAMVC
This section evaluates the performance of OCMHAMVC in comparison with traditional approaches across several measures. To verify the advantages of the proposed approach on experimental document data, OCMHAMVC uses a similarity measure with different weighted cluster functions for multi-labeled document clustering. The similarity-based cluster functions rely on Euclidean distance, cosine similarity, and the Jaccard coefficient.
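For reference, the three measures named above have the following textbook definitions. The sketch assumes documents are given as term-weight vectors for the first two measures and as sets of terms for the third; how the paper weights or combines them is not specified, so nothing here should be read as its exact configuration.

```python
import math

def euclidean_distance(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def jaccard_coefficient(a, b):
    """Size of the intersection divided by size of the union of two term sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

d = euclidean_distance([0, 0], [3, 4])
c = cosine_similarity([1, 2], [2, 4])
j = jaccard_coefficient({"a", "b", "c"}, {"b", "c", "d"})
```

Euclidean distance is a dissimilarity (smaller means closer), whereas the other two are similarities in [0, 1] for non-negative inputs, so they need to be put on a common orientation before being combined in one weighted function.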

a) Input clustering data
The data used for multi-labeled document clustering consist of real-world benchmark data sets. We previously used the k1b and Reuters 8-10 versions for document clustering; in this experiment we add further benchmark data sets from comprehensive data sources. The downloaded data sets, together with cloud-based data sources, are used to run real-time clustering applications. Mainly, in our proposed approach, we use the BBC Series dataset, the Reuters dataset, the 3sources dataset and the MSRC dataset (http://mlg.ucd.ie/datasets/segment.html., http://mlg.ucd.ie/datasets/3sources.html, http://lig-membres.imag.fr/grimal/data.html, http://www.vision.caltech.edu/Image Datasets/Caltech101/., https://pgram.com/dataset/msrc-v1/. ), which contain a large number of HTML documents from different input domains (entertainment, politics, sports, medical and business applications). The proposed approach is run on all these data sets and its performance is compared with traditional multi-dimensional clustering approaches, including MultiNMF (multi-view clustering via joint nonnegative matrix factorization with local graph regularization).

b) Setting of Experiments
The traditional approaches are compared with the proposed approach using the parameter settings from their original papers. Samples are drawn at random from the data sources with their labels removed, and the weight parameter is searched over all input data. In our experiments, we use multi-labeled dimensional clustering based on a similarity weight measure with Euclidean distance metrics. With these metrics we evaluate the performance on each data set in terms of accuracy, normalization factor, Jaccard coefficient, precision, recall, F-score, computational cost and memory utilization. Figure 2 shows the multi-label precision of the proposed approach and the traditional approaches in document retrieval. Figure 3 shows recall on different multi-labeled HTML text content.
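The text does not state how precision, recall and F-score are computed for a clustering, so the sketch below uses one common choice, the pair-counting definition, as an assumption: a pair of documents is a true positive when it shares both a predicted cluster and a ground-truth class. The function name `pairwise_scores` and the toy labels are illustrative.

```python
from itertools import combinations

def pairwise_scores(true_labels, pred_labels):
    """Pair-counting precision, recall, F-score and Jaccard coefficient."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        if same_pred and same_true:
            tp += 1          # pair correctly placed together
        elif same_pred:
            fp += 1          # pair wrongly placed together
        elif same_true:
            fn += 1          # pair wrongly separated
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    jaccard = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall,
            "f_score": f_score, "jaccard": jaccard}

truth = [0, 0, 0, 1, 1, 1]
predicted = [0, 0, 1, 1, 1, 1]   # one document assigned to the wrong cluster
scores = pairwise_scores(truth, predicted)
```

Pair counting does not require matching cluster ids to class ids, which is convenient when the number of clusters differs from the number of classes. Clustering accuracy, by contrast, needs a best one-to-one matching between clusters and classes and is not covered by this sketch.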

Figure 3 Performance of clustering results in recall.
Figure 4 shows the F-score, i.e. how well multi-dimensional documents are matched among the HTML text documents. OCMHAMVC gives the best results in comparison with the traditional approaches, and it continues to give efficient clustering results over multiple attributes as the number of documents increases.

Figure 4 Performance of multi-dimensional clustering results in F-score.
Table 3 lists the F-score values of multi-dimensional clustering as the number of multi-labeled text documents increases. Figure 5 shows the accuracy of five different clustering approaches applied to 100-500 HTML text documents with a high proportion of multi-labeled data. The clustering results are reported as accuracy values in Table 4; for each data set (row), the best value is shown in bold and the remaining values are the next-best results of the other approaches.
Table 4 Multi-dimensional clustering results in accuracy.

Figure 6 Performance of time in retrieval of multi-dimensional clustering results
The clustering results are reported as accuracy values in Table 4; for each data set (row), the best value is shown in bold and the remaining values are the next-best results of the other approaches.
100  109  101  88  74  62
200  114  107  97  84  70
300  118  110  93  88  76
400  124  108  95  90  83
500  130  116  101  95  88
600  135  120  109  99  92
Figure 7 shows the computational cost of five different clustering approaches applied to 100-500 HTML text documents with a high proportion of multi-labeled data, under the preferred parameter settings.

Figure 7 Performance of computational cost in processing of CPU usage
The clustering results are reported as CPU usage values in Table 6; for each data set (row), the best value is shown in bold and the remaining values are the next-best results of the other approaches. Figure 7 shows memory utilization for multi-dimensional documents among the HTML text documents. OCMHAMVC gives the best results in comparison with the traditional approaches, and as the number of documents increases it continues to give efficient clustering results, i.e. lower memory utilization when retrieving multi-attribute relational documents from different domains.
Figure 7 Performance evaluation of memory utilization in processing multi labeled data.
Table 3 lists the F-score values of multi-dimensional clustering as the number of multi-labeled text documents increases.