High-order nonnegative blind source separation based on edge features

The nonnegative blind source separation (NBSS) algorithm based on minimum volume simplex (MVS) criterion is excessively dependent on the shape of the mixture scatterplot, resulting in the situation in which the MVS-based algorithm may have no solution. In this paper, we propose a new noiseless NBSS model and introduce edge features into high-order determined NBSS. The edge feature-based NBSS model can compensate for the shortcomings of the MVS-based algorithm. The twice projections (TP) algorithm is designed to replace the existing clustering and regression algorithms; moreover, TP prevents the aliasing phenomenon caused by direct projection and effectively reduces the time complexity of dimension reduction. We search for the coordinates of the density maximum points in 2-D space, and gradually merge them into high-dimensional coordinates, which greatly reduces the complexity of the algorithm. Furthermore, we only make boundedness and nonnegativity assumptions about the source, which makes the algorithm more widely applicable.


Introduction
Currently, blind source separation (BSS) is widely used in image processing [14], video processing [12], speech recognition [3,6], biological signal processing [11] and hyperspectral unmixing [7,15]. BSS refers to the process of separating sources from mixtures observed by a sensor. Since little is known about the sources and the mixing process, this separation process is called "blind". In general, the BSS mixture model in the noiseless case is X = AS (1) where X = [x 1 · · · x m ] T is the mixture, which is composed of m k-length components (m is the number of sensors and k B is the number of samples), and S = [s 1 · · · s n ] T is the source, which is composed of n k-length components (n is the number of sources), A ∈ R (m×n) is mixing matrix (1). The model is similar to a first-order equation in form. Moreover, due to the different relationship between the number of sources and mixtures, BSS can also be divided into determined (n = m), underdetermined (n < m) and overdetermined (n > m) problems, and the determined case is the basis. However, BSS is difficult to solve, because in addition to the unknown sources, the mixing proportion is also unknown. Therefore, BSS generally makes a certain assumption first, and the applicability of the assumption determines the applicability of BSS based on this assumption. The early proposed independent component analysis (ICA) models are based on the of independence assumption of sources. ICA adopts statistical methods and gradually forms a complete theoretical system and algorithm, including kurtosis, negentropy, the minimization of mutual information, maximum likelihood estimation, projection pursuit, and FastICA [5]. If a signal is sparse, that is, most of the source samples are zero and a few are nonzero, BSS is transformed into sparse component analysis (SCA). At present, SCA is still the key to solving underdetermined BSS problems [13]. When a signal is assumed to be bounded, BSS is transformed into bounded component analysis (BCA) [4]. At present, BCA research is still in the initial stage, and a theoretical system has not been formed. In practical applications, NBSS, a branch of BSS, has been gradually derived due to the wide existence of nonnegative signals. In recent years, NBSS has gradually become a research hotspot. The commonly used methods of NBSS include nonnegative matrix factorization (NMF) [9] and geometric methods. NMF estimates A and S by minimizing the KL divergence on both sides of (1). The disadvantage of NMF is that the solution is not unique, so NMF usually applies some assumptions.
The geometric method is an effective method for NBSS. The geometric method uses the scatterplot of mixtures (the coordinate component of each scatter point is formed by the sampling of different sensors at the same time), and the mixtures are obtained by sensors. The first geometric method assumes that the scatterplot is a parallelogram, and uses the parallelogram to estimate A, where the columns of A correspond to the slopes of the sides of the parallelogram [10]. The second geometric method is based on the minimum volume simplex (MVS) criterion [2,7,8], which is widely used at present. The MVS-based method finds the MVS containing the scatterplot, and uses the vertices of this simplex to obtain A. However, the MVS-based method has certain requirements for the shape of the scatterplot, which limits the adaptability. Figure 1 shows the sources used in this paper, and Fig. 1a, b is selected to construct a 2nd-order mixing system, which shows that the MVS-based method has errors due to the special shape of the scatterplot. Figure 2a shows the mixture scatterplot of the mixing system, (b) shows the mixture scatterplot marked with convex hull, (c) shows the minimum volume simplex, and (d) shows the simplex corresponding to the optimal estimation. In this mixing system The coordinates of each scatter in the mixture scatterplot are (x 1 (i), x 2 (i)), i = 1, · · · , k, where k is number of samples. As shown in Fig. 2, in this example, the MVS-based method has a large error. The reason of this error is that the MVS-based algorithm only pays attention to the outermost points of the scatterplot, ignoring most of the interior scatter points. When the sources are rich in information, there are sufficient scatter points on the convex hull, and the MVS-based algorithm is effective. When the source is very simple as in the above example, the shape of the convex hull is greatly affected by individual scatter points; that is, the shape of the convex hull becomes random, and there are large errors in the results. In addition, the MVS-based method also has some disadvantages, such as the difficulty of dimensionality reduction. We introduce edge features into NBSS, which can effectively solve the above shortcomings of the MVS-based method.

Related work
In our previous work [16], we studied 2nd-order NBSS problems based on edge features. We extracted the edge features of the mixtures and obtained the edge feature scatterplot (the edge feature is the arithmetic square root of the square sum of the horizontal edge feature and the vertical edge feature). The edge feature scatterplot is very similar to the mixing speech scatterplot. We made slope statistics on the scatter points of the edge feature and obtained the mixing matrix through the slope with the highest frequency. Figure 3 shows the 2ndorder experimental results of our method. The sources used are still the two images mentioned in Fig. 1a, b. Figure 3b is the edge feature scatterplot, and Fig. 3b is the edge fea- where λ i (i = 1, 2) plays the role of adjusting the signal amplitude and does not affect the thoroughness of separation. It can be determined by a priori probability or actual demand. For the 3rd-order mixing system, the 3rd-order edge feature scatterplot can be projected onto the XOY, XOZ and YOZ coordinate planes for dimension reduction, and then the 2nd-order edge feature scatter point slope distribution can be obtained. Finally the 3-order mixing matrix can be estimated. However, the edge feature scatterplot of the 3rd-order mixing system may appear aliasing in the process of projection. This aliased phenomenon means that the two clusters of the edge feature scatterplot are projected to the same position on the 2-D coordinate plane. Figure 4a shows the 3rd-order edge feather scatterplot, and Fig. 4b shows the aliasing of the projection on the XOY plane. Although this is not an insurmountable problem, it has an adverse impact on the algorithm.

Twice projection
Projection is a feasible and effective method to reduce dimension. For the projection of edge feature scatter points of a high-order mixing system, we must pay attention to the following matters.
(i) There should be no aliasing after projection.
(ii) After projection, the complexity of the algorithm decreases significantly. (iii) The projection algorithm is applicable to mixing systems of different orders. Based on the above idea, we adopt the twice projection algorithm. First, the scatter points of the edge feature are projected onto the plane X + Y + Z = 1 (for the convenience of description, we collectively refer to this plane as M) and then projected onto the three coordinate planes XOY, XOZ and YOZ, and good results are achieved.  Figure 5a shows the edge feature scatterplot. It can be seen that the edge feature scatter points are divided into three clusters, and each cluster corresponds to a column of the mixing matrix. Figure 5b shows the projection on plane X + Y + Z = 1. It can be observed that the projection contains a 2-D simplex, It has three vertices, and each vertex corresponds to a column of the mixing matrix. Figure 6 shows the two projections onto the coordinate plane after projecting onto plane M, (a) on XOY plane, (b) onto XOZ plane, and (c) onto YOZ plane. It can be observed from Fig. 6 that the shape of these projections is still a 2-D simplex.
Any scatter point, can be projected onto plane M : X + Y + Z = 1 with the following formula.   Fig. 3c, to complete the estimation of the mixing matrix of NBSS. For example, the coordinates of the three vertices in Fig. 6a are (x 1 , y 1 ),(x 2 , y 2 ), and(x 3 , y 3 ),and those in Fig. 6b are and (x 3 , z 3 ). Then the coordinates of the vertices of a simplex on plane M are (x i , y i , z i ), i = 1, 2, 3. If the coordinates of two vertices are close in value, the projection on the third coordinate plane can be used for verification. From the analyses, it can be concluded that the final problem of NBSS is to obtain the coordinates of some key points projections on the coordinate plane.

Density distribution of the projection on the coordinate plane
Since the core of NBSS is to obtain the coordinates of key points projections on the coordinate plane, it is necessary to study the distribution of these projections. In Fig. 7, we demonstrate the density distribution projected onto the Y O Z plane. In Fig. 7a shows a color map of the projection density, (b) show a contour map of density, and (c) is the 3D map of the density distribution. From Fig. 7, it can be clearly observed that there are three density extreme points in the projection. Finding the coordinates of these three extreme points can obtain the optimal estimation of the mixing matrix.

Search strategy of extreme points
To determine the density extreme points on the coordinate plane, the DBSCAN algorithm [1] can be used, but the DBSCAN algorithm is very dependent on the "distance threshold" parameter setting, and the time complexity of the DBSCAN algorithm is relatively large. In addition to density clustering, the regional block statistical algorithm can be  used to divide the coordinate plane into several small regions, and the number of points in these regions can be counted to achieve the purpose, but the time complexity of this method is relatively large. Considering the characteristics of the density distribution, we employ the algorithms to determine the statistical abscissa, ordinate and slope distribution. Figure 8 shows the abscissa, ordinate and slope distribution of Fig. 7. We calculate the density extreme points coordinates on the three coordinate planes in the example. There are (0.  Fig. 9 shows the sources estimation. The normalized mean square error is

Expansion of the algorithm to high-order mixing systems
For an n-order determined (i.e.n = m) mixing system, (1) can be rewritten as The goal of the algorithm is to estimate the mixing matrix A; then, the estimations of the sources

Time complexity of the algorithm
The algorithm consumes the most time while searching for the coordinates of density extreme points on the coordinate plane, and the further time is consumed while merging coordinates. When n ≥ 3, the time complexity is O(C 3 n × k), where C 3 n represents the number of combinations of three options from the n options. When n = 2, the time complexity is O(k), where k is the number of mixture signal samples.

Experiment
We perform confirmation experiments for 4th-order, 5thorder and higher-order cases. Due to the limited length of the article, we list the 4th-order and 5th-order results to illustrate the process of the algorithm.

Example of 4th-order NBSS
Let a 4th-order mixing system be denoted as where the sources are depicted in Fig. 1a, b, c and d. First, the edge of each component of the mixtures T is extracted to obtain the edge feature E = e 1 e 2 e 3 e 4 T , and then three components, e 1 , e 2 and e 3 , are selected and projected onto the plane M : Figure 10a shows the scatterplot of edge features, four clusters can be observed, and (b) shows the projection onto plane M. Then, the projection onto M is projected onto the coordinate planes x 1 Ox 2 , x 1 Ox 3 and x 2 Ox 3 , and the coordinates of the density extreme points are obtained. Figure 11 is a contour map of the density distribution on each coordinate plane. It can be observed that its density distribution is similar to the distribution in 3rd-order case.

Example of 5th-order NBSS
For 5th-order or higher-order mixing systems, the algorithm is similar to that for 4th-order NBSS. Let a 5th-order mixing system be  Figure 14 shows When λ 1 = 1, λ 2 = 1, λ 3 = 1, λ 4 = 0.7, and λ 5 = 1, Through the 4th-order and 5th-order examples, it can be found that the computational complexity of the algorithm does not increase significantly with the increase in the order of the mixing system, and the algorithm can realize parallel operation.
For higher-order cases, the algorithm has no obvious differences.

Shortcomings of the algorithm
The proposed algorithm is greatly affected by the source, which is also a challenge for 2nd-order mixing systems [16]; if the source in the example in Sect. 4.1 is defined as a source shown in Fig. 1a, b, c and f, the influence of the source on the algorithm is apparent. For the numerical NBSS method based on edge features, the influence of the source is much smaller, which is mainly due to the use of the Jaccard index [16]. The formula of the n-order Jaccard index is where,ŝ i B , (i = 1, 2, · · · , n) is the binary edge feature of the source estimation, and the operator "·" is the inner product operation. The difficulty of using the higher-order Jaccard index is to reduce the order, which will be discussed in other papers.

Conclusion
Our algorithm is based on edge features, so it can use more mixed scatter points than MVS-based algorithms which only use scatter points on convex hulls. We use the twice projection algorithm to effectively avoid the aliasing of projections and to improve the robustness of the algorithm. Table 1(appendix) compares some common BSS algorithms with our algorithms. It can be seen that our algorithm has certain advantages. In the future, we will further study noisy high-order NBSS problems based on edge features.