Method for clustering and identification of objects in laser scanning point clouds using dynamic logic

Today there is a gap between the availability of a variety of new equipment on the market that provides streams of digital data about the environment, in particular in the form of laser scanning point clouds, and the lack of adequate, efficient methods and software for extracting information from such data. A solution to the problem of bridging this gap is proposed on the basis of neural modeling field theory and dynamic logic (DL). We present a DL-based method for extracting and analyzing information from hybrid point clouds, which include not only spatial coordinates and intensity but also the color of each point, and which can come from multiple sources, including terrestrial, mobile and airborne laser scanning data. The proposed method is significant for creating a fundamental theoretical basis for new application algorithms and software in many areas, including building information modeling, the "smart city" environment, etc. The method is fairly new to solving various problems related to extracting semantically rich information from a nontraditional type of digital data, especially hybrid point clouds created from laser scanning. It should make it possible to significantly expand the existing boundaries of knowledge in the field of extraction and analysis of information from digital data, because neural modeling field theory and DL can improve the performance of the relevant calculations and close the existing gap in the analysis of digital images.


State Of The Art
Currently, many modern remote sensing technologies, such as laser scanning, generate three-dimensional point clouds [1][2][3]. These point clouds are a specific kind of digital imagery. In general, 3D digital image processing is a resource-intensive task that requires specially developed algorithms. The existing proposed solutions are related, among other things, to cognition modeling and artificial intelligence (AI) [4][5][6]. Algorithms for processing 3D point clouds often run into combinatorial complexity, which greatly complicates the task of classifying and dividing such an image into separate objects and assigning attributes to them [7]. Traditional algorithms based on mathematical logic solve such a recognition problem by enumerating and comparing all known test images (3D point clouds) with the investigated image [8]. It is easy to see that, even with a small set of options for comparison, the number of operations would be gigantic. It should be noted, however, that there have been successes in using such algorithms in combination with each other. For example, methods such as convolutional neural networks, K-means, decision trees, latent Dirichlet allocation, support vector machines, and deep learning are among them [9][10][11][12][13].
The search for possible innovations in 3D image processing showed that methods based on cognition modeling have been successful [14][15][16]. An original approach to cognition modeling based on the theory of neural modeling fields (NMF) and dynamic logic (DL) was developed by L.I. Perlovsky [17][18][19][55]. This approach differs from the usual Aristotelian logic and is intended to overcome the shortcomings of formal logic associated with the combinatorial complexity (CC) of the corresponding algorithms. Instead of well-defined concepts such as "this is a window", DL is a process "from vague to crisp", realizing the causal principle of cognition, for example in object recognition, where the current state of the NMF is determined by the previous state [20]. NMF and DL have been developed based on data from psychology, philosophy and science about the main mechanisms of cognition, as well as on the analysis of the modeling of cognitive functions in AI [17][18][19][20][55]. As a result, Perlovsky's approaches make it possible to obtain new results on the formalization of the cognition process; for example, in [21] a generalization of the theory of NMF and DL in the form of DL phenomena and cognition was obtained. The logical systems used in this case formulate the use of general terms, such as relations of generality, uncertainty, simplicity, teaching methods, and the problem of maximum similarity with empirical content. It should be noted that currently available neural networks are too simple to explain human cognition. At the same time, NMF makes a step towards explaining cognition rather than just striving to improve methods of solving practical problems [22]. Whereas Aristotle's logic is a tool for formalizing previously known solutions, it is not suitable for solving new problems related to cognition. This is related to the lack of adaptability of standard logic and the combinatorial complexity of the corresponding computations.
In this regard, the structure of the NMF is built in accordance with current knowledge and ideas about the organization of the human brain. The universality of cognition and the extraordinary capabilities of the human brain are partially explained by dynamic logic. This paper presents the results of applying these approaches to clustering and classifying digital data in the form of point clouds formed as a result of 3D laser scanning.
No publications directly related to the use of NMF and DL methods for clustering and classifying laser scanning point clouds were found. Therefore, the following is an overview of current research in the field of intelligent methods of extracting information from various digital data that may be relevant to the problem under consideration. The prospects of the NMF/DL method for the problem of clustering and classifying point clouds are also assessed.
In [23][24][25] various methods used in biometrics are considered, with an indication of the methods, the functions used and the data sources. Open problems include the mathematical analysis of "soft" biometrics and statistical modeling, including the processing of big data [23]. For example, the Indian UIDAI project covered more than 1.2 billion subjects; this volume of data required improved accuracy and performance of the biometrics algorithms.
The NMF/DL method makes it possible to create statistical models and work quickly with big data without loss of accuracy, while simultaneously solving problems identified in [24], such as: what to combine, how to combine and when to combine.
The problem of road sign recognition is part of one of the most important challenges of the 21st century: autonomous driving [26][27][28][29][30][31]. Article [28] presents the authors' own methods and an analysis of recent publications, indicating the methods used and assessing recognition success; the authors use a test database of 11,000 signs. Recognition of road signs is important for the tasks of inventorying signs on a country's roads and tracking their condition [29,30], as well as for diagnosing the accessibility of buildings [31]. The main obstacles to effective recognition, as in the case of biometrics, are inhomogeneity in the signs themselves, lighting, and orientation. The NMF/DL method can help solve the processing problem in an acceptable time frame and improve accuracy along with performance when such large road sign databases are used.
Adequate formation of instructions for autonomous driving requires simultaneous real-time detection of all the most important elements of the environment. After recognizing road signs, traffic lights, other vehicles, pedestrians, etc., it is necessary to track their changes and combine them with information from all the vehicle's sensors to form control commands [32,33]. The main difficulties are associated with adequate integration of such a huge volume of data and its analysis in limited time. The "associative" aspect of the NMF/DL method holds great promise for solving this problem.
Controlling the autonomous flight of UAVs (unmanned aerial vehicles) is a technology close to the autonomous driving mentioned above. It is also important for an autonomous UAV to make quick decisions based on the recognition of landscape objects [34][35][36]. UAVs are often used to create 3D city models based on scanning data [37][38][39]. In [40], the authors consider the construction of three-dimensional city models based on LiDAR (Light Detection and Ranging) technology, owing to the possibility of obtaining digital images of high density and high accuracy. The development of sensor technologies and corresponding algorithms makes it possible to efficiently and accurately construct 3D models in the form of point clouds using photogrammetry methods alone, on the basis of 2D images from cameras installed on the UAV [30][41][42][43]. DL algorithms can work with both LiDAR data and 2D images from cameras. In any case, the ability to use DL with LiDAR will be a significant advantage despite the large amount of data, as it provides an easier way to build 3D models. UAV images are often fuzzy and poorly structured due to unstable UAV motion [40], and the "associative" aspect of the NMF/DL method can also effectively help in addressing this problem.
Due to its performance, the NMF/DL method can potentially cope with the huge volume of unstructured, heterogeneous digital image data from the plethora of cameras at airports and railway stations, the processing of which requires radically new approaches [44,45].
Other tasks for which effective analysis of digital images is critical and for which NMF/DL can work effectively include flood impact analysis [46]. For this problem, destroyed objects, especially residential buildings, are important [47,48]. Destroyed buildings can be determined by the presence or absence of walls, modeled by planes [47]. The development of robust and accurate methods for automatic building detection and regularization using multisource data has been discussed in [49].
As stated in all of the above-mentioned publications, any digital image, including a 3D laser scanning point cloud, usually contains various types of noise and clutter, which are a significant obstacle to the operation of many algorithms. When scanning a real environment, such noise is caused by the presence of fog or rain droplets, birds and insects flying by, and debris, dust and leaves carried by the wind.
The NMF/DL method can include a specification of the noise/clutter model that will exclude the corresponding noise points, so that the remaining points can be "cleanly" related to the underlying models.
Experience in the development of methods and algorithms for processing hybrid laser scanning and photography data has shown that the resulting point clouds, free of noise and clutter, are of great practical importance [50][51][52][53]. The main difficulties when working with such clouds are that the initial data stream is very heterogeneous, that only the surfaces of objects are captured in three-dimensional space, and that the point clouds are huge. Thus, an urgent task is to develop fundamentally new methods of processing and analyzing images in the form of laser scanning point clouds. The purpose of this article is to develop methods for this type of data based on the neural modeling fields proposed and constructed by L.I. Perlovsky. The article presents results that, based on the interpretation of a number of features of thought processes [17,54], demonstrate overcoming the "curse of dimensionality" in problems of object recognition in laser scanning point clouds.

Materials And Methods
The method was developed in accordance with the dynamic logic (DL) approach proposed by L.I. Perlovsky.
Let us unravel Perlovsky's theory. In what follows we discuss the equations and main assumptions of the DL method used to build the algorithm. We use the following Gaussian likelihood measure:

l(n|m) = (2\pi)^{-d/2} \, |C_m|^{-1/2} \exp\!\left(-\tfrac{1}{2}\,\bigl(X(n)-M_m\bigr)^{T} C_m^{-1} \bigl(X(n)-M_m\bigr)\right),

where X(n) is the input signal at data point n of N, d is the dimension of X(n), C_m are the covariance matrices for models m = 1, ..., M, and M_m are the coordinates of the center of the m-th model (i.e. cluster), expressed in the same coordinates as X(n). A Gaussian likelihood is used because it usually fits many naturally occurring densities well. Generally, the data point X(n) is a vector consisting of 3D spatial coordinates plus, optionally, intensity (I) or color (RGB), or both, or some other features.
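For illustration, below is a minimal NumPy sketch of evaluating this Gaussian likelihood for every point-model pair. The function name and array shapes are our own assumptions for the sketch and are not part of the published implementation.

import numpy as np

def gaussian_likelihood(X, M, C):
    """Evaluate l(n|m) for every point n and model m.

    X : (N, d) array of data points (e.g. XYZ, optionally plus I and RGB).
    M : (M, d) array of model (cluster) centers.
    C : (M, d, d) array of model covariance matrices.
    Returns an (N, M) array of likelihood values.
    """
    N, d = X.shape
    L = np.empty((N, M.shape[0]))
    for m in range(M.shape[0]):
        diff = X - M[m]                       # (N, d) residuals to center m
        Cinv = np.linalg.inv(C[m])            # inverse covariance of model m
        mahal = np.einsum('nd,de,ne->n', diff, Cinv, diff)  # squared Mahalanobis distances
        norm = (2 * np.pi) ** (-d / 2) * np.linalg.det(C[m]) ** (-0.5)
        L[:, m] = norm * np.exp(-0.5 * mahal)
    return L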
The proposed method (in its basic form) does not use any prior knowledge of the point cloud; therefore, the model centers M_m are initialized randomly so as to fall within the point cloud's boundary, yet not too close to it. The covariances are also initialized randomly and are chosen to be large, on the order of the cloud's size.
Association variables, model rates (priors) and the similarity measure take their standard NMF/DL forms (see the sketch below). The general DL equations from [17][18][19][55] for the models and parameters are replaced by ones that are more convenient and better suited to identification and clustering tasks, taken from [55]. Diagram 1 (see Diagram 1 in the Supplementary Files) presents the iterative nature of the algorithm: each successive iteration uses values calculated in the previous iteration, which explains the cyclical interdependence of the above equations.
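As a reference for the reader, a sketch of the standard NMF/DL expressions for these quantities, together with EM-style parameter updates consistent with the Gaussian likelihood above, is given below in LaTeX notation. These are the usual forms found in Perlovsky's publications and reflect our assumption about the general shape of the equations used; the exact variants adopted from [55] may differ in detail.

% Association variables (responsibility of model m for point n):
f(m|n) = \frac{r_m \, l(n|m)}{\sum_{m'=1}^{M} r_{m'} \, l(n|m')}

% Model rates (priors):
r_m = \frac{1}{N} \sum_{n=1}^{N} f(m|n)

% Total (log) similarity maximized over the iterations:
L = \sum_{n=1}^{N} \ln \sum_{m=1}^{M} r_m \, l(n|m)

% EM-style updates of the model centers and covariances within each iteration:
M_m \leftarrow \frac{\sum_{n} f(m|n) \, X(n)}{\sum_{n} f(m|n)}, \qquad
C_m \leftarrow \frac{\sum_{n} f(m|n) \, \bigl(X(n)-M_m\bigr)\bigl(X(n)-M_m\bigr)^{T}}{\sum_{n} f(m|n)}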
An outline is proposed in [55] of a mechanism for putting superfluous or redundant models to sleep and for waking up or creating new models when more models are needed. This mechanism is still under development, has not so far given entirely satisfactory results, and is not used in our current work. Its development and implementation would be useful in the future, since it would make it possible to automate the estimation of the number of models M, instead of having to preset this number manually upon initialization. For now we manually set this number to 25 (plus 1 cluster for clutter and noise). This number is chosen intuitively. Its value determines the resolution and size of the clusters; hence we choose it with regard to the order of magnitude of the physical size of the objects that we want the algorithm to detect.
Several important tricks are used to significantly improve the speed and precision of the algorithm, such as using a normalized log-likelihood, but they are outside the scope of this article, since we do not intend to describe the algorithm in every detail but only to present its key mechanisms.
At first we used the algorithm in the obvious, natural way, running it once for some fixed number of models, say M = 25. We later discovered a technique that allows for both better results and faster computation while accommodating much larger data sets due to lower memory usage: in the 1st run of the algorithm the entire point cloud is broken down into 5 clusters (plus noise); then, in the 2nd run, each of the 5 non-noise clusters is further broken down into 5, resulting in 25 non-noise clusters in total. This technique, which we named "5x5", can handle much larger data sets with many more models M because, first, during both runs the algorithm has 5x fewer models to handle and, second, during the 2nd run it also uses 5x less input data in each computational branch of the algorithm. Aside from a faster calculation time, this also means we use 5x less computer memory (comparing peak load) with the 5x5 technique.
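The following short Python sketch illustrates the control flow of the 5x5 technique. Here dl_cluster is a hypothetical stand-in for one complete run of the DL clustering described above (returning a label per point, with -1 marking the noise cluster); its interface is an assumption for the sketch, not the published implementation.

import numpy as np

def five_by_five(points, dl_cluster, k=5):
    """Two-stage '5x5' clustering: k clusters first, then k sub-clusters each.

    points     : (N, d) array of point-cloud coordinates.
    dl_cluster : callable (points, n_models) -> labels; label -1 marks noise.
    Returns an (N,) array of final cluster labels (-1 = noise).
    """
    labels_1 = dl_cluster(points, n_models=k)           # 1st run: k clusters + noise
    final = np.full(points.shape[0], -1)
    next_label = 0
    for c in range(k):
        idx = np.where(labels_1 == c)[0]                # points of one 1st-run cluster
        if idx.size == 0:
            continue
        labels_2 = dl_cluster(points[idx], n_models=k)  # 2nd run on that subset only
        for sub in range(k):
            final[idx[labels_2 == sub]] = next_label    # k*k = 25 non-noise clusters overall
            next_label += 1
    return final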

Results And Discussion
Point clouds from several free sources were used. Outdoor laser scanning point clouds of a European village (Bildstein, Austria) were obtained from the Semantic3D database (bildstein_station1_xyz_intensity_rgb). Another point cloud (an indoor scan of an office space) was obtained from ISPRS's case study #5. Our own scans of geometric blocks were used as well.
Computations were run on a workstation with an Intel Xeon E5 CPU and an NVIDIA Quadro P6000 GPU, with the latter relied on for most performance-heavy calculations.
Though LiDAR scans provide XYZIRGB coordinates, we decided to use only the 3 spatial coordinates (XYZ), because color and intensity interfere too heavily with clustering: many changes in color due to shadows, different materials and the angle of incidence do not correspond to meaningful cluster differentiation. We demonstrate this in Fig. 1, where in the original shots the color scale corresponds to intensity (I), while the coloring in the resulting shot corresponds to clustering performed over all 7 of the XYZIRGB coordinates.
As can be seen in Fig. 1, the intensity of a point depends strongly on the angle of incidence of the laser beam, which is useful for clustering and for discerning a roof from a façade (as seen with the booth), but when a façade wall is round, this is not useful. The same is true for the back side of the car on the left of the image (Fig. 1).
Below, in Fig. 2, we present the results of clustering the Bildstein scan into 25 clusters + 1 noise cluster using both techniques from "Materials and Methods": 5x5 and "in one go". Note that light gray represents the noise cluster in all figures.
Instance 1/2 is an intermediate result; it is instance 2/2 of the 5x5 technique that should be examined, and it shows that the 5x5 technique results in slightly better clustering (and is much faster) compared to the "in one go" method. Still, neither method is ideal: for example, parts of trees and buildings cluster together, while those objects themselves are split into different clusters. It must be pointed out that the proposed algorithm is useful only for clustering and not classification, and hence, in its basic form, is not intelligent enough to distinguish objects.
Furthermore, the positioning of the laser scanner has an enormous effect on the clustering. The local density of points is orders of magnitude higher at distances close to the scanner, and this strongly affects the final result of the algorithm, since the weight of each cluster depends heavily on the relative number of points in it. In Fig. 2 we can easily identify where the scanner was placed, as it is surrounded by many small-volume, high-density clusters, and the radial dispersion of laser rays from that center to the edges of the image is also visible.
A density readjustment through downsampling, voxelization or other methods could be performed to resolve this issue. Having multiple scans available from different points of view could also improve the result.
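As an illustration of one possible density readjustment, the NumPy sketch below performs a simple voxel-grid downsampling that keeps one representative point per occupied voxel; the function name and the default voxel size are arbitrary assumptions for the sketch.

import numpy as np

def voxel_downsample(points, voxel_size=0.05):
    """Keep one point per occupied voxel to even out scan density.

    points     : (N, 3) array of XYZ coordinates.
    voxel_size : edge length of the cubic voxel, in the cloud's units.
    Returns the downsampled (K, 3) array.
    """
    # Integer voxel index of every point
    voxel_idx = np.floor(points / voxel_size).astype(np.int64)
    # Index of the first point falling into each distinct voxel
    _, keep = np.unique(voxel_idx, axis=0, return_index=True)
    return points[np.sort(keep)]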
We tried downsampling, as well as removing the ground, both separately and together, but to no avail, which was a big surprise.
Additionally, we suspected that, since the algorithm involves gradient ascent, it might stop at a local maximum rather than the global one. In many cases this is avoided thanks to the vague-to-crisp aspect of the DL approach [17][18][19][55]. Nonetheless, this possibility was tested by running the algorithm 50 times, each time using randomly initialized cluster centers, and comparing the final likelihood function values. We did not notice any strong difference in the results, which illustrates the robustness of our algorithm.
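The restart experiment just described can be sketched as follows in Python; dl_cluster_with_likelihood is a hypothetical wrapper around one full run of the DL algorithm that also returns the final (shifted) log-likelihood, and this interface is our assumption, not the published code.

import numpy as np

def restart_check(points, dl_cluster_with_likelihood, n_runs=50, n_models=26):
    """Run the DL clustering n_runs times with different random initializations
    and compare the final log-likelihood values (hypothetical interface)."""
    values = []
    for seed in range(n_runs):
        rng = np.random.default_rng(seed)               # different random init each run
        _, loglik = dl_cluster_with_likelihood(points, n_models, rng)
        values.append(loglik)
    values = np.asarray(values)
    # A small spread suggests the gradient ascent is not trapped in poor local maxima
    return values.mean(), values.std()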
We also tested our DL-based algorithm on a subset (one room) of an ISPRS point cloud (Case Study #5). Below, in Fig. 3, we demonstrate the results using the 5x5 technique. Noise, as previously, is colored gray.
The most interesting and unexpected aspect of these results is how well the tabletops and several other entities are clustered and separated from other objects, though sometimes the tabletops belong to several different clusters. This can be seen more clearly from different points of view in Fig. 4, where, additionally, the tabletops are manually gathered into one cluster and recolored blue for ease of visualization. The divider between the tables is visualized in similar fashion and is colored white. All other entities, whose clustering was mediocre, are removed from the visualization for clarity.
A comparison of the clustering results for the outdoor (Bildstein) and indoor (ISPRS) scans shows that the DL algorithm works better on indoor data, where objects are separated better. There is less chaotic splitting and overlapping, e.g., when half a tree and half a building are clustered together while the remainders of these objects land in other clusters. Splitting, however, is better than overlapping, as it can be corrected by merging clusters, as was done with the tabletop presented above.
Relative to other methods, the quick computation time of the proposed algorithm is one of the main advantages of using DL. For example, when clustering the 29.7 million points of the Bildstein point cloud over the 3 spatial dimensions, with the number of models (clusters) preset to M = 25 + 1, one iteration of the main loop of the algorithm is computed in just 34 seconds on our computer configuration. Moreover, setting the maximum number of main loop iterations to 10-20 generally achieves a decent result, but based on our experience we recommend 40-50. Halving the number of iterations from 40 to 20 halves the computation time at the cost of a loss in the shifted log-likelihood of around 4%. Similarly, the 5x5 technique (restricted to 40 iterations per branch/instance) takes 512 seconds on our computer. Dividing this by 40, we obtain just 12.8 seconds per iteration, which is about 2.7x faster than the 34 seconds of the first technique.
According to DL theory, the computation time for a DL-based algorithm should be linear in the number of points in the cloud, as well as in the number of parameters and clusters. However, our test computations seem to give better results: sublinear time in all three numbers. This may be explained by the use of a graphics processing unit (GPU) on our computer, without which our computation times were linear in the number of points, clusters and parameters. It is possible that computation time is in fact sublinear with a GPU only up to a certain data and model size, beyond which it tends towards an asymptote, i.e. becomes linear. This is difficult to verify rigorously, however, because the GPU would need to be utilized in several capacities.

Comparison With Other Methods
It is important to compare our results to those of other algorithms with similar goals, e.g. K-means clustering and MeanShift. We used MATLAB's built-in K-means algorithm, while for MeanShift we used a publicly available implementation by Bryan Feldman from 2006, which, notably, could be improved significantly, as the code is suboptimal. In addition, transferring the computation of both of these algorithms to the GPU would speed up the calculations significantly. Nonetheless, for an initial comparison we used these algorithms as-is. We use the ≈30-million-point Bildstein outdoor scan with just the XYZ coordinates.
K-means was tested with both the "5x5" and "in one go" techniques, analogously to how we tested our algorithm. The tests' runtimes were 7.3 and 11.8 minutes, respectively. Both times are similar to those of the analogous DL computations. However, in both cases the computations failed to converge in 100 iterations, which not only implies that the result will be subpar but also suggests that such an algorithm is not suited to this kind of input data. Indeed, a plot of the results shows that the resulting clustering is entirely unsuitable, because the border lines are too straight and the clusters do not at all correspond to real-life objects (see Figure 5). Repeating the test with RGB color and intensity did not give better results.
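For readers who wish to reproduce a similar baseline in Python rather than MATLAB, a comparable K-means test can be sketched with scikit-learn as below; the library choice and parameter values are our own assumptions and mirror, rather than reproduce, the MATLAB setup described above.

import numpy as np
from sklearn.cluster import KMeans

def kmeans_baseline(points_xyz, n_clusters=25, max_iter=100, seed=0):
    """Cluster the XYZ coordinates with plain K-means, as a comparison baseline.

    points_xyz : (N, 3) array of spatial coordinates only.
    Returns an (N,) array of cluster labels.
    """
    km = KMeans(n_clusters=n_clusters, max_iter=max_iter,
                n_init=1, random_state=seed)
    return km.fit_predict(points_xyz)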
For MeanShift we set the bandwidth h = 12, which gives 35 clusters. The runtime was 9.5 minutes, which is very favorable (comparable with DL), considering the suboptimal code. However, the results are useless (see Figure 6). Another disadvantage of the aforementioned methods is that they do not, to our knowledge, allow for a noise/clutter cluster. Regardless, allowing for such a cluster would not noticeably improve the results.

Conclusions
The results of applying the DL-based clustering method and algorithm we developed to laser scanning point clouds are described. Two examples were used for testing: scans of a village street and of the inside of an office space. Testing of the algorithm showed that it is able to work quickly with clouds millions of points in size. The best results were obtained by taking into account only the 3 spatial coordinates, i.e. ignoring the color and intensity of the reflected laser beams. However, these parameters could be useful in other cases, for example for scans of historical heritage sites and agricultural fields. Two clustering approaches have been tested. In the first approach, segmentation is carried out simultaneously for 25 models (plus noise) in one run of the algorithm (the "in one go" approach). In the second approach we use the so-called 5x5 technique, where the point cloud is also clustered into 25 models + noise; however, two consecutive runs of the algorithm are needed. During the first run an initial segmentation into 5 clusters + noise is performed. During the second run each of the 5 initial non-noise clusters is divided into a further 5. Testing with the original Bildstein point cloud as input shows that the 5x5 technique allows clustering to be performed 2.7 times faster than the "in one go" approach while preserving (or even improving on) the quality of the result. The 5x5 DL technique clusters the 29.7 million points of the Bildstein scan in only about 8 minutes.
The above-adequate segmentation of the dividers and tabletops in the ISPRS office space scan stands out particularly, giving a better result than with Bildstein. DL-based clustering results in point clouds being divided into clusters that make sense and clusters that do not. To overcome this disadvantage and improve the result, additional approaches were tested: downsampling, removing the ground, and a combination of the two. Surprisingly, this showed no improvement. The possibility that a non-global local maximum is reached during gradient ascent was also tested, yet our experiments refute these doubts with high certainty: the algorithm was run 50 times with arbitrary initializations, with approximately the same final likelihood resulting each time.

Ethics approval
We confirm that any aspect of the work covered in this manuscript that has involved either experimental animals or human patients has been conducted with the ethical approval of all relevant bodies.

Funding
The research is partially funded by the Ministry of Science and Higher Education of the Russian Federation as part of the World-class Research Center program: Advanced Digital Technologies (contract No. 075-15-2020-934 dated 17.11.2020).

Author information

Affiliations
Yevgeny Milanov, Vladimir Badenko, Vladimir Yadykin: Peter the Great St. Petersburg Polytechnic University, Polytechnicheskaya 29, 195251 St. Petersburg, Russia; e-mails: calc881@hotmail.com, vbadenko@gmail.com, v.yadikin@gmail.com
Leonid Perlovsky: Northeastern University, Boston, Massachusetts, United States; e-mail: lperl@rcn.com

Contributions
Yevgeny Milanov: algorithm and software development, computer experiments, draft version editing; Vladimir Badenko: research supervision, final editing and testing; Vladimir Yadykin: data acquisition for computer experiments, interpretation of results, algorithm development; Leonid Perlovsky: main idea, key algorithm, final editing. All authors have read the article, partially edited it and agree with its content.

Corresponding author: Vladimir Badenko
Correspondence to: vbadenko@gmail.com

Ethics declarations

Figures

Figure 1. Comparison.

Figure 2. Results.

Figure 3. Result.