The Role of Non-R&D Expenditures in Promoting Innovation in Europe

In this article we estimate the value of “Non-R&D Innovation Expenditures” in Europe. We use data from the European Innovation Scoreboard-EIS of the European Commission for the period 2010-2019. We test the data with the following econometric models: Pooled OLS, Dynamic Panel, Panel Data with Fixed Effects, Panel Data with Random Effects, and WLS. We find that “Non-R&D Innovation Expenditures” is positively associated with, among others, “Innovation Index” and “Firm Investments”, and negatively associated with, among others, “Human Resources” and “Government Procurement of Advanced Technology Products”. We apply the k-Means algorithm with both the Silhouette Coefficient and the Elbow Method, compare the results with a network analysis optimized with the Manhattan distance, and find that the optimal number of clusters is four. Furthermore, we compare eight machine learning algorithms for predicting the level of “Non-R&D Innovation Expenditures” with both Original Data-OD and Augmented Data-AD. We find that Gradient Boosted Trees Regression is the best predictor for OD, while Tree Ensemble Regression is the best predictor for AD. Finally, we verify that the prediction with AD is more efficient than that with OD, with a reduction in the average value of statistical errors equal to 40.50%.


Introduction-Research Question
The non-R&D Expenditures variable in European countries is analyzed below. This variable captures the type of technological innovation that is achieved without investing in research and development. In fact, part of the expenditure on innovation does not involve research and development, as for example in the case of investments in machinery and equipment, or the acquisition of patents and licenses. Through this expenditure, however, it is possible to capture the diffusion of new technologies and new ideas in production systems. In this regard, we propose a distinction between strong technological innovation, i.e. innovation built on research and development, and weak technological innovation, i.e. innovation obtained through non-R&D expenditures. The differences between the two types of innovation are significant and concern not only the products and services generated by the innovation but also the socio-economic dimension of technological innovation. On the one hand, strong technological innovation is achieved by large companies that can invest in the construction of large research centers; on the other hand, weak technological innovation is achieved in small and medium-sized enterprises that lack the financial resources to invest directly in research and development. However, there is a close link between strong and weak technological innovation, at least within countries that have intermediate levels of technological innovation. In these countries, which are not champions in technological innovation, the two forms of innovation interpenetrate, complement, and reinforce each other.
This two-way relationship arises because non-R&D expenditure requires the positive externalities generated by investment in R&D. These externalities create a climate favorable to technological innovation, which in turn produces forms of open innovation, collaboration, and cooperation between organizations, companies, and institutions. However, in countries that invest heavily in research and development, too great a gap is created between the large and medium-sized manufacturing companies that export high-tech products and the small and medium-sized enterprises. In this case, small and medium-sized enterprises have difficulty in carrying out non-R&D activities and lack the ability to capture the positive externalities deriving from the R&D investments of large industrial companies. In any case, regardless of the level of technological innovation, there is still a role for non-R&D expenditures: they allow small and medium-sized enterprises to interpret the innovations present in the market and introduce them into their organizational structure, generating a growth in competitiveness that is often due not to product innovation but to process innovation.
The article continues as follows: in the second paragraph a synthetic analysis of the scientific literature is presented, in the third paragraph the results of the econometric analysis are indicated and discussed, the fourth paragraph analyzes the results of clustering with the k-Means algorithm, the fifth paragraph proposes a network analysis with adjustment and optimization of clustering results with k-Means algorithm, the sixth paragraph contains the comparison between eight machine learning algorithms for the prediction of non-R&D expenditures with Original-Data, the seventh paragraph shows the results of the comparison between eight machine learning algorithms trained with Augmented Data-AD, the eighth paragraph concludes.

Literature Review
An analysis of some articles in the literature considered valid for framing the question of the role of non-R&D expenditures in the context of national innovation systems is briefly presented below.
[1] consider the role of government research subsidies, non-research technological innovation, R&D investments, and the performance of listed pharmaceutical companies in China in the period between 2009 and 2015. The results show that government subsidies have a positive impact on research and development while having zero impact on technological innovation. On the contrary, investments in research and development have a positive impact on technological innovation not deriving from research. The authors verify that the positive impact of R&D investments on technological innovation persists in both public and private enterprises. [2] address the question of the role of non-R&D innovation activities within the Total Factor Productivity-TFP. The authors propose a model that considers both the effects of R&D activities and the effects of non-R&D activities on firm productivity levels. The results of the analysis, calculated for the EU-26, show that the distinction between R&D and non-R&D is very relevant for the calculation of Total Factor Productivity. [3] analyze the relationship between investment in non-R&D innovation activities and regional innovation performance in China. The authors verify the existence of a negative relationship between the value of investment in non-R&D innovation activities and the value of innovation performance, both in Chinese regions with a high level of innovation and in Chinese regions with low levels of technological innovation. The authors suggest that the negative impact of non-R&D innovation activities on technological innovation is minimal only for those regions that have an intermediate orientation towards technological innovation. [4] criticize European economic policies based on growth in investment in research and development.
In fact, although the idea of increasing expenditure on research and development as a percentage of GDP to 3% is laudable, there are also doubts about the effective effectiveness of this economic policy. In fact, research and development does not depend only on the growth of the value of investments but rather also on a set of contextual elements such as cooperation between companies and research centers and on innovation that does not derive from research and development. The authors therefore believe that it is necessary to reformulate the economic policies of research and development to also grasp the contextual and environmental elements with a role also for non-R&D innovation. [5] consider the role of Non-R&D Innovation activities in the development of technological innovations and competitiveness for German small and medium-sized enterprises. [6] analyze the characteristics of research and development spending in some ASEAN countries, namely: Indonesia, Laos, Thailand, the Philippines, and Vietnam. The authors divided the companies analyzed into two groups, namely: companies that invest in research and development activities and companies that invest in Non-R&D Innovation activities. The results of the analysis show the differences between companies that focus on research and development and companies that invest in non-R&D innovation activities. On the one hand, the companies that invest in Research and Development constitute cross-functional teams of production, engineering, marketing, and IT, while the group that invests in non-R&D innovation activities invests in the development of human resources and quality certifications. Through this analysis, the authors verify that there are many Asian SMEs that can produce innovation without having research and development-oriented departments. [7] consider the impact of non-R&D spending on Chinese state-owned industrial enterprises through a regional panel analysis. 
The authors calculate the impact of the non-R&D innovation activities of state-owned industrial enterprises on the development of Total Factor Productivity in Chinese regions. The results show that the Non-R&D Innovation Expenditures of state-owned enterprises in China have a positive impact on economic growth. [8] focuses on the role of non-R&D innovation expenditures in the development of technological innovation in India from a neo-Schumpeterian perspective. The analysis considers the period between 1981 and 2017. The results show that:
• innovation in manufacturing companies depends on factors that go beyond investment in Research and Development;
• Research and Development alone cannot explain the innovation processes in newly industrialized countries such as India;
• national innovation systems play a decisive role in the development of technological innovation at the country level.
Finally, in the breakdown of the effects that promote non-R&D innovation, the author verifies the predominant role of human capital and remittances from abroad. [9] verify the existence of a positive role of non-R&D innovation expenditures in promoting innovation in European NUTS2 regions. [10] compares the performance and characteristics of technological innovation of firms investing in R&D versus firms investing in non-R&D innovation. The analysis is conducted on a sample of 1392 Chinese manufacturing companies. The results show that companies producing non-R&D innovation rely on the knowledge already present in the company, while companies investing in R&D access external networks through scientific collaborations and feedback from suppliers.
Furthermore, companies that carry out non-R&D innovation tend to use the substitutability of external and internal factors to produce process innovation, while companies oriented towards R&D tend to substitute internal and external resources applied to product innovation. [11] examine the effect of R&D investments and of investments by foreign companies on firms that do not carry out research and development activities, considering three regions of Belgium between 2000 and 2017. The results show that companies that invest in research and development can produce positive effects on the innovative activity even of companies that have no R&D investments. On the other hand, the impact of foreign direct investments on the innovativeness of non-R&D firms is limited. The authors conclude with suggestions for regional innovation policy, proposing interventions that vary the methods for promoting spillover effects according to regional characteristics. [12] consider the role of non-R&D intangible capital in determining the Total Factor Productivity-TFP. The econometric analysis is conducted for 13 developed countries in the period between 1995 and 2010. The results show that the growth of non-R&D intangible capital has a positive effect on the growth of Total Factor Productivity. Furthermore, the results show that non-R&D intangible capital has a positive spillover effect on various industries and that the growth of R&D and ICT has a positive impact in determining non-R&D intangible capital. [13] show that SMEs can grow in innovation through non-R&D activities even in a context of high-performing firms that invest in human capital with STEM characteristics. The authors use data from German SMEs.
[14] consider that while the differences in terms of technological innovation between Eastern European countries and other European countries are significant, there are some variables such as non-R&D Innovation Expenditures, for which the same areas converge. [15] consider the role of non-R&D activities in developing innovative performance in a set of 329 Chinese small and medium-sized enterprises operating in the manufacturing sector. The authors break down non-R&D activities into three components: "technology adoption", "imitation and minor modification", and "innovative marketing". The authors verify that absorptive capacity has a positive impact in positively linking "innovative marketing" with innovation performance. [16] emphasize the positive role that investing in non-R&D activities has in promoting intangible capital. [17] highlight the role of fiscal policies, taxation, and incentives as a tool for promoting research and development within the European Union countries. The authors verify that where there are important fiscal policies in favor of research and development, investment in non-R&D innovation is significantly reduced.

The Econometric Models for the Estimation of the Non-R&D Innovation Expenditures
In the following analysis we present an econometric model for estimating non-R&D Innovation Expenditures in Europe. The data are obtained from the European Innovation Scoreboard-EIS database of the European Union for 36 countries in the period 2010-2019. The data were analyzed using a set of econometric models, namely Pooled OLS, WLS, Panel Data with Fixed Effects, Panel Data with Random Effects and Dynamic Panel at 1 Stage. Specifically, the following equation was estimated:
The value of Non-R&D Innovation Expenditures is positively associated with:
• Innovation Index: is a variable that considers the overall value of performance in terms of technological innovation at the country level. The indicator takes into consideration all the institutional, financial, entrepreneurial, and human capital elements that may have some impact on technological innovation. There is a positive relationship between the value of the innovation index and the value of non-R&D expenditures, since generally the countries that invest more in technological innovations, whether or not they are supported by research and development, have the best performances in terms of innovation [18].
• Firm Investments: is a variable that considers three indicators, namely investments in R&D, investments in non-R&D expenditures and the ability of companies to increase the IT skills of their employees. There is a positive relationship between this variable and the value of non-R&D expenditures, both because non-R&D expenditure is itself one of the indicators that make up this variable, and because many small and medium-sized enterprises cannot afford to create research and development departments and therefore must implement forms of non-R&D activity [19].
• Foreign-Controlled Enterprises-Share of Value Added: is a variable that considers the added value of foreign-controlled companies in millions of euros.
Companies operating in the financial sector are excluded from the indicator. Foreign-controlled companies are companies that have their headquarters in another country other than the one considered. There is a positive relationship between the presence of foreign-controlled companies and non-R & D expenditure. This relationship can be understood considering that foreign-controlled companies are generally medium-large companies that invest significantly in research and development. In other words, they are companies that contribute significantly to the construction of an institutional, social, and economic environment favorable to scientific research. On the contrary, the companies that make non-R&D expenditure are small and medium-sized enterprises which however take advantage of the spillovers generated by large foreign-controlled companies. It is therefore a question of that systemic, environmental and contamination effect that allows the strengthening of national systems of technological innovation. • New Doctorate Graduates: is an indicator that measures the supply of new second level graduates in all training fields. The indicator also captures the number of PhD students for most countries.
There is a positive relationship between the supply of new graduates and PhD students and Non-R&D Expenditures. This relationship stems from the connection between the value of human capital and the innovations resulting from non-R&D expenditures. However, this relationship operates indirectly rather than directly: human capital has a positive impact on R&D and through this path creates the conditions to increase the value of non-R&D expenditures. In fact, non-R&D expenditure is a context variable that is valued above all where technological innovation systems are significantly oriented towards R&D [20].
[Regression table row: Tertiary education, A53: 1.0575***, 0.8518***, 0.8418***, 0.7999***, 0.9049***, 0.8912]

Non-R&D innovation expenditure
The value of Non-R&D Innovation Expenditures is negatively associated with: • Finance and Support: is a variable consisting of two sub-variables, namely "R&D expenditure in the public sector" and "Venture capital expenditures". That is, it is a variable that sums up the forms of public or private funding for research and development. There is a negative relationship between the development of a financial system aimed at supporting research and development and non-R&D expenditures. This relationship can be understood considering that expenditure on technological innovation can be divided into two different types, i.e. weak innovation which is represented by non-R&D expenditures and strong innovation constituted instead by investment in R&D. Obviously, countries that have more advanced financial systems, that have more efficient financial markets and larger companies on average tend to support strong technological innovation achieved with R&D rather than weak technological innovation achieved with non-R&D expenditures [22]. • International Scientific Co-Publications: is an indicator that considers the number of scientific publications that have at least one foreign co-author. These publications are considered as a proxy of the quality of scientific research. There is a negative relationship between the value of international scientific publications and the value of non-R&D expenditure. This relationship can be better understood considering that while international scientific publications are produced by large universities or large research centers, non-R&D expenditure is typical of SMEs. And generally, in countries where there are large universities and large research centers, the system tends to be oriented more towards strong technological innovation -R&D and less towards weak technological innovation, non-R&D expenditures [23]. 
• R&D Expenditure Business Sector: is an indicator that considers the overall expenditure on R&D in the whole private sector relative to the Gross Domestic Product. It is therefore an indicator that captures the creation of new knowledge by companies. This indicator is particularly useful in those industrial sectors that are significantly oriented towards scientific research, i.e. the pharmaceutical sector, the chemical sector and electronics. There is a negative relationship between the value of private sector R&D spending as a percentage of GDP and the value of non-R&D expenditures. As underlined in the previous points, non-R&D expenditure is typically associated with the presence of small and medium-sized enterprises, while business R&D investment is typical of large and very large industrial organizations [24].
• Enterprises Providing ICT Training: is an indicator that considers the number of companies that have developed the ICT skills of their employees relative to the total number of companies. ICT skills are essential for the development of the digital economy in the knowledge and information economy. Furthermore, this indicator is considered in the EIS-European Innovation Scoreboard as a proxy of the ability of companies to improve the skills of employees in a broad sense. There is a negative relationship between the ability of companies to develop employees' ICT skills and the overall value of non-R&D Expenditures. This negative relationship arises because employees' ICT skills generally support strong technological innovation, i.e. the type of innovation that emanates directly from R&D [25].
• Medium and High-Tech Product Exports: is an indicator that considers the technological competitiveness of the EU, i.e. the ability to commercialize products and services that are the result of investment in research and development.
This value makes it possible to enhance the new technologies that are considered vital for the competitiveness of the countries. Medium- and high-tech products are essential for economic development, for the growth of productivity and well-being, and for the development of well-paid employment. There is a negative relationship between the export value of medium and high technology products and the value of non-R&D expenditures. This negative relationship can be understood considering that countries that export medium and high technology products invest more in R&D than in non-R&D activities [26].
• Most-Cited Publications: is the number of scientific publications that fall into the top 10% in terms of citations worldwide out of the total number of scientific publications. This indicator is considered a measure of the efficiency of research systems, as the most cited scientific publications are of a high standard. There is a negative relationship between the value of the most cited scientific publications and the value of non-R&D expenditure at country level. This relationship can be better understood considering that in countries where scientific research systems are more competitive there is also an orientation towards R&D rather than non-R&D activities. As already pointed out, non-R&D activities are typical of small and medium-sized enterprises, which generally do not have research and development departments and therefore are unable to use the research outputs of top scientists or top universities.
• Intellectual Assets: is a variable made up of three sub-variables, namely "PCT patent Applications", "Trademark Applications", "Design Applications". There is a negative relationship between the "Intellectual Assets" variable and the value of non-R&D Expenditures.
This negative relationship can be better understood considering that intellectual assets are generally produced as a result of investment in the strong kind of technological innovation that relates to R&D. The only exception is "Trademark Applications", which taken alone has a positive relationship with non-R&D expenditures. However, if "Trademark Applications" is added to the other forms of intellectual assets, the overall value is negatively associated with the value of non-R&D expenditures [27].
• Human Resources: is a variable made up of the sum of three sub-variables, namely: "New Doctorate Graduates", "Population aged 25-34 with tertiary education" and "Lifelong learning". There is a negative relationship between the value of Human Resources and the value of non-R&D Expenditures. This result is counterintuitive and seems to contradict the analysis of the previous points. In fact, if the constituent sub-variables of "Human Resources" are taken individually, there is a positive relationship between each of them and the value of non-R&D expenditures. This condition can be understood considering that countries with very high levels of Human Resources tend to invest more in R&D than in non-R&D activities. In summary, we conclude that if the overall level of human capital is too high then the impact on non-R&D expenditures is negative, while if human capital is moderately developed in the individual Human Resources sub-variables, then the impact on non-R&D expenditures is positive [28].
• Government Procurement of Advanced Technology Products: is an indicator that considers the government's ability to promote technological innovation through purchases.
In other words, a low value is assigned to the variable if the government chooses its technological supplies based only on price; on the contrary, a value of 7 is assigned if supplies are chosen on the basis of performance and innovation. There is a negative relationship between the government's ability to stimulate technological innovation with its purchases and non-R&D expenditures. Governments that are more oriented towards innovation-based purchasing will also tend to favor R&D products while discarding companies that have optimized non-R&D activities.
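To illustrate why panel estimators are reported alongside Pooled OLS, the following minimal sketch compares a pooled slope with a fixed-effects (within) slope for a single regressor. It is not the estimation code behind the results above: the function names, the single-regressor setup, and the two-country toy data are assumptions made purely for illustration.

```python
from collections import defaultdict


def pooled_slope(rows):
    """OLS slope of y on x, ignoring the country dimension."""
    xs = [x for _, x, _ in rows]
    ys = [y for _, _, y in rows]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def within_slope(rows):
    """Fixed-effects slope: demean x and y inside each country first."""
    groups = defaultdict(list)
    for country, x, y in rows:
        groups[country].append((x, y))
    sxy = sxx = 0.0
    for obs in groups.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        sxy += sum((x - mx) * (y - my) for x, y in obs)
        sxx += sum((x - mx) ** 2 for x, _ in obs)
    return sxy / sxx


# Hypothetical toy panel: (country, regressor, outcome).
# Both countries share a within-country slope of 2 but have different intercepts.
toy_panel = [
    ("A", 1, 12), ("A", 2, 14), ("A", 3, 16),
    ("B", 11, 2), ("B", 12, 4), ("B", 13, 6),
]

pooled = pooled_slope(toy_panel)
within = within_slope(toy_panel)
print(f"pooled slope: {pooled:.3f}, within slope: {within:.3f}")
```

On this toy panel the within slope is 2 while the pooled slope is negative, because the pooled estimate mixes the differences between countries with the variation inside each country. This is one reason why the sign of an association can differ between an aggregate variable and its components.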

Rankings and Clusterization with k-Means Algorithm
In this paragraph we first present the rankings of countries by value of non-R&D expenditures and then we present a clusterization analysis with the k-Means algorithm to verify the presence of clusters in the data. Considering the median of the countries by value of the Non-R&D Innovation Expenditures, the median value of the countries in cluster 2 is higher than in cluster 1. In particular, the following ordering of clusters is shown: C2 = 142.65 > C1 = 69.74. However, to have a further comparison on the accuracy of the number of clusters, a further clustering model was created using the Elbow method. Using the Elbow method, it is possible to verify the presence of three different clusters, namely:
• Considering the comparison between the optimization of the k-Means algorithm with the Silhouette coefficient and the optimization of the k-Means algorithm with the Elbow method, we choose the optimization with the Elbow method. This choice is made because the heterogeneity of European economies is such as to require a number of clusters greater than 2. In fact, since there is a relationship between growth in gross domestic product and investment in technological innovation, and since there are enormous per capita income gaps between the various European countries, it follows that the optimization of the k-Means algorithm with the largest number of clusters is the preferred one.
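The two selection criteria discussed above can be sketched as follows, assuming a one-dimensional indicator and a deterministic initialization. The values are hypothetical toy numbers, not EIS data, and the helper names are illustrative. The Elbow method inspects the within-cluster sum of squares (WCSS) as k grows, while the Silhouette coefficient compares, for each point, the mean distance to its own cluster with the mean distance to the nearest other cluster.

```python
def kmeans_1d(values, k, max_iter=100):
    """Lloyd's algorithm on scalars, started from evenly spaced order statistics."""
    data = sorted(values)
    n = len(data)
    centers = [data[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    clusters = []
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        updated = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
        if updated == centers:  # assignments are stable
            break
        centers = updated
    return [c for c in clusters if c]


def wcss(clusters):
    """Within-cluster sum of squares, the quantity plotted in the Elbow method."""
    total = 0.0
    for c in clusters:
        mean = sum(c) / len(c)
        total += sum((v - mean) ** 2 for v in c)
    return total


def silhouette(clusters):
    """Mean silhouette coefficient; singleton clusters score 0 by convention."""
    scores = []
    for i, c in enumerate(clusters):
        for idx, v in enumerate(c):
            if len(c) == 1:
                scores.append(0.0)
                continue
            a = sum(abs(v - u) for j, u in enumerate(c) if j != idx) / (len(c) - 1)
            b = min(sum(abs(v - u) for u in other) / len(other)
                    for m, other in enumerate(clusters) if m != i)
            denom = max(a, b)
            scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical indicator values for nine units (not EIS data).
values = [60, 62, 64, 100, 102, 104, 140, 142, 144]

wcss_by_k = {k: wcss(kmeans_1d(values, k)) for k in range(1, 5)}
silhouette_by_k = {k: silhouette(kmeans_1d(values, k)) for k in range(2, 5)}
best_k = max(silhouette_by_k, key=silhouette_by_k.get)

for k in wcss_by_k:
    print(k, round(wcss_by_k[k], 2), round(silhouette_by_k.get(k, float("nan")), 3))
```

On this toy sample both criteria point to the same k. On real data they need not agree, which is the situation described in this paragraph.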

Network Analysis with the Distance of Manhattan as a Tool to a Further Optimization of Clusterization with k-Means Algorithm
Since the k-Means algorithm is unsupervised, the decision about the number of clusters rests with the analyst. Furthermore, the comparison between the Silhouette coefficient and the Elbow Method may be insufficient to identify the optimal number of clusters, although using two methods is certainly better than using only one. In this regard, as a further check on the structure of the clusters, a network analysis based on the Manhattan distance is presented. In particular, in the following analysis we try to understand whether the detected network structures, i.e. those structures that have a node value greater than three, are coherent with the cluster structure obtained by comparing the Silhouette Coefficient and the Elbow Method in the previous paragraph. If the network analysis is not coherent with the optimization of the k-Means algorithm, we propose a modification of the value of k to better identify the optimal number of clusters.
The analysis shows that there are four complex network structures and three structures with simplified networks. There is a relationship between Ireland, France, Austria, Iceland, Italy, Belgium, Ukraine, and Slovenia.
• Norway and the United Kingdom are connected with a value of 0.28 units.
If we compare the network analysis optimized with the Manhattan distance with the clusterization made with the k-Means algorithm optimized with the Silhouette coefficient, we can verify that Cluster 1 is made up of two complex network structures. It follows that, by further optimizing the k-Means algorithm in the light of the complex network structures, the optimal number of clusters is 4. In fact, setting k = 4 in the k-Means algorithm makes it possible to obtain clusters that best reflect the complex network structures analyzed with the Manhattan distance. Therefore, we propose a clusterization with the k-Means algorithm with a number of clusters equal to 4, as resulting from the comparison between the clustering with the Elbow method and the network analysis optimized with the Manhattan distance.
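The idea of reading cluster structure from a distance-based network can be sketched as below. The construction rule is an assumption, since the text does not state how edges are defined: here two nodes are linked when their Manhattan distance does not exceed a chosen threshold, and the network structures are the connected components. The node labels, feature vectors, and threshold are hypothetical.

```python
from itertools import combinations


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def network_components(points, threshold):
    """Link nodes within the threshold and return the connected components."""
    adjacency = {name: set() for name in points}
    for u, v in combinations(points, 2):
        if manhattan(points[u], points[v]) <= threshold:
            adjacency[u].add(v)
            adjacency[v].add(u)
    seen, components = set(), []
    for start in points:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


# Hypothetical nodes with two integer features each (not country data).
points = {
    "A": (0, 0), "B": (1, 1), "C": (2, 0), "D": (1, 2),
    "E": (50, 50), "F": (51, 50),
    "G": (100, 100),
}

components = network_components(points, threshold=3)
complex_structures = [c for c in components if len(c) > 3]
print(components)
print(len(complex_structures), "structure(s) with more than three nodes")
```

Counting the components with more than three nodes gives a number that can be compared with the k suggested by the Silhouette coefficient and the Elbow method. The result depends on the threshold, which is a modelling choice.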

Prediction with Original Data-OD
A comparison of eight machine learning algorithms for predicting the future value of "Non-R&D Innovation Expenditures" is presented below. The algorithms are ordered according to their predictive performance, calculated in terms of minimization of statistical errors and maximization of R-Squared. The statistical errors used are: Mean absolute error, Mean squared error and Root mean squared error. The algorithms were trained with 70% of the available data, while the remaining 30% was used for the actual prediction. To evaluate the algorithms, each one is ranked separately on each of the four measures, i.e. R-Squared, Mean absolute error, Mean squared error and Root mean squared error. The four rank positions are then added up into a payoff, and the algorithm with the lowest payoff, i.e. the one that ranks best across the four measures combined, is chosen. The following ordering of the algorithms results:
• Gradient Boosted Trees Regression with a payoff value of 5;
• Tree Ensemble Regression with a payoff value of 10;
• PNN-Probabilistic Neural Network with a payoff value of 12;
• Random Forest with a payoff value of 14;
• ANN-Artificial Neural Network with a payoff value of 19;
• Polynomial Regression with a payoff value of 25;
• Linear Regression with a payoff value of 27;
• Simple Regression Tree with a payoff value of 32.
Therefore, by applying the best predictor algorithm, i.e. the Gradient Boosted Trees Regression, it is possible to verify the following predictions:
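The payoff ranking described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the workflow used to produce the results: the model names, the actual values, and the predictions are hypothetical, and ties are broken by input order rather than by any rule from the text.

```python
import math


def regression_metrics(actual, predicted):
    """R-Squared, MAE, MSE and RMSE for one set of predictions."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - sum(e * e for e in errors) / ss_tot
    return {"r2": r2, "mae": mae, "mse": mse, "rmse": math.sqrt(mse)}


def payoff_ranking(scores):
    """Sum each algorithm's rank position over the four measures (lower is better)."""
    payoff = {name: 0 for name in scores}
    for metric in ("r2", "mae", "mse", "rmse"):
        best_first = sorted(scores, key=lambda name: scores[name][metric],
                            reverse=(metric == "r2"))  # R-Squared is maximized
        for position, name in enumerate(best_first, start=1):
            payoff[name] += position
    return sorted(payoff.items(), key=lambda item: item[1])


# Hypothetical held-out values and predictions from three illustrative models.
actual = [1.0, 2.0, 3.0, 4.0]
predictions = {
    "model_a": [1.1, 1.9, 3.1, 3.9],
    "model_b": [1.5, 2.5, 2.5, 4.5],
    "model_c": [2.0, 2.0, 2.0, 2.0],
}

scores = {name: regression_metrics(actual, pred) for name, pred in predictions.items()}
ranking = payoff_ranking(scores)
print(ranking)
```

With eight algorithms and four measures, the payoff computed this way ranges from 4 (first on every measure) to 32 (last on every measure).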

Prediction with Augmented Data-AD
A further prediction is then made using augmented data. The augmented data are obtained by adding the prediction to the original data. Therefore, as indicated in the previous paragraph, the prediction made with the best predictor algorithm, i.e. the "Gradient Boosted Trees Regression", is added to the time series. The same analytical process of the previous paragraph is then repeated. The algorithms are trained using 70% of the available data. The remaining 30% is used for prediction. The algorithms are classified according to their performance in terms of reduction of statistical errors, i.e. Mean absolute error, Mean squared error and Root mean squared error, and maximization of R-squared. The following ordering of the algorithms by predictive capacity is then determined:
• Tree Ensemble Regression with a payoff value of 4;
• Random Forest Regression with a payoff value of 8;
• Gradient Boosted Trees Regression with a payoff value of 12;
• PNN with a payoff value of 16;
• ANN with a payoff value of 20;
• Polynomial Regression with a payoff value of 24;
• Simple Regression Tree with a payoff value of 28;
• Linear Regression with a payoff value of 32.
Therefore, by applying the best predictor algorithm, i.e. the Tree Ensemble Regression, it is possible to make the following predictions:
In a statistical comparison between the prediction made using the Original Data-OD and the Augmented Data-AD, it appears that:

Statistical Measures with Augmented Data
• the R-Squared value increases by 0.7 units in absolute terms, corresponding to a change of 2041%;
• the Mean Absolute Error decreases from 0.1762311 to 0.1211028, an absolute change of -0.0551283, equivalent to a percentage change of -31.3%;
• the Mean Squared Error decreases from 0.0665028 to 0.0234824, an absolute change of -0.0430204, equivalent to a percentage change of -64.7%;
• the Root Mean Squared Error decreases from 0.2578814 to 0.1532398, an absolute change of -0.1046417, equivalent to a percentage change of -40.6%;
• on average, the value of statistical errors is reduced by about 40.5%.
It therefore follows that the prediction with the Augmented Data-AD is much more efficient than the prediction with the Original Data-OD from the point of view of maximizing the R-Squared and minimizing statistical errors.

We have found that "Non-R&D Innovation Expenditures" is positively associated with, among others, "Innovation Index" and "Firm Investments" and negatively associated with, among others, "Human Resources" and "Government Procurement of Advanced Technology Products". In the light of the analysis of the literature, it appears that non-R&D expenditures are essential to allow small and medium-sized enterprises to connect to the digital transformation and to participate, albeit marginally, in technological innovation systems at the national level. Small and medium-sized enterprises cannot set up research and development departments as large industrial enterprises do, and therefore must try to optimize the positive externalities that derive from participating in innovation-oriented economic systems. In this sense, open innovation and cooperation between organizations and companies are very important to promote a culture of innovation among small and medium-sized enterprises as well.
Furthermore, we have used the k-Means algorithm with both the Silhouette Coefficient and the Elbow Method, in comparison with a network analysis optimized with the Manhattan distance, and we have found that the optimal number of clusters is four. From the clustering analysis it is evident that non-R&D expenditures are typical of countries that are at low or intermediate levels in research and development and technological innovation. We have also compared eight machine learning algorithms for predicting the level of "Non-R&D Innovation Expenditures", both with Original Data-OD and with Augmented Data-AD. We found that Gradient Boosted Trees Regression is the best predictor for OD, while Tree Ensemble Regression is the best predictor for AD. Finally, we verified that the prediction with AD is more efficient than that with OD, with a reduction in the average value of statistical errors equal to 40.50%. Overall, the predictive analysis carried out with Augmented Data-AD predicts an increase in the value of non-R&D expenditures in the countries considered.
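The percentage changes of the individual error measures between Original Data-OD and Augmented Data-AD can be recomputed from the values reported in the text. The following Python sketch is only an arithmetic check under the assumption that each percentage change is defined as (AD - OD) / OD; the variable and function names are illustrative.

```python
# Error values as reported in the text for Original Data-OD and Augmented Data-AD.
od = {"MAE": 0.1762311, "MSE": 0.0665028, "RMSE": 0.2578814}
ad = {"MAE": 0.1211028, "MSE": 0.0234824, "RMSE": 0.1532398}


def pct_change(before, after):
    """Percentage change from `before` to `after` (negative means a reduction)."""
    return (after - before) / before * 100.0


changes = {name: pct_change(od[name], ad[name]) for name in od}
for name, value in changes.items():
    print(f"{name}: {value:.1f}%")
```

Under this definition the script prints a change of -31.3% for the Mean Absolute Error, -64.7% for the Mean Squared Error and -40.6% for the Root Mean Squared Error.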

Comparison between Original Data-OD and Augmented Data-AD in Terms of Statistical Efficiency
In summary, from the point of view of economic policy, the promotion of non-R&D innovation expenditures should take place locally and, if possible, regionally, with knowledge of who the drivers of innovation are. In fact, not all European countries and regions could benefit from investing in non-R&D activities, especially if they already have an R&D orientation with excellent human capital.

Declarations
Data Availability Statement. The data presented in this study are available on request from the corresponding author.
Funding. The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Competing Interest. The authors declare that there is no conflict of interest regarding the publication of this manuscript. In addition, the ethical issues, including plagiarism, informed consent, misconduct, data fabrication and/or falsification, and double publication, have been observed by the authors.
Software. The authors have used the following software: Gretl for the econometric models, Orange for clustering and network analysis, and KNIME for machine learning and predictions. All were used in their free versions.