Analysis of business companies based on artificial neural networks

Business companies sell many kinds of products to other businesses, consumers, etc. They are a driving force of economies, especially in developing countries. The aim of this article is to analyse business companies in the Czech Republic using artificial neural networks and subsequently to estimate the development of this branch of the national economy. A cluster analysis is performed, dividing the businesses into a considerable number of clusters, and the most significant clusters are then analysed in more detail. The result can be generalized and used to predict the number of companies that will be creditworthy or bankrupt in the following period. This makes it possible to estimate not only the overall growth or decline of business companies in the Czech Republic, but also the structure of the companies in terms of their size, turnover or volume of sales.


Introduction
The primary strategic objective of every company should always be to increase its value. This serves not only to increase the wealth of individual shareholders but also to support the interests of other stakeholders [1]. Therefore, it is very important to analyse and predict the financial sustainability of the company. According to Grant [2], there are many different methods for doing so. Some are preferred, especially due to their ease of use, flexibility and versatility, whereas others are difficult to use and require skills that many analysts lack. Skalický and Puchýř [3] argue that financial analysis is one of the ways to analyse the potential risks arising from the company's financial data and to oversee the financial health of a company. In advanced market economies, financial analysis has a long tradition. Company assessment methods also include benchmarking based on non-financial indicators. These indicators should predict the company's future potential and also evaluate its performance [4].
Economic Value Added (EVA) is another indicator that helps reveal the success of a company on the market [5]. The importance of this indicator is increasing, as it is used in the practical management of a company as a motivation factor for managers and in benchmarking analysis [6]. It is also possible to evaluate the company, for example, according to bankruptcy and creditworthiness models [7]. Companies can be evaluated by other methods as well, such as value generators, which allow more accurate measurement of the efficiency of the processes and elements that generate value. According to Valašková et al. [8], these approaches are referred to as value factors. Value generators can be defined as factors that contribute to value creation in a market evaluation of a company. Knowledge of them is necessary for the involved parties to determine the current and future intrinsic values of shares [9]. Value factors reflect the environment in which business entities operate and make it possible to measure the performance of the company and the value of its corporate strategy [10].
Another method for evaluating companies is the use of neural networks, which can also be applied in business practice. An advantage of artificial neural networks is their ability to predict future periods. Sánchez and Melin [11] state that neural networks are widely applied in many different areas. Artificial neural networks have numerous advantages over conventional methods. They make it possible to analyse complicated patterns quickly and with high precision and, according to Santin [12], they are flexible in use. Their disadvantage is the need for large data samples: many observations are required, and collecting them is very complicated for users [13].
Kohonen networks are neural network models that can be used to sort data sets into categories. The data are arranged so that the data within one category tend to be similar, while the data in different categories differ. A large number of experimental results show that Kohonen networks are very effective for business valuation [14]. According to Horák, Vochozka and Machová [15], the Kohonen network consists of an input layer connected to an output layer and is taught without a teacher, by self-organization. It can be said that the network is widely used because it is an alternative network that is usable for most neural network calculations. It is primarily used for processing photos, videos and audio, for application security, for speech processing, and for projecting high-dimensional data onto a lower-dimensional space [16].
In this paper, business companies will be analysed using Kohonen networks. Business companies are companies that sell many kinds of products to meet the needs of other businesses, consumers, etc. According to Dyhdalewicz [17], business companies keep a store, buy large quantities of products, and supply goods to customers. According to Doganay and Kocsoy [18], business companies are the driving force of economies, especially in developing countries. These business companies are able to connect sellers with buyers and earn revenue through sales commissions.
The aim of the paper is to perform a cluster analysis of businesses operating in trade.

Data and methods
For the purposes of this contribution, a set of data on 11,604 retail and wholesale businesses in 2016 will be created. These will be businesses whose main economic activity is classified in Section G of the CZ NACE classification of economic activities. The data set will be generated from the Albertina database. The complete financial statements of these companies will be available (all financial information is in CZK thousands). For the analysis, only some items will be used:
1. Total assets: this is information about the size of the business, namely the volume of assets (fixed, current, other).
2. Fixed assets: the item describes the degree of involvement of fixed assets and gives an idea of the level of technology used in production. The higher the share of fixed assets, the higher the degree of production automation that can be expected.
3. Current assets: represent the amount of funds, receivables and inventory in the business. In most cases this is the so-called working capital, i.e. those asset components that change their substance in less than one year and are directly involved in the core business activity.
4. Equity: reflects the level of business risk for the business owners. This is the amount of funds the owners may lose in the event of bankruptcy.
5. External capital: this is capital that has no influence on the company's business management. In a way, it reflects the external view of the potential success of the business.
6. Value added: this is the value that the company adds to the basic input (material) through its core business activity.
7. Operating income: provides information on how successful the business is in its core business activity.
8. ROE: return on equity means the appreciation of the equity of a business, that is, of the capital provided by the business owners.
9. Profit before tax: describes the success or failure of a company in its entire business activity.
These are not generators of value but rather characteristics that may have, albeit indirect, influence on the performance of a business.
The file will then be subjected to cluster analysis, specifically using Kohonen networks. For the cluster analysis, Dell's Statistica software, version 12, will be used. Neural networks will be used as the specific data mining tool. Here we select neural networks without a teacher, i.e. Kohonen networks. We specify the data for analysis. In all cases, these are continuous predictors. The file will be divided into three parts:
1. Training dataset: represents 70% of the companies in the set. The Kohonen network will be created based on this dataset.
2. Testing dataset: this is 15% of the companies of the original set. Using this dataset, we verify the parameters of the Kohonen network.
3. Validation dataset: this will also comprise 15% of the companies in the file. Using this dataset, we will also verify whether or not the Kohonen network is usable.
We set both the topological length and the topological width of the Kohonen map at 10. The number of iterations of the calculation will be set at 100,000. However, it should be recalled that the level of error is decisive. If no further iteration improves the Kohonen network parameters, training will be terminated before the 100,000th iteration is completed. If the network parameters are still improving at 100,000 iterations, we need to repeat the process with a higher number of iterations to make sure the result is the best possible.
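The procedure described above can be illustrated by a minimal sketch in Python. It is not the Statistica implementation, whose internals are not described here. It is a generic Kohonen map under stated assumptions: the inputs are already scaled to the range 0 to 1, the neighbourhood is Gaussian, the learning rate starts at 0.5 and decays linearly, and training stops early when the error on the testing dataset has not improved for several consecutive checks. The names split_dataset, train_som, check_every and patience are hypothetical.

```python
import math
import random


def split_dataset(rows, seed=0):
    """Shuffle rows and split them 70/15/15 into training, testing and validation sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(0.70 * len(shuffled))
    n_test = int(0.15 * len(shuffled))
    training = shuffled[:n_train]
    testing = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # the remainder, roughly 15%
    return training, testing, validation


def best_matching_unit(weights, x):
    """Return the grid position (row, col) whose weight vector lies closest to x."""
    return min(weights, key=lambda node: math.dist(weights[node], x))


def mean_error(weights, rows):
    """Average distance between each row and its best matching unit."""
    total = 0.0
    for x in rows:
        total += math.dist(weights[best_matching_unit(weights, x)], x)
    return total / len(rows)


def train_som(training, testing, height=10, width=10, max_iter=100_000,
              check_every=1_000, patience=5, seed=0):
    """Train a Kohonen map; stop early once the testing error stops improving."""
    rng = random.Random(seed)
    dim = len(training[0])
    # Assumption: inputs are scaled to [0, 1], so random weights start in that range.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(height) for c in range(width)}
    radius0 = max(height, width) / 2
    best_error, stale_checks, iterations = math.inf, 0, 0
    for t in range(max_iter):
        x = rng.choice(training)
        decay = 1.0 - t / max_iter
        learning_rate = 0.5 * decay          # assumed starting rate of 0.5
        radius = max(radius0 * decay, 0.5)   # neighbourhood shrinks over time
        b_row, b_col = best_matching_unit(weights, x)
        for (row, col), w in weights.items():
            grid_dist_sq = (row - b_row) ** 2 + (col - b_col) ** 2
            influence = math.exp(-grid_dist_sq / (2 * radius ** 2))
            for i in range(dim):
                w[i] += learning_rate * influence * (x[i] - w[i])
        iterations = t + 1
        if iterations % check_every == 0:
            error = mean_error(weights, testing)
            if error < best_error:
                best_error, stale_checks = error, 0
            else:
                stale_checks += 1
                if stale_checks >= patience:
                    break  # the error is decisive: no improvement, so stop
    return weights, iterations
```

If train_som returns an iteration count equal to max_iter, the error was still being monitored when the limit was reached, which corresponds to the case in which the calculation should be repeated with a higher limit. The validation dataset is deliberately left untouched during training so that it can serve for the final check.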

Results
In accordance with the methodology of the contribution, the data was divided into three sets: training, testing and validation. The descriptive characteristics of all three sets and of the original data set are given in Table 1. Ideally, the characteristics of the subsets would be approximately the same. Since the businesses were distributed randomly, this is not the case. However, the random distribution of the data need not be a defect in the result.
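A comparison of this kind can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the values are hypothetical and are not taken from Table 1, and the helper names describe and relative_mean_gap are invented for this sketch.

```python
import statistics


def describe(values):
    """Return basic descriptive characteristics of one variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def relative_mean_gap(subset, full):
    """Relative difference between a subset mean and the full-set mean."""
    full_mean = statistics.mean(full)
    return abs(statistics.mean(subset) - full_mean) / abs(full_mean)


# Hypothetical total-assets values in CZK thousands, not data from the paper.
full_set = [3300, 5800, 1200, 9400, 2700, 4100, 7600, 1900]
training = full_set[:6]

print(describe(full_set))
print(relative_mean_gap(training, full_set))
```

A small relative gap for every variable would indicate that a random subset resembles the original set; a large gap would indicate the kind of difference mentioned above.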

Cluster (1, 10)
The cluster (1, 10) of the topological grid is occupied by 1,726 companies. It is the most populated of all the clusters. From the input parameters, we can derive the following average values. The typical representative is a small company whose operating income is minimal, 101 thousand CZK per year. It generates a pre-tax profit of only 1.3 thousand CZK on average. Nevertheless, it has assets of CZK 3.3 million. However, the assets are financed for the most part by external capital: such a company has more than CZK 2.6 million of external capital. Value added is also not high, slightly over 680 thousand CZK. Nevertheless, the existence of the business is not jeopardized. Owing to the nature of its business activity, the company holds a relatively large volume of current assets (probably goods), in excess of CZK 2.7 million.
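How a cluster size and a typical representative can be derived from the map is sketched below in Python. This is an illustration only: the assignments are hypothetical, the indicator names are invented, and the typical representative is assumed to be the plain arithmetic mean of each indicator over the companies assigned to one node.

```python
from collections import Counter


def cluster_sizes(assignments):
    """Count how many companies fall into each node of the Kohonen map."""
    return Counter(node for node, _ in assignments)


def typical_representative(assignments, node):
    """Average every indicator over the companies assigned to one node."""
    members = [record for assigned, record in assignments if assigned == node]
    return {key: sum(m[key] for m in members) / len(members) for key in members[0]}


# Hypothetical assignments (node, indicators in CZK thousands), not data from the paper.
assignments = [
    ((1, 10), {"total_assets": 3000, "operating_income": 90}),
    ((1, 10), {"total_assets": 3600, "operating_income": 112}),
    ((2, 10), {"total_assets": 5800, "operating_income": 850}),
]

sizes = cluster_sizes(assignments)
largest_node, largest_count = sizes.most_common(1)[0]
profile = typical_representative(assignments, largest_node)
print(largest_node, largest_count, profile)
```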
Since this is a homogeneous cluster of businesses, we can verify the validity of basic logical links. The relationship between total assets and operating income appears interesting (see Figure 3). The correlation coefficient of total assets and operating income is only slightly above 0.08. Therefore, there is practically no linear dependence between the two variables. Nevertheless, a regression curve, a 6th-order polynomial, is fitted to the data. Based on this curve, it is possible to estimate the development of operating income from the size of the company, as determined by the volume of total assets.
Figure 4 shows the relationship between fixed assets and operating income. The figure shows that there is virtually no dependency between the data; the correlation coefficient is 0.06. Normally, the volume of fixed assets represents the degree of automation in the enterprise. In the case of trade business companies, it rather provides information about the ownership of the factors of production, namely one: the real estate in which the business is located, if real estate is needed to carry out the business activities. A number of business activities can do without real estate, such as the operation of e-shops. Figure 4 illustrates that trade business enterprises often do not need long-term assets for their business, or need them only in a minimal amount (for example, in the form of a car).
Figure 5 presents the relationship between external capital and operating income. This relationship could show the effect of financial leverage, but only if there were a relatively high correlation between the two variables. However, it is only 0.25. Again, a 6th-order polynomial is fitted to the points of the graph. The graph nevertheless shows that the businesses are in unhealthy debt. Additional external capital does not increase the return on equity. The two variables are indifferent to each other, which points to a poor capital structure and perhaps even unnecessary assets in the enterprise.
Companies in this cluster should reduce the involvement of external debt.
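The role of the correlation coefficient in this section can be illustrated with a short Python sketch. It assumes that the reported coefficients are Pearson coefficients, which the text does not state explicitly. The Pearson coefficient measures only linear dependence, so a value near zero does not by itself exclude a curved relationship, which may be one reason for also fitting a polynomial. The data in the example are artificial.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Artificial data: y depends on x in both cases, but only one dependence is linear.
xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]
parabola = [x * x for x in xs]

r_linear = pearson(xs, linear)      # close to 1: perfect linear dependence
r_parabola = pearson(xs, parabola)  # 0: no linear dependence, although y = x * x

# Share of variance a straight line would explain for the coefficients reported above.
explained = {r: r ** 2 for r in (0.08, 0.06, 0.25)}
print(r_linear, r_parabola, explained)
```

For the coefficients reported for this cluster, a straight line would explain well under 10% of the variance in operating income, so any estimate based on a fitted curve should be read with corresponding caution.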

Cluster (2, 10)
The cluster (2, 10) of the topological grid holds the second position in frequency, with 1,138 companies. From the input parameters, we can derive a typical representative of this cluster. In terms of the volume of total assets, it is a larger company than the representative of cluster (1, 10). It is an enterprise that employs more than 5.8 million CZK worth of assets. The company owns fixed assets in the amount of almost CZK 1.2 million. At the same time, it manages current assets in excess of CZK 4.5 million. Its assets are funded by more than CZK 2.2 million from its own resources, with the remaining amount, i.e. almost CZK 3.6 million, coming from external capital. It achieves an operating income of over 850 thousand CZK and a profit before tax of almost 18 thousand CZK. Besides the higher profit, companies of this cluster differ from those of cluster (1, 10) by a lower rate of indebtedness. The companies of cluster (1, 10) had a debt ratio of almost 80%. On the other hand, businesses of cluster (2, 10) have a debt ratio slightly above 62%. Admittedly, the high level of indebtedness is due to the nature of the business activities. Nevertheless, we can say that companies of cluster (2, 10) are closer to the optimum (even with a larger volume of funds than companies of cluster (1, 10)).
Even in this case, it is certainly interesting to explore the relationships between some variables. Figure 6 shows the relationship between total assets and operating income. The correlation coefficient of the two variables is 0.15. Here too, we cannot speak of a dependency between the two variables. However, in this cluster, we can observe a trend in which a higher volume of assets implies a higher operating income. The trend line is again a 6th-order polynomial. Figure 7 illustrates the relationship between fixed assets and operating income.
For the relationship between fixed assets and operating income, the correlation coefficient is 0.21. While this is not a statistically significant dependency, we can already say that a higher volume of fixed assets has a positive effect on the generation of operating income. For the relationship between external capital and operating income, the correlation coefficient is likewise not high, namely 0.18. Even in this case, we can infer that a higher debt ratio does not mean a higher operating income. As in the case of cluster (1, 10), it is advisable to reduce the volume of external capital involved.
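The debt ratios quoted for the two clusters can be checked with simple arithmetic. The sketch below assumes that the debt ratio is external capital divided by total assets and uses the rounded figures given in the text (in CZK millions), so the results are approximate. The function name debt_ratio is chosen for this illustration.

```python
def debt_ratio(external_capital, total_assets):
    """Share of total assets that is financed by external capital."""
    return external_capital / total_assets


# Rounded figures from the text, in CZK millions.
cluster_1_10 = debt_ratio(2.6, 3.3)  # roughly 0.79, i.e. almost 80%
cluster_2_10 = debt_ratio(3.6, 5.8)  # roughly 0.62

print(f"cluster (1, 10): {cluster_1_10:.1%}")
print(f"cluster (2, 10): {cluster_2_10:.1%}")
```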

Conclusion
The aim of the paper was to perform a cluster analysis of businesses operating in trade. The aim of the paper was fulfilled.
Cluster analysis was performed using neural networks without a teacher, i.e. Kohonen networks. Based on the cluster analysis, companies were divided into clusters in the Kohonen map (10 x 10 clusters). Some clusters are significant in terms of the number of businesses. The most represented are clusters (1, 10) and (2, 10). These clusters have therefore been subjected to further analysis. For both clusters, typical representatives were defined. These varied in their size and in the volume of operating income generated. The companies of both clusters were on the borderline of profitability in 2016. The average profit before tax for one company of cluster (1, 10) was approximately 1.3 thousand CZK; in the case of cluster (2, 10) it was 18 thousand CZK. The companies of the two clusters differ far more in the amount of operating income. Companies of cluster (1, 10) achieve an operating income of 101 thousand CZK. Companies of cluster (2, 10) reach 850 thousand CZK. Businesses of both clusters are highly indebted. This can be partly related to the field in which they operate and partly to non-rational decisions about the involvement of external debt. In terms of debt, companies of cluster (2, 10) are better off, with external capital accounting for 61% of all capital. On the other hand, the indebtedness of companies in cluster (1, 10) is almost 80%. From the volume of fixed assets, it is possible to conclude, in particular for companies of cluster (1, 10), that they do not operate in their own premises, but in rented premises or on the Internet. From the analysis we can make two very specific conclusions:
1. A larger company (or a company with more assets) generates, on average, a higher operating income.