Cluster analysis of the economic activity of Slovak companies regarding potential indicators of earnings management

Research background: All over the world, information about earnings manipulation is very important for all stakeholders of a company, so it is necessary to be able to detect it. Global practice has shown that it is appropriate to create detection models, and that it would be very useful to specify them for individual sectors or groups of sectors of economic activity. Purpose of the article: The article focuses on the financial ratios of Slovak companies that are used globally in the detection of earnings management. Based on hierarchical cluster analysis, we identify groups of economic activities (according to the international NACE classification) with similar financial characteristics. Methods: Efficient detection of earnings manipulation requires high-quality and up-to-date financial data. We used financial data of real Slovak companies from the year 2018 obtained from the international database Amadeus. After careful preparation of the dataset, we applied standard clustering procedures. The groups of companies and their economic activities were identified by analysing the dendrogram. Findings & Value added: The results of the analysis show that there exist logical groups of NACE categories of economic activity with similar characteristics. With regard to potential earnings manipulation, companies in these groups are as similar as possible. Therefore, their financial characteristics can be analyzed together, and more accurate detection models could be created for them.


Introduction
Earnings management is a current topic in financial management research. In Slovakia, however, the topic is not yet widespread and has not been studied in much detail. Nevertheless, it is a phenomenon of the modern approach to the reporting of accounting information and of the related accounting decisions of managers, which may affect the overall results of financial statements. [1,2] Many companies use earnings management as a tool to maintain stable profit growth or to prevent "red numbers" in financial statements. Understanding earnings management and the reasons for its use is essential for users of financial statements.
Several models have been developed so far for detecting earnings management in companies, and they tend to be used for this purpose in practice. The best-known include the Jones model [3] and its various modifications, and the Beneish model [4]. These models use different approaches and are built with different statistical tools, but their common feature is that they were created from corporate accounting data. When creating such a model for identifying earnings management in a company, one important factor may be the sector of economic activity in which the company operates. [5,6] Therefore, in this article, we focus on the sectors of economic activity of Slovak companies given by the NACE classification. The article aims to find groups of NACE activities in which enterprises are so similar that it is appropriate to evaluate them together, so that a common model for identifying earnings management can be created for each group. This is done by applying cluster analysis. The contribution of this study is therefore to find segments of economic activities of companies, which removes the potential need to create models for individual sectors. Moreover, because there have so far been few studies dealing with earnings management in Slovakia, we consider the cluster analysis of Slovak companies to be innovative.
The rest of the paper is organized as follows. The Literature review section highlights studies by several authors on earnings management and its detection. The Methods section briefly describes the theoretical basis of the cluster analysis used in this study. The Results section presents the clustering of the economic activities of Slovak companies. The Discussion section outlines a possible future direction and continuation of this study.

Literature review
The topicality of earnings management is also evident from the many recently published studies, which address the issue from various perspectives. In the study [7], the authors examine the influence of positive and negative discretionary accrual management on the stock price. They found that discretionary accrual management causes the company to be overvalued, which makes its stock price rise. The study [8] mentions earnings management in connection with an analysis of the relationship between sustainability and reliability in banking. In the study [9], the authors examine how the outcome of earnings management, viewed in terms of the interests of shareholders versus those of the company's managers, and the methods of earnings management affect the credibility of the company for shareholders. In the study [10], the authors examine the impact and occurrence of manipulation activities in pension management and the development of pension plans. The study [11] focuses on the relationship between environmental (carbon) performance and earnings management in the European capital market. In the study [12] the authors focused on
SHS Web of Conferences 92, 0 (2021) Globalization and its Socio-Economic Consequences 2020
Several authors also deal with earnings management in Slovakia, but such studies are few; examples include [13-16].

Methods
The main goal of cluster analysis is to find and identify homogeneous subgroups in a sample. It is an exploratory analytical tool that aims to classify objects into so-called clusters, with a maximum degree of association between objects in the same group and a minimum degree of association between objects in different groups. It is used to reveal the structure of data without the need to explain why that structure exists. [17] Clustering methods are based on determining the similarity of the objects to be assigned to one cluster. Similarity is a measure of the extent to which objects can be considered the same. The most frequently used measures of similarity are based on distance.
From the commonly used distances, the squared Euclidean distance is used in this paper. It is defined by the relation

$$d_{ij} = \sum_{k=1}^{K} \left(x_{ik} - x_{jk}\right)^2 \qquad (1)$$

where $x_{ik}$ is the value of the $k$-th variable for the $i$-th object, $x_{jk}$ is the value of the $k$-th variable for the $j$-th object, and $K$ is the number of variables used for clustering. The values of this distance for all object pairs form the similarity (distance) matrix, the calculation of which is the first step of the cluster analysis. [18]
The next step is the formation of clusters. The choice of procedure determines how clusters are formed and how the similarity of objects within a cluster is calculated. Clustering methods are most often divided into hierarchical and non-hierarchical, according to how the resulting clusters are formed. The hierarchical procedures used in this paper provide an overview of the complete structure of the system of input objects, for example through its graphical representation, the so-called dendrogram. Another advantage of a hierarchical procedure is that the resulting number of clusters does not need to be known in advance. [19] Hierarchical clustering is mostly agglomerative. In this case, the procedure starts with each object considered as its own cluster, and the two clusters with the smallest distance (the greatest similarity) are merged into one. In the next step, either a third object that is closest to this cluster is joined to it or, if that object is closer to a fourth object, the two of them form a new cluster. This process is then repeated: objects are added to existing clusters, new clusters are created, or existing clusters are combined. In the last step of the hierarchical procedure, all objects form a single cluster. [20]
In this paper, Ward's method is used to form the clusters. This method differs from other clustering methods in that it uses an analysis of variance approach to form clusters. Its principle is to merge, at each step, the clusters whose union produces the smallest increase in within-group variability. The method leads to clusters of approximately the same shape and size, and it tends to eliminate small clusters.
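The procedure described above can be illustrated with a minimal sketch in plain Python. It is not the implementation used in this study (the analysis was run in IBM SPSS Statistics); the function names and the toy data are hypothetical. It computes the squared Euclidean distance of equation (1) and performs naive agglomerative merging, where the cost of merging clusters A and B is taken as the standard Ward increase in the within-cluster sum of squares, n_A n_B / (n_A + n_B) times the squared Euclidean distance between the two cluster centroids.

```python
from itertools import combinations


def squared_euclidean(x, y):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def distance_matrix(objects):
    """Symmetric matrix of pairwise squared Euclidean distances."""
    n = len(objects)
    return [[squared_euclidean(objects[i], objects[j]) for j in range(n)]
            for i in range(n)]


def centroid(objects, members):
    """Mean vector of the objects whose indices are listed in `members`."""
    dims = len(objects[0])
    return [sum(objects[i][d] for i in members) / len(members)
            for d in range(dims)]


def ward_cost(objects, a, b):
    """Increase in within-cluster sum of squares caused by merging a and b."""
    ca, cb = centroid(objects, a), centroid(objects, b)
    return len(a) * len(b) / (len(a) + len(b)) * squared_euclidean(ca, cb)


def ward_clustering(objects, n_clusters):
    """Start from singleton clusters and merge until `n_clusters` remain."""
    clusters = [[i] for i in range(len(objects))]
    while len(clusters) > n_clusters:
        # Choose the pair whose merge increases within-cluster variability least.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: ward_cost(objects, *pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


# Hypothetical toy data: five objects described by two variables each.
toy = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0], [10.0, 0.0]]
print(distance_matrix(toy)[0][2])  # squared distance between objects 0 and 2
print(ward_clustering(toy, 3))     # cluster membership as lists of object indices
```

Stopping the loop at a chosen number of clusters corresponds to cutting the dendrogram at that level. The naive search over all cluster pairs is adequate for a small number of objects such as NACE sections, but a statistical package would normally be used for larger inputs.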
In this study, hierarchical cluster analysis is used to identify groups (clusters) of economic activities of companies (according to the international NACE classification) with similar financial characteristics. We used the coarsest level of categorization of economic activities, namely the NACE sections presented in Table 1.

Results
We applied the aforementioned hierarchical procedure to cluster the NACE categories. We used a dataset of empirical data from the year 2018, calculated from the financial statements of 13,369 real Slovak companies. Hierarchical clustering of the NACE categories was performed on selected descriptive characteristics calculated for the individual NACE categories. Specifically, we used the mean values, medians, standard deviations, and minimum and maximum values of 19 financial variables (see Table 2). The squared Euclidean distance defined by equation (1) was used as the measure of similarity, and Ward's method was used in the clustering process.
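The construction of the clustering input can be sketched as follows. This is only an illustration under stated assumptions: the records are hypothetical, a single financial ratio stands in for the 19 variables, and the sample standard deviation is assumed because the text does not say which variant was used. Each NACE section is reduced to a vector of its mean, median, standard deviation, minimum and maximum.

```python
from collections import defaultdict
from statistics import mean, median, stdev

# Hypothetical firm-level records: (NACE section, value of one financial ratio).
firms = [
    ("A", 0.10), ("A", 0.30), ("A", 0.20),
    ("C", 1.00), ("C", 3.00),
    ("G", 0.50), ("G", 0.50), ("G", 2.00),
]


def describe(values):
    """Mean, median, sample standard deviation, minimum and maximum."""
    return [mean(values), median(values), stdev(values),
            min(values), max(values)]


def category_features(records):
    """Map each NACE section to the descriptive statistics of its firms."""
    by_section = defaultdict(list)
    for section, value in records:
        by_section[section].append(value)
    return {section: describe(values)
            for section, values in sorted(by_section.items())}


features = category_features(firms)
for section, vector in features.items():
    print(section, [round(v, 3) for v in vector])
```

With 19 variables, repeating this for every variable and concatenating the results would give each section a vector of 19 × 5 = 95 values, and these vectors are the objects that enter the distance calculation.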

Fig. 1. Dendrogram.
All steps of the analysis were performed using IBM SPSS Statistics. The final result is described by the dendrogram shown in Figure 1. It shows the gradual merging of the NACE categories, from individual sections (each forming its own cluster) to the inclusion of all sections in a single cluster. The dendrogram shows that it is possible to identify two or (better) three clusters of NACE categories. The classification of the categories into three clusters is as follows.

Discussion and conclusions
As outlined in this article, earnings management in companies is currently a very topical issue. To identify it, models based generally on the financial data of companies are created. These models, and earnings management in general, are often analysed within particular sectors of economic activity. Studies very often focus on earnings management in banks: for example, [21] focused on African banks, [22] on US banks, [23] analysed Iranian banks and [24] Islamic banks. Other areas of application include, for example, insurance, which has been addressed by [25]. Many more such areas could be found, but it is not clear whether the results of these studies can be generalized to other sectors. Therefore, analysing these sectors and the characteristics of the companies operating in them in Slovakia by means of cluster analysis is beneficial, and the results can be used further in the analysis of earnings management in companies and, ultimately, in the creation of models for its identification. Deriving a model for identifying earnings management in Slovak companies is a possible further direction and use of the results of this study.