Stable development of travel business in the Odessa region: Estimate using multivariate statistical analysis

The article examines the level of sustainable development of the travel business in the Odessa region of Ukraine. This level is evaluated as a latent indicator characterized by a certain set of symptom indicators. The modelling methodology combines Partial Least Squares Path Modelling (PLS-PM) with Time-Wise Multi-Way Principal Component Analysis. This approach allowed us to identify and evaluate the latent and explicit factors affecting the sustainable development of the travel business in the Odessa region.


Introduction
In recent years, the main goals of business have been reoriented. Previously, the main task of business was to obtain the greatest possible profit; now stable development is taking priority. Much research has been devoted to the sustainable development of the travel business, including works [1][2][3][4][5][6][7][8]. The problem of quantitatively assessing the level of stable development of the tourism business is especially urgent. Solving it requires methods of mathematical modelling, to which a number of works are devoted [9][10][11][12][13][14][15][16][17]. Many economic indicators (such as development stability, competitiveness, investment attractiveness, efficiency, etc.) are latent, that is, not directly measurable. Special methods are therefore needed to evaluate them. This work assesses the level of sustainable development of the tourism business in the Odessa region using special methods of multivariate statistical analysis.
It should be noted that the Odessa region has strong potential for sustainable development. Among the many contributing factors are an extended Black Sea coastline, a developed infrastructure (sea and river ports, an international airport), sanatoriums, rest houses, cultural attractions, etc. Unfortunately, in 2020 the tourism business in the Odessa region practically ceased to function normally, as it did all over the world.

Materials and methods
Data for the analysis are available on the websites of the State Statistics Service of Ukraine (regional statistics) and the Main Department of Statistics in Odessa region [18,19]. The data cover the period from 2010 to 2019. Preliminary data processing was carried out in MS Excel spreadsheets. The following tools were used for modelling and computation: the component analysis package (PCA-PM) of Statistica 12 [8] and XLSTAT (the XLSTAT-PLSPM package for analysis by the partial least squares method, PLS-PM: Partial Least Squares Path Modelling, or Projection on Latent Structures Path Modelling).
PLS-PM methods have been widely used in applied research since the 1970s, thanks to the work of Herman Wold and his son Svante Wold, which established the basic principles of the PLS-PM modelling technique. PLS-PM is a tool for modelling the interconnections between latent variables. It is used to analyse high-dimensional data in an ill-structured environment and is widely applied, in particular in economics, to evaluate latent indices such as utility, the level of economic development, etc.
The PLS-PM modelling task can be described as follows. Let $X$ be an $n \times p$ data matrix of explicit variables. Among the requirements on the model, random deviations must be independent of the explanatory variables, that is, their covariance with the explanatory variables must be zero. Thus, modelling by projection on latent structures imposes no requirements on the statistical distributions of the variables and random deviations.
The external model describes two types of interconnection between latent and explicit variables: reflective and formative. The reflective type is the most widespread type of external model: the latent variable is the cause of the explicit variables, that is, the explicit variables reflect the latent one.
The external model can be written in the form of the system of linear equations:
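For a reflective block, such a system is commonly written in the form sketched below. The symbols $\lambda_{0jk}$ (free term), $\lambda_{jk}$ (loading), $\varepsilon_{jk}$ (random deviation), $p_j$ (number of explicit variables in block $j$) and $J$ (number of blocks) follow common PLS-PM notation and are assumptions rather than the notation of the original equations.

```latex
x_{jk} = \lambda_{0jk} + \lambda_{jk}\, LV_j + \varepsilon_{jk},
\qquad k = 1, \dots, p_j, \quad j = 1, \dots, J
```

Each explicit variable $x_{jk}$ is thus treated as a linear reflection of the single latent variable $LV_j$ of its own block.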

Results and Discussion
Building the PLS-PM model of sustainable development of the travel business in the Odessa region. The following table presents the indicators of sustainable travel business development (source: own processing).
Note that these indicators of sustainable development of the tourism business were selected from a large number of factors affecting the stable development of the region. The relationships among the latent variables form the internal model, and the relationships between latent and explicit variables form the external model. Let us write these models in analytical form. The internal model:
$$LV_1 = \beta_0 + \beta_2 LV_2 + \beta_3 LV_3 + \beta_4 LV_4 + \varepsilon_1,$$
where $LV_j$ are the latent variables; $\beta_2, \beta_3, \beta_4$ are the structural coefficients, which characterize the strength and direction of the interconnection between latent variables; $\beta_0$ is the free term; and $\varepsilon_1$ is the residual.
The external model is a system of equations that links each explicit variable $x_{jk}$ to the latent variable $LV_j$ of its block.
Figure 2 and Table 2 (source: own processing with Statistica) show that the component factors LV1, LV2, LV3, LV4 selected for modelling account for more than 90% of the total variation.
Step a). The following criteria are used to verify internal consistency in blocks: the Cronbach alpha ($\alpha$) and Dillon-Goldstein ($\rho$) coefficients (see, for example, [20][21][22]).
Table 3. Checking internal consistency in blocks. Source: Own processing with XLSTAT-PM.
Table 3 shows that two blocks (LV4 and LV1) have good coefficients, $\alpha > 0,7$ and $\rho > 0,7$, whereas the LV3 and LV2 blocks have poor internal consistency ($\alpha, \rho < 0,7$). Table 4 shows the correlation coefficients between explicit and latent variables for each block. In block LV3, two variables (x31 and x32) correlate negatively with the latent variable; block LV2 also contains a negatively correlated variable (x23). This causes internal inconsistency in the blocks, since these indicators are so-called de-stimulators (the higher their values, the worse). For the correct application of multivariate statistical analysis methods, it is recommended to make all indicators unidirectional (as a rule, stimulants). Therefore, we make the replacement x31' = 1 - x31, x32' = 1 - x32, x23' = 1 - x23. After this modification of the variables, we obtain the consistency indicators of the external model, presented in the following table. Source: Own processing with XLSTAT-PM.
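The internal-consistency check of step a) and the effect of reversing a de-stimulator can be illustrated with a minimal sketch. It is not the XLSTAT procedure: the function names, the raw-covariance form of Cronbach's alpha, the use of first-principal-component loadings in the Dillon-Goldstein coefficient, and the toy data (ten observations of three indicators scaled to [0, 1], the third being a de-stimulator) are all assumptions made for illustration.

```python
import numpy as np

def cronbach_alpha(block):
    """Cronbach's alpha of a block (rows: observations, columns: indicators)."""
    block = np.asarray(block, dtype=float)
    k = block.shape[1]
    item_variances = block.var(axis=0, ddof=1).sum()
    total_variance = block.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_variances / total_variance)

def dillon_goldstein_rho(block):
    """Dillon-Goldstein rho from loadings on the first principal component."""
    corr = np.corrcoef(np.asarray(block, dtype=float), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    loadings = eigvecs[:, -1] * np.sqrt(eigvals[-1])
    squared_sum = loadings.sum() ** 2
    return squared_sum / (squared_sum + (1.0 - loadings ** 2).sum())

def flip_destimulator(x):
    """Turn a de-stimulator scaled to [0, 1] into a stimulant: x' = 1 - x."""
    return 1.0 - np.asarray(x, dtype=float)

# Hypothetical block: two stimulants and one de-stimulator (third column).
t = np.linspace(0.0, 1.0, 10)
raw_block = np.column_stack([t, t ** 2, 1.0 - np.sqrt(t)])

fixed_block = raw_block.copy()
fixed_block[:, 2] = flip_destimulator(raw_block[:, 2])

alpha_raw, alpha_fixed = cronbach_alpha(raw_block), cronbach_alpha(fixed_block)
rho_raw, rho_fixed = dillon_goldstein_rho(raw_block), dillon_goldstein_rho(fixed_block)

print(f"alpha: {alpha_raw:.3f} -> {alpha_fixed:.3f}")
print(f"rho:   {rho_raw:.3f} -> {rho_fixed:.3f}")
```

In this toy block the negatively correlated indicator drives Cronbach's alpha below the 0,7 threshold, and the replacement x' = 1 - x brings it above the threshold, which mirrors the reasoning applied to x31, x32 and x23.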
Step b). The second phase is the verification of the external model. Source: Own processing with XLSTAT-PM.
After checking the load factors, that is, the interconnection of the explicit variables with the latent variable of their own block, the cross-loadings must be checked, i.e. the strength of the connection between explicit variables and the latent variables of other blocks must be determined. This makes it possible to exclude indicators whose connection with the latent variable of another block is stronger than with the latent variable of their own block. Table 8 shows that all explicit variables are connected more strongly with the latent variables of their own blocks than with the latent variables of other blocks, that is, all variables are "loyal" to their blocks.
The next step is to check the quality of the internal model. Table 9 contains the estimates $\beta_i$ of the structural model equation, as well as the test statistics. The coefficient of determination $R^2$ for the target block LV1 is more than 87%. The proportion of variation characterizes the share of a block's variation that is reproduced by the latent variable of that block. For all blocks this characteristic far exceeds 50%, which characterizes the model positively. The last column gives the average variance extracted (AVE), that is, the average share of the variance of a block's indicators that is explained by the latent variable, relative to the variance due to measurement error. The AVE exceeds 50% for all blocks; therefore, this criterion of the internal model is also considered satisfactory.
The fifth stage is the calculation of a single coefficient of the model's conformity to the data, GoF (Goodness-of-Fit [21]). This coefficient characterizes the quality of both the internal and the external model and serves as an indicator of the predictive reliability of the model (predictive reliability is considered high if GoF > 70%). For our model, GoF = 82%. Therefore, the constructed model can be used for forecasting.
Finally, let us analyze the final version of our model.
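The cross-loading check and the GoF coefficient described above can be sketched as follows. This is an illustration under stated assumptions, not the XLSTAT output: the function names are hypothetical, the latent scores and indicators are synthetic, and GoF is computed with the commonly used formula (square root of the mean communality times the mean $R^2$), which the text itself does not spell out.

```python
import numpy as np

def cross_loadings(indicators, scores):
    """Correlations of every indicator (columns) with every latent score (columns)."""
    X = np.asarray(indicators, dtype=float)
    S = np.asarray(scores, dtype=float)
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Sz = (S - S.mean(axis=0)) / S.std(axis=0, ddof=1)
    return Xz.T @ Sz / (X.shape[0] - 1)

def disloyal_indicators(loadings, own_block):
    """Indices of indicators tied more strongly to a foreign block than to their own."""
    strongest = np.abs(loadings).argmax(axis=1)
    return np.flatnonzero(strongest != np.asarray(own_block)).tolist()

def goodness_of_fit(communalities, r_squared):
    """GoF as sqrt(mean communality * mean R^2) (assumed standard definition)."""
    return float(np.sqrt(np.mean(communalities) * np.mean(r_squared)))

# Synthetic latent scores for two blocks (uncorrelated by construction).
t = np.arange(10.0)
s1 = t - t.mean()
s2 = (t - t.mean()) ** 2
scores = np.column_stack([s1, s2])

# Five synthetic indicators; the last one is assigned to block 0
# although it mostly follows the score of block 1.
indicators = np.column_stack(
    [s1 + 0.1 * s2, s1 - 0.1 * s2, s2 + s1, s2, s2 + 0.5 * s1]
)
own_block = [0, 0, 1, 1, 0]

loadings = cross_loadings(indicators, scores)
disloyal = disloyal_indicators(loadings, own_block)
print("disloyal indicators:", disloyal)

# Hypothetical communalities and R^2, chosen only to show the arithmetic.
gof = goodness_of_fit([0.64, 0.64, 0.64, 0.64], [0.81])
print(f"GoF = {gof:.2f}")
```

An indicator flagged as disloyal would be a candidate for exclusion or reassignment; in the model discussed here no such indicator was found.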
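The internal model can be written as the following equation, with the path coefficients reported in the Conclusion:
$$LV_1 = \beta_0 + 0{,}5143\, LV_2 + 0{,}6735\, LV_3 + 0{,}2903\, LV_4,$$
where $\beta_0$ is the free term.
Estimates of the latent variables are given by a system of equations: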

Conclusion
Two factors have the most significant direct impact on the level of stable development of the tourism business in the Odessa region (LV1): LV2, the level of social condition (comfort), and LV3, the level of environmental protection (ecological state of the region). The strength of their influence is 0,5143 and 0,6735 respectively. A weaker direct effect (with a strength of 0,2903) on the endogenous factor LV1 is exerted by the exogenous factor LV4, the condition of the economy. Evidently, it also acts indirectly (through factors LV2 and LV3). According to the results obtained, the level of social comfort (LV2) can be assessed using three factors: x21, total subsidies for the population; x22, total state expenditure on public safety, health and other social services; x23, the ratio of the total number of unemployed persons to the economically active population (de-stimulating effect). The level of environmental protection (LV3) can be assessed using four factors: x31, emissions of pollutants into atmospheric air, and x32, waste generation (both with a de-stimulating effect); x33, state expenditure on environmental protection; and x34, current expenditures on environmental protection.
The state of the economy (factor LV4) is significantly influenced by x42, state spending on economic development. The level of stable development of the tourism business (LV1) is determined by the direct impact of three factors: x11, the number of tourists; x12, wages of employees in the service sector; x14, tax revenues from the travel industry. Using the model, one can compute the latent indices for each year and build a forecast of future values. Sustainability of development can be assessed by comparing the values of the estimates for different periods. For example, if the previous period's estimate of a latent variable (LV2, LV3, LV4) is smaller than the current period's estimate, the corresponding block is considered to be sustainable. Alternatively, a baseline can be defined for each variable, against which the comparison is made.
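The period-to-period comparison can be expressed as a short sketch. It assumes that all indicators are stimulants, so that a block counts as sustainable when its latent estimate has grown; the function names and the numerical scores are hypothetical and are not results of the model.

```python
def sustainable_blocks(previous, current):
    """Map each block to True when its current estimate exceeds the previous one."""
    return {block: current[block] > previous[block] for block in previous}

def above_baseline(current, baseline):
    """Map each block to True when its current estimate is at or above a baseline."""
    return {block: current[block] >= baseline[block] for block in baseline}

# Hypothetical latent-variable estimates for two consecutive periods.
previous = {"LV2": 0.41, "LV3": 0.55, "LV4": 0.62}
current = {"LV2": 0.47, "LV3": 0.52, "LV4": 0.66}

# Hypothetical baseline values chosen by the analyst.
baseline = {"LV2": 0.45, "LV3": 0.50, "LV4": 0.70}

status = sustainable_blocks(previous, current)
against_baseline = above_baseline(current, baseline)
print(status)
print(against_baseline)
```

The two functions correspond to the two options named in the text: comparison with the previous period and comparison with a predefined baseline.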