Comparison of exponential time series alignment and time series alignment using artificial neural networks by example of prediction of future development of stock prices of a specific company

Accurate stock price prediction is very difficult in today's economy, yet it plays an important role in helping investors improve their return on equity. As a result, a number of new approaches and technologies for predicting stock prices have evolved in recent years. One of them is the method of artificial neural networks, which has many advantages over conventional methods. The aim of this paper is to compare exponential time series alignment and time series alignment using artificial neural networks as tools for predicting future stock price developments, using the company Unipetrol as an example. The time series is aligned first using artificial neural networks and then using exponential alignment; the prediction produced by the most successful neural network is then compared with the prediction calculated by exponential time series alignment. Predictions for 62 business days were obtained. Surprisingly, the more realistic picture of possible further development is given by the exponential alignment of time series.


Introduction
Time series are defined as sequences of spatially and factually comparable observations that are ordered in time [1]. According to De Baets and Harvey [2], time series can be defined as ordered sequences of values of a variable at equidistant time intervals. According to the authors, time series analysis can be applied, for example, in process and quality control, in sales predictions, and in census analysis.
Currently, time series are applied in many different disciplines. Significant uses include financial and economic time series, which are very specific. Economic time series are often volatile, especially because over time, economic operators face different a specific time series is based on the validation of prediction on the retained portion of a sample using criteria such as a mean absolute percentage error [21].
The aim of the paper is to compare the method of exponential alignment of time series and time series alignment using artificial neural networks as tools for predicting future price development of the company on the example of Unipetrol.

Data and methods
Unipetrol is one of the most prominent business entities in the Czech Republic. Its activity, structure and vision are characterized as follows [22]: "Unipetrol is the most important refining and petrochemical group in the Czech Republic and one of the main players in Central Europe. In the Czech Republic, it is the largest oil processor, one of the most important plastic producers and the owner of the broadest petrol station network under the Benzina brand. Revenues in the group in 2016 amounted to 87.8 billion CZK. The Unipetrol Group has been part of the largest Central European refining and petrochemical group PKN Orlen, based in Poland, since 2005. The group is divided into two business segments: downstream (combining refinery and petro-chemistry) and retail fuel distribution. Within the downstream segment, the company operates refineries in Litvínov and Kralupy nad Vltavou. The group is number one on the Czech wholesale fuel market. In its Litvínov locality, the group operates an ethylene unit with subsequent polymer production. In 2016, Benzina's petrol station network with 363 petrol stations (as of 31 December 2016) and the estimated market share of 17.6% of the retail fuel sales was the leader in the Czech market (company estimate according to the Czech Statistical Office by October 2016). Within the subsidiaries PETROTRANS, s.r.o. and UNIPETROL DOPRAVA, s.r.o. the group operates a wide range of logistics and transportation services. Unipetrol R & D Center, a.s. in Ústí nad Labem and the Polymer Institute Brno branch are engaged in research and development in the field of petro-chemistry. As of 31 December 2016, the group employed more than 4,500 workers from various professions. The parent company of the group is UNIPETROL, a.s."
Data on stock prices are available from January 2, 2006 to April 30, 2018, with a total of 3,076 entries. The data come from the Prague Stock Exchange. These are the closing prices for each day on which the company's stock was traded during that period. Data statistics are listed in Table 1. Source: Authors.

Time series alignment using artificial neural networks
Data will be processed in DELL's Statistica software, version 12, using its neural network data mining tool, specifically time series (regression). We will generate multilayer perceptron (MLP) networks and radial basis function (RBF) networks. Time will be the independent variable and the company's stock price the dependent variable. We divide the time series into three sets: training, testing, and validation. The training set will contain 70% of the input data and will be used to generate the neural structures. Each of the remaining two sets will contain 15% of the input data; both will serve to verify the reliability of the neural structure, i.e. the model, that was found. The time series lag will be 1. We will generate 10,000 neural networks and preserve the 5 with the best characteristics †. An MLP network will have at least 2 and at most 20 neurons in the hidden layer; an RBF network will have at least 21 and at most 30. For the multilayer perceptron network, we will consider these activation functions in the hidden layer and in the output layer: Linear, Logistic, Atanh, Exponential, and Sine. Other settings are left at their defaults (ANS, automated neural network). If the outputs are not adequate, the results can then be corrected by adjusting the weights of individual neurons in the structure using the VNS tool (custom neural networks).
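The data partitioning described above can be illustrated with a minimal Python sketch. It is not the Statistica procedure itself: it assumes a chronological (non-random) 70/15/15 split and assumes that a lag of 1 means pairing each price with the previous one. The function names and the synthetic `prices` series are hypothetical; only the series length of 3,076 entries is taken from the paper.

```python
def split_series(values, train_pct=70, test_pct=15):
    """Split a series chronologically into training, testing and validation parts.

    Integer percentages avoid floating-point rounding of the set sizes;
    the validation part receives whatever remains.
    """
    n = len(values)
    n_train = n * train_pct // 100
    n_test = n * test_pct // 100
    return (values[:n_train],
            values[n_train:n_train + n_test],
            values[n_train + n_test:])


def lagged_pairs(values, lag=1):
    """Pair each value with the value observed `lag` steps earlier (input, target)."""
    return [(values[i - lag], values[i]) for i in range(lag, len(values))]


# Hypothetical closing prices; only the length matches the data set in the paper.
prices = [100.0 + 0.1 * i for i in range(3076)]

train_part, test_part, valid_part = split_series(prices)
pairs = lagged_pairs(train_part)

print(len(train_part), len(test_part), len(valid_part))  # 2153 461 462
```

With 3,076 entries this yields 2,153 training, 461 testing and 462 validation observations; the actual sizes used by the software may differ slightly depending on its rounding and sampling rules.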
Once the neural networks have been generated, we will evaluate their validity on an expert basis, not just by statistical characteristics. Because of the shortcomings of neural networks (black box, overfitting, etc.), networks may have excellent statistical parameters yet be unusable for realistic predictions. We will compare the predictions of the most successful networks with one another. Price developments will be predicted for the next 62 days on which the stock will be traded. However, it should be noted that a prediction of the course of the price over this horizon is likely to be inaccurate. We expect greater precision for the prediction of the price on the first day after the input data set.
† We will proceed using the least squares method. Network generation will be terminated if there is no further improvement, i.e. no reduction in the sum of squares. We will thus preserve those neural structures whose sum of squared residuals against the actual price development is as low as possible (ideally zero).
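The least-squares selection rule can be sketched as follows. This is only an illustration of ranking candidates by their sum of squared residuals and keeping the best few; the network names, the toy numbers and the helper functions are hypothetical and do not come from the paper.

```python
def sum_of_squares(actual, predicted):
    """Sum of squared residuals between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def keep_best(actual, candidates, keep=5):
    """Return the names of the `keep` candidates with the lowest sum of squares."""
    ranked = sorted(candidates,
                    key=lambda name: sum_of_squares(actual, candidates[name]))
    return ranked[:keep]


# Hypothetical actual prices and outputs of three candidate networks.
actual = [10.0, 11.0, 12.0, 13.0]
candidates = {
    "net_a": [10.0, 11.0, 12.0, 13.0],  # perfect fit, sum of squares 0
    "net_b": [10.5, 11.5, 12.5, 13.5],  # constant offset of 0.5
    "net_c": [9.0, 11.0, 12.0, 15.0],   # two larger misses
}

best = keep_best(actual, candidates, keep=2)
print(best)  # ['net_a', 'net_b']
```

A ranking of this kind only reflects fit on the supplied data, which is why the text also calls for expert evaluation of the preserved networks.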

Exponential alignment of time series
DELL's Statistica software will also be used for processing, specifically its advanced models module, where we select Time series / prediction and then Exponential alignment and prediction. The analysis settings will be as follows:
1. We will use the model with an exponential trend.
2. No seasonal component will be included.
3. Alpha and Gamma will both be set to 0.1.
4. We will predict the next 62 business days.
5. The maximum number of iterations will be set to 50.
6. The convergence criterion will be 0.0001.
7. The goodness-of-fit measure will be the mean squared error.
8. Other settings will be kept at their defaults.
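To make these settings more concrete, the following Python sketch shows one common form of exponential alignment (usually called exponential smoothing) with a multiplicative, i.e. exponential, trend and no seasonal component. It is an assumption that this Holt-type recursion corresponds to the model used in Statistica; the initialisation, the function name and the synthetic `history` series are hypothetical, and no parameter optimisation is performed here.

```python
def exponential_trend_smoothing(series, alpha=0.1, gamma=0.1, horizon=62):
    """Holt-type smoothing with a multiplicative trend and no seasonality.

    alpha smooths the level, gamma smooths the growth ratio (trend).
    Returns the forecasts for `horizon` steps ahead and the in-sample
    mean squared error of the one-step-ahead forecasts.
    """
    if len(series) < 2 or min(series) <= 0:
        raise ValueError("need at least two strictly positive observations")

    level = series[0]
    trend = series[1] / series[0]      # assumed initial growth ratio
    fitted = [series[0]]               # first value has no forecast

    for x in series[1:]:
        forecast = level * trend       # one-step-ahead forecast
        fitted.append(forecast)
        new_level = alpha * x + (1 - alpha) * forecast
        trend = gamma * (new_level / level) + (1 - gamma) * trend
        level = new_level

    predictions = [level * trend ** h for h in range(1, horizon + 1)]
    mse = sum((x - f) ** 2 for x, f in zip(series, fitted)) / len(series)
    return predictions, mse


# Hypothetical series growing by 1% per step.
history = [100.0 * 1.01 ** i for i in range(50)]
forecast, mse = exponential_trend_smoothing(history)
print(len(forecast))  # 62
```

On a series with perfectly regular exponential growth the recursion reproduces the growth rate, so the forecast simply continues the curve; on real prices the low values of alpha and gamma make the level and trend react slowly to new observations.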

Comparison of time series
Finally, we compare the time series, i.e. the prediction of future stock price development produced by the most successful neural network and the prediction of prices calculated by exponential time series alignment. Table 2 lists the 5 neural networks with the best characteristics out of the 10,000 generated structures.
The first neural network differs from the others: it is a radial basis function network. RBFT was used as its training algorithm. As with the other preserved networks, the sum of squares was used as the error function. The Gaussian function is used to activate the neurons in the hidden layer and the identity function to activate the neurons in the output layer. The other preserved networks are multilayer perceptron (MLP) networks. All were created using the Quasi-Newton algorithm (each using a different version). To activate the neurons of the hidden layer, they use two functions, namely the hyperbolic tangent and the logistic function. To activate the neurons of the output layer, they use a single function, the logistic function. The performance of a neural network is described by the value of the correlation coefficient. The correlation coefficients of all preserved networks on all data sets are shown in Table 3. Ideally, we are looking for a neural network whose correlation coefficient is as close to 1 as possible. Performance on all three sets should be similar, which indicates that the structure created from the training set is confirmed on the other two data sets. Of course, the network should also have minimal error on all three sets of data. The correlation coefficients of all neural structures on all data sets are always higher than 0.986, and the differences between individual networks are minimal. This will be important in the analysis of prediction statistics (see Table 4 below). Source: Authors.
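The performance measure described above can be illustrated with a short sketch that computes the Pearson correlation coefficient between targets and network outputs separately for each data set. It is assumed here that the reported coefficient is the Pearson coefficient; the function name and all numbers are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical targets and outputs of one network on the three data sets.
targets = {
    "training": [10.0, 12.0, 11.0, 13.0],
    "testing": [14.0, 13.0, 15.0],
    "validation": [16.0, 15.0, 17.0],
}
outputs = {
    "training": [10.2, 11.8, 11.1, 12.9],
    "testing": [13.9, 13.2, 14.8],
    "validation": [16.1, 15.3, 16.9],
}

scores = {name: pearson(targets[name], outputs[name]) for name in targets}
# A small gap between the sets suggests the structure holds beyond the training data.
gap = max(scores.values()) - min(scores.values())
```

A high coefficient on the training set alone says little; it is the agreement of the three values that supports the validity of the structure.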
If we monitor the statistics of the predictions of individual neural networks, we necessarily conclude that the differences between the networks are absolutely minimal for all statistics.
The graphical development of prices and predictions can also point us to the right conclusion. Figure 1 provides a graphical comparison of Unipetrol's actual stock prices and the predictions calculated using all neural networks. It can be seen from the figure that all networks have more or less accurately followed the real price movements in past data. However, at some moments, for example on the 100th trading day, the stock price fluctuated downwards. Such fluctuations, in both directions, are repeated several times. The question is to what extent they are the result of a turbulent environment that none of the preserved neural networks has been able to describe reliably. Despite this, we can assume that all preserved neural networks are applicable in practice.
After network training, prediction was performed for the next 62 business days. Figure 2 gives a close look at these 62 business days. It is clear from the graph that the networks always (or almost always) predict a constant development of Unipetrol's stock price over the period under review. The only network that does not predict a constant stock price over the projection period is network number 5, MLP 1-17-1. However, the price differences between days vary at the level of one thousand CZK. There is a significant difference between the predictions of individual networks. At first glance, and in view of price trends so far, a constant price over the observed period of 62 days seems unlikely. Table 5 provides a detailed view of selected prediction cases, specifically every tenth case. The table shows that the 2nd and 3rd neural networks have exactly the same predictions, which are always the lowest predictions at a given time. The other predictions are higher. The table also summarizes the maximum and minimum predictions for each case and the difference between them, which is then expressed as a percentage of the maximum prediction. We find that the difference between the minimum and maximum predictions is 72.6%. The interval between the lowest and the highest predictions is therefore too wide to serve as an estimate of the future: the actual price may well move within this interval, but such a prediction has no value for the investor. However, if we expertly assess the predictions for the next day, we would choose network number 5, i.e. MLP 1-17-1, as the most successful. In an imaginary second place, we would put network number 4, MLP 1-20-1.
At first glance, the course of both curves may seem exactly the same. Perhaps the observer will notice minor differences. It is even more interesting to enrich the comparison by adding residuals, which precisely define the differences between the two curves. Such a comparison is the subject of Figure 4. In essence, this graph matches Figure 3. However, it is a box graph that provides a second variable for comparison: the residuals of the exponentially aligned time series are also displayed. The colour of the exponentially aligned curve has changed compared with Figure 3 and is now red. The residuals are shown in green, and the axis for the residuals is on the right side of the image. The residual values range from -32 to 22 CZK per stock. In relative terms, they range from approximately -15% to 10% of the average stock price. Because these are extreme values, we can boldly state that at first glance the model appears very interesting.
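Two of the summary figures used in this section, the gap between the lowest and highest network prediction expressed as a percentage of the maximum, and the range of residuals relative to the average price, can be sketched in a few lines of Python. The sign convention for residuals (actual minus aligned value), the function names and all input numbers are assumptions made for illustration; they do not reproduce the values in Table 5 or Figure 4.

```python
def relative_spread(predictions):
    """Difference between the highest and lowest prediction, in % of the highest."""
    lowest, highest = min(predictions), max(predictions)
    return 100 * (highest - lowest) / highest


def residual_range(actual, aligned):
    """Smallest and largest residual, in absolute terms and in % of the mean price."""
    residuals = [a - s for a, s in zip(actual, aligned)]  # assumed sign convention
    mean_price = sum(actual) / len(actual)
    low, high = min(residuals), max(residuals)
    return low, high, 100 * low / mean_price, 100 * high / mean_price


# Hypothetical predictions of five networks for one trading day (CZK).
day_predictions = [100.0, 100.0, 250.0, 300.0, 400.0]
spread_pct = relative_spread(day_predictions)

# Hypothetical actual prices and exponentially aligned values (CZK).
actual_prices = [200.0, 210.0, 190.0]
aligned_prices = [205.0, 200.0, 195.0]
low, high, low_pct, high_pct = residual_range(actual_prices, aligned_prices)

print(spread_pct)                    # 75.0
print(low, high, low_pct, high_pct)  # -5.0 10.0 -2.5 5.0
```

Expressing both quantities in relative terms makes them comparable across periods with very different price levels, which matters for a series spanning more than twelve years.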

Comparison of methods
To better compare the two methods, i.e. artificial neural networks and exponential time series alignment, we again use a graph (see Figure 5). For comparison, we used neural networks number 4 and 5, i.e. MLP 1-20-1 and MLP 1-17-1, alongside the prediction obtained by exponential time series alignment. Both neural networks predict constant or near-constant price development. This seems unlikely, as we predict the price for 62 business days (i.e. for more than 2 months). Looking at Unipetrol's stock price so far, we find no period in the past in which the price of the stock remained constant. Logically, the development predicted by the exponential alignment of time series appears more plausible. Over 62 business days, the company's stock price is expected to grow from approximately 371 CZK to 387 CZK. The change of 16 CZK is realistic (in relative terms it is a change of less than 3.7% of the price).

Conclusion
The aim of the paper was to compare the method of exponential alignment of time series and time series alignment using artificial neural networks as tools for predicting future price development of a company, as exemplified on Unipetrol.
First, the data file was analysed. Subsequently, neural networks were generated, of which five with the best characteristics were preserved. Using statistically interpreted results, it has been found that it is possible to use up to two neural networks in practice: 4 (MLP 1-20-1) and 5 (MLP 1-17-1).
Then, the exponential alignment of time series was used. A prediction was obtained for 62 business days, which was then compared to the results of the two preserved neural networks. At the time of the calculations, it was not possible to determine which prediction was correct. Based on simple reasoning, we are inclined to conclude that a more realistic picture of further development is given by the prediction based on exponential alignment of time series.
Generally, experts have found that artificial neural networks show better results than conventional statistical methods. In this case, however, the result was different, and the question remains why. It may be due to the turbulent development of Unipetrol's stock prices, or to the use of a software tool, Statistica, that is not suitable for this task. Alternatively, the prediction horizon of 62 business days may be too long. It will therefore be appropriate to devote more effort to this research. We will test other prediction